Looking back on the data projects I have come across over the last 20 years, I see that, viewed from a proper distance, most of them shared the same architecture, which I call a 3A system. What 3A systems have in common is the phases of data processing. Data always passes through three main layers, where it is also stored, often permanently.
The first layer acquires the data. This layer often serves as an archive of all inputs, which can be used to analyze processing and to determine who supplied the data, when, and in what quality. It prepares the data for further processing, and cases where data fail validation tests are handled at this level.
The second layer is responsible for processing the data. It changes the data structure. The data is integrated and enriched with the computed results and then compared and combined with the existing data. This stage also changes the granularity and creates historical data.
The third layer is used primarily to publish results. The data is prepared in a format comprehensible to the intended users. This layer is usually responsible for security, ensuring that each user receives only what he or she is entitled to.
The name 3A systems comes from the three steps of processing: Acquire Data, Adopt Data and make Data Accessible.
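The three steps can be sketched as a toy in-memory pipeline. This is only an illustration under stated assumptions, not a reference design: the class `ThreeASystem`, its fields, the validation rule, and the sample suppliers and customers are all hypothetical names invented for this example.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class ThreeASystem:
    """Toy 3A system held in memory; every name here is hypothetical."""
    archive: list = field(default_factory=list)       # layer 1: all inputs, valid or not
    adopted: dict = field(default_factory=dict)       # layer 2: customer -> history of totals
    entitlements: dict = field(default_factory=dict)  # layer 3: user -> customers they may see

    def acquire(self, supplier, rows):
        """Acquire: archive every input with supplier, load time and a test result."""
        loaded_at = datetime.now(timezone.utc)
        for row in rows:
            # Assumed quality test: a non-empty customer and a numeric amount.
            valid = bool(row.get("customer")) and isinstance(row.get("amount"), (int, float))
            self.archive.append(
                {"supplier": supplier, "loaded_at": loaded_at, "valid": valid, **row}
            )

    def adopt(self):
        """Adopt: change granularity (row -> customer) and append to the history."""
        totals = {}
        for record in self.archive:
            if record["valid"]:
                totals[record["customer"]] = totals.get(record["customer"], 0) + record["amount"]
        for customer, total in totals.items():
            self.adopted.setdefault(customer, []).append(total)

    def access(self, user):
        """Make accessible: return only the latest totals the user is entitled to."""
        allowed = self.entitlements.get(user, set())
        return {c: history[-1] for c, history in self.adopted.items() if c in allowed}


system = ThreeASystem(entitlements={"alice": {"acme"}})
system.acquire("crm_export", [
    {"customer": "acme", "amount": 100},
    {"customer": "acme", "amount": 50},
    {"customer": "globex", "amount": 70},
    {"customer": "", "amount": 5},       # fails the test but stays in the archive
])
system.adopt()
print(system.access("alice"))  # {'acme': 150}
```

Note that the rejected row is still archived in the first layer, so it remains possible to see who supplied it and when, even though it never reaches the second layer.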
Beyond the three layers, all 3A systems have two important distinguishing features. The first is that the static data architecture is complemented by a dynamic part, which describes the methods of data acquisition, the transfers of data between layers, the processing of data, and the way results are passed to users.
3A systems always contain a scheduler, providing a uniform way to manage processes. This is an area where the individual systems differ substantially, particularly in the degree of automation and the range of user interventions during processing.
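As a rough illustration of such a scheduler, the sketch below runs named steps in a fixed order and stops at the first failure, so that a user can intervene and then resume. The function `run_schedule`, its `already_done` parameter, and the status strings are assumptions made for this example; real schedulers differ widely in exactly these respects.

```python
def run_schedule(steps, already_done=()):
    """Run (name, callable) steps in order and report a status per step.

    Steps listed in already_done are skipped, which lets an operator
    resume after fixing a failure. Nothing runs after the first failure.
    """
    status = {}
    failed = False
    for name, step in steps:
        if name in already_done:
            status[name] = "skipped"
        elif failed:
            status[name] = "pending"   # waits for user intervention
        else:
            try:
                step()
                status[name] = "done"
            except Exception as exc:
                status[name] = f"failed: {exc}"
                failed = True
    return status


def broken_adopt():
    # Hypothetical failing step, used only to show the stop-and-resume behavior.
    raise ValueError("missing reference data")


steps = [("acquire", lambda: None), ("adopt", broken_adopt), ("publish", lambda: None)]
print(run_schedule(steps))
# {'acquire': 'done', 'adopt': 'failed: missing reference data', 'publish': 'pending'}
```

After the operator repairs the problem, the same schedule could be restarted with `already_done=("acquire",)` so that the completed step is not repeated.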
The second distinguishing feature of 3A systems is their control. Each 3A system has a clearly defined governing body or person who defines the system outputs, controls the process, and is responsible for the quality of the output data.
If you say it resembles the three-layer architecture of data warehouses, you are right. A data warehouse is a typical example of a 3A system. But there are many others:
- Reporting systems (systems for regulatory reporting, reporting for management, …)
- Departmental analytical systems (for marketing, security, controlling…)
- Ad-hoc solutions for data science or one-time analysis
- Micro-systems such as the extension of an ERP by an application built on Excel, Access, PowerBuilder, or another tool that pulls data from the system, allows it to be processed by the user, and then returns the data
- Data migration projects where data are extracted from the old system, transformed, and uploaded to the new system multiple times during the development phase
- Any complex use of Excel has the characteristics of a 3A system
- Archival systems and system decommission projects
- Data marts in a data warehouse, especially if the processing runs on a different time schedule than the data warehouse itself
- Segments within the data warehouse that are controlled and operated by a specific user. They often have the characteristics of a 3A system despite the fact that their results are part of the data warehouse. (MIS solutions, computation of profitability, etc.)
But why do we need a definition of a 3A system? Because with all 3A systems the same problems need to be addressed and the same decisions need to be made. Getting to know them before getting deep into system planning and development can help us avoid serious mistakes and reach our objectives. In the next part, “3A Systems – 42 Thoughts”, I offer a list of key questions to be answered by anyone who plans to build a 3A system, no matter its scope, complexity, and lifespan.