Operational Data Store: A Perfect Blend of Data and Brewing Science – Chapter 3

The fundamental difference between a Data Integration Hub (DIH) and an Operational Data Store (ODS) is that the ODS has a canonical data model at its center, serving as the central consolidated data layer.

An ODS provides a single source of truth for real-time operational data, while a DIH focuses on efficient and scalable data movement and integration between various systems and data sources. An ODS is designed to support day-to-day operations, whereas a DIH is used to aggregate and transform data for further processing and analysis in other systems, such as data warehouses. In summary, an ODS is a key element of an organization’s data architecture, while a DIH is a tool to support data integration.

Canonical data model

I know, I know. I promised you the lager-wise part, but this is important. I recently saw what happens when this area is neglected, so I decided to put it ahead of the particular types of beer.

Never start without a comprehensive data model of the main layer of the ODS (usually called the L1 layer). Even if you start with a small data domain, the conceptual level of the data model must cover the entire future scope. Ideally, a canonical model design initiative precedes the ODS project.

Figure: Zachman’s matrix

Along with the conceptual data model you need design standards, naming rules, abbreviations, a business dictionary, level of normalization, level of convergence, key strategy, table ILM stereotypes, standard attributes, and null values strategy.
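To make a few of these standards concrete, here is a minimal sketch of how naming rules and an abbreviation dictionary could be kept as data and applied mechanically. The abbreviations, the snake_case rule, and the 30-character limit are all assumptions chosen for illustration, not standards taken from a real project.

```python
import re

# Hypothetical abbreviation dictionary (business term -> approved abbreviation).
ABBREVIATIONS = {
    "customer": "cust",
    "identifier": "id",
    "number": "no",
    "address": "addr",
}

# Hypothetical naming rule: lower snake_case, at most 30 characters.
NAME_RULE = re.compile(r"^[a-z][a-z0-9]*(_[a-z0-9]+)*$")
MAX_LENGTH = 30


def physical_name(business_term: str) -> str:
    """Derive a physical name from a business term using the dictionary."""
    words = business_term.lower().split()
    return "_".join(ABBREVIATIONS.get(word, word) for word in words)


def is_valid(name: str) -> bool:
    """Check a name against the naming rule and the length limit."""
    return bool(NAME_RULE.match(name)) and len(name) <= MAX_LENGTH


print(physical_name("Customer Identifier"))
print(is_valid("cust_id"), is_valid("CustID"))
```

Keeping the rules as data rather than as tribal knowledge is what later allows a generator to apply them consistently.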

Only then can you start to model the area of the first iteration at the logical level. And only then can you start to design your metadata. (I mean the metadata itself, not the metadata management system.)

Having a canonical data model (sometimes called a common data model) for your enterprise helps you unify communication across all company systems and reduces misunderstandings and shifts in meaning.

It can be used in all interfaces, APIs, and services. Routing communication through canonical messages or the ODS can also reduce the number of peer-to-peer interfaces.
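A rough back-of-the-envelope calculation shows why the number of interfaces drops. The sketch below assumes the worst case, in which every system exchanges data with every other system in both directions, and compares it with each system mapping only to and from a canonical format. The function names are illustrative.

```python
def point_to_point_interfaces(systems: int) -> int:
    """Directed interfaces if every system talks to every other one directly."""
    return systems * (systems - 1)


def canonical_interfaces(systems: int) -> int:
    """Mappings if every system translates only to and from the canonical format."""
    return 2 * systems


for n in (3, 5, 10, 20):
    print(n, point_to_point_interfaces(n), canonical_interfaces(n))
```

Real landscapes are rarely fully connected, so the actual saving is smaller, but the direct approach still grows quadratically while the canonical one grows linearly.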

Canonical Data Model: The Core of Operational Data Store (ODS)

A Canonical Data Model (CDM) is a comprehensive and unified data description of an enterprise’s business information, defining a common communication standard for front-end services and back-end systems. The CDM aims to simplify and standardize the data integration process and reduce the need for peer-to-peer communications between different systems.

The CDM is an essential tool in the world of data integration, as it provides a common understanding of the structure and content of data, allowing the data integration process to be streamlined and more efficient. The CDM can also be used to standardize the data view of newly developed systems and interfaces, making the integration of these systems into the existing enterprise architecture much simpler.

One of the key benefits of the CDM is its role in providing a foundation for the core of the Operational Data Store (ODS). The ODS is an essential component of modern data architecture, providing a single source of operational data for the enterprise. By utilizing the CDM as the foundation for the ODS, data can be efficiently and consistently consolidated from multiple sources into a single, unified view. This centralization of data enables organizations to make faster and more informed decisions based on the most up-to-date information.

Additionally, using the CDM as the foundation of the ODS can significantly reduce both the complexity of the data integration process and the ongoing maintenance effort for the ODS. This results in reduced costs, improved data quality, and a more efficient overall data architecture.

In conclusion, the Canonical Data Model is a powerful tool for enterprise data management and is essential for organizations looking to establish a high-quality, efficient, and scalable data architecture. The use of CDM as the foundation for the core of the Operational Data Store is highly recommended, as it provides a solid foundation for data integration and supports the achievement of a unified view of the enterprise’s data.

Designing a CDM is a complex task that requires a thorough understanding of business data and common data architecture principles. The challenge is to create a CDM that is not influenced by existing legacy systems and proprietary data models but instead focuses on the shape of business data. The designer must strive to create a generic view of the data that is flexible and adaptable to changing business requirements.

Canonical data model and its place in new data architecture development

Generating data models from the canonical one is a crucial step in establishing a metadata-driven architecture for the ODS. Using a generator to automatically transform the canonical data model into other formats, such as relational or hierarchical, ensures consistency and standardization across all systems. This approach also reduces the risk of manual errors and speeds up the implementation process. Furthermore, updates to the canonical data model can be propagated to all dependent data models, so that all systems remain aligned with the latest business requirements. In short, generating data models from the canonical data model is a key component of a metadata-driven approach that provides a more efficient, flexible, and maintainable solution for the ODS.
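As a minimal sketch of the idea, the snippet below holds one entity in a tiny canonical form and generates two targets from it: a relational `CREATE TABLE` statement and a hierarchical, JSON-Schema-like description. The `Entity` and `Attribute` classes, the type mappings, and the `customer` example are all hypothetical; a real generator would read its model from a metadata repository.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Attribute:
    name: str
    type: str  # canonical type: "string", "integer" or "date"
    required: bool = True


@dataclass
class Entity:
    name: str
    attributes: list[Attribute] = field(default_factory=list)


# Hypothetical mappings from canonical types to target-specific types.
SQL_TYPES = {"string": "VARCHAR(255)", "integer": "INTEGER", "date": "DATE"}
JSON_TYPES = {"string": "string", "integer": "integer", "date": "string"}


def to_ddl(entity: Entity) -> str:
    """Generate a relational table definition from the canonical entity."""
    columns = []
    for attr in entity.attributes:
        nullability = " NOT NULL" if attr.required else ""
        columns.append(f"  {attr.name} {SQL_TYPES[attr.type]}{nullability}")
    return f"CREATE TABLE {entity.name} (\n" + ",\n".join(columns) + "\n);"


def to_json_schema(entity: Entity) -> dict:
    """Generate a hierarchical (JSON-Schema-like) description of the same entity."""
    return {
        "title": entity.name,
        "type": "object",
        "properties": {a.name: {"type": JSON_TYPES[a.type]} for a in entity.attributes},
        "required": [a.name for a in entity.attributes if a.required],
    }


customer = Entity("customer", [
    Attribute("customer_id", "integer"),
    Attribute("full_name", "string"),
    Attribute("birth_date", "date", required=False),
])

print(to_ddl(customer))
print(to_json_schema(customer))
```

Because both outputs are derived from the same `customer` definition, a change to the canonical entity shows up in both targets the next time the generators run.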

Model generators and patterns

Generating hierarchical models from a relational data model can be a complex process, as it requires the transformation of a structured, tabular representation of data into a hierarchical one. This process involves identifying the relationships between different elements of the data and mapping them to a parent-child hierarchy. To achieve this, generators use pattern directives that help to determine the appropriate relationships that play a hierarchical role and avoid cycles. The use of these directives helps to ensure that the generated hierarchical model accurately reflects the underlying relational data model, providing a consistent and accurate representation of the data for use in various applications.
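The sketch below illustrates one way such directives might work. Each foreign key carries a hypothetical directive, `"nest"` or `"ref"`, that tells the generator whether the relationship plays a hierarchical role. Only `"nest"` relationships are followed when building the tree, and the generator fails loudly if they form a cycle. The table names and the directive vocabulary are assumptions for illustration.

```python
# Foreign keys as (child_table, parent_table, directive).
# "nest": the relationship plays a hierarchical role (child is embedded in parent).
# "ref":  the relationship stays a plain reference and is never nested.
FOREIGN_KEYS = [
    ("order", "customer", "nest"),
    ("order_item", "order", "nest"),
    ("order_item", "product", "ref"),
    ("customer", "customer", "ref"),  # self-reference, e.g. a parent company
]


def build_hierarchy(root, foreign_keys, _path=()):
    """Return a nested dict for `root`, following only 'nest' relationships."""
    if root in _path:
        chain = " -> ".join(_path + (root,))
        raise ValueError(f"cycle in nested relationships: {chain}")
    node = {"entity": root, "children": [], "references": []}
    for child, parent, directive in foreign_keys:
        if directive == "nest" and parent == root:
            node["children"].append(
                build_hierarchy(child, foreign_keys, _path + (root,))
            )
        elif directive == "ref" and child == root:
            node["references"].append(parent)
    return node


tree = build_hierarchy("customer", FOREIGN_KEYS)
print(tree)
```

Here the self-reference on `customer` and the link from `order_item` to `product` survive as references, so no relationship information is lost, while the tree itself stays acyclic.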

Generating relational models from hierarchical ones is generally easier, as the relational model is better equipped to handle information about all relationships between entities, including non-hierarchical ones. In contrast, the use of references in hierarchical models is often more challenging to manage and visualize.
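The opposite direction can be sketched in a few lines: every nested collection becomes a child table, and the nesting itself becomes a foreign key back to the parent row. The `flatten` helper, the surrogate `id` column, and the sample order document are hypothetical. The sketch handles only scalars and lists of objects, and uses a single id sequence across all tables for simplicity.

```python
from collections import defaultdict
from itertools import count


def flatten(doc, table, rows=None, parent=None, ids=None):
    """Split a nested dict into rows per table, linking children by parent key."""
    rows = defaultdict(list) if rows is None else rows
    ids = count(1) if ids is None else ids
    row_id = next(ids)
    row = {"id": row_id}
    if parent is not None:
        parent_table, parent_id = parent
        row[f"{parent_table}_id"] = parent_id  # the nesting becomes a foreign key
    rows[table].append(row)
    for key, value in doc.items():
        if isinstance(value, list):  # nested collection -> child table
            for item in value:
                flatten(item, key, rows, (table, row_id), ids)
        else:  # scalar -> column
            row[key] = value
    return rows


order = {
    "number": "A-1",
    "items": [{"sku": "X", "qty": 2}, {"sku": "Y", "qty": 1}],
}

tables = flatten(order, "order")
for name, table_rows in tables.items():
    print(name, table_rows)
```

Nothing in this direction requires a decision about which relationship is "the" hierarchy, which is why it tends to need fewer directives.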

The canonical and logical data models should be free of specific implementation details so that they stay flexible and independent. This can be achieved through the use of generators and design patterns, while ensuring that the resulting static metadata also remains implementation-independent.

Pattern directives vs. wizards

The process of generating code from logical models can be influenced and made more adaptable by using pattern directives and wizards. These additional instructions, stored within metadata, allow the generator to make decisions about the final generated code, including the creation of DDL statements.

Pattern directives provide the generator with specific instructions on how to translate the logical data model into a physical data model. They define the relationships and constraints between entities, and can also provide information about the desired physical options such as tablespaces, audit columns, and partitions.
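A minimal sketch of this idea follows. The logical columns stay untouched, and a separate dictionary of hypothetical directives (`audit_columns`, `tablespace`, `partition_by`) tells the generator which physical options to add. The directive names and the audit columns are assumptions, and the emitted DDL is simplified and dialect-neutral, so it is not guaranteed to run unchanged on any particular database.

```python
# Hypothetical standard audit columns added when the directive asks for them.
AUDIT_COLUMNS = ["created_at TIMESTAMP NOT NULL", "updated_at TIMESTAMP"]


def generate_ddl(table, columns, directives):
    """Turn logical columns plus pattern directives into a DDL statement."""
    cols = list(columns)
    if directives.get("audit_columns"):
        cols += AUDIT_COLUMNS
    ddl = f"CREATE TABLE {table} (\n  " + ",\n  ".join(cols) + "\n)"
    if "tablespace" in directives:
        ddl += f"\nTABLESPACE {directives['tablespace']}"
    if "partition_by" in directives:
        ddl += f"\nPARTITION BY RANGE ({directives['partition_by']})"
    return ddl + ";"


logical_columns = ["customer_id INTEGER NOT NULL", "full_name VARCHAR(255) NOT NULL"]

# The directives live in metadata next to the logical model, not in the generator.
directives = {
    "audit_columns": True,
    "tablespace": "ods_l1",
    "partition_by": "created_at",
}

print(generate_ddl("customer", logical_columns, directives))
```

Changing a standard, for example renaming an audit column or moving tables to another tablespace, then means editing the directives and regenerating, rather than touching each table by hand.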

In metadata-driven development, pattern directives are often preferred over wizards because they provide a more flexible and controlled approach to code generation. With pattern directives, control over the implementation remains within the metadata, so changes in implementation, standards, or requirements can be reflected there as well. Wizards, on the other hand, provide a user-friendly interface for code generation but may make the process less flexible and harder to adapt to such changes.

Figure: Path of static metadata

Conclusion

In conclusion, the Canonical data model and static metadata play a crucial role in the design and implementation of an Operational Data Store. Using pattern directives in the generation process helps maintain control and flexibility in the metadata, ensuring adaptability to changing standards and requirements. Implementing the Canonical data model with a focus on common data architecture principles and using generators to transform between relational and hierarchical models can greatly improve the efficiency and effectiveness of the ODS.
