Data marts are still the best way to add definition to data and prepare it for efficient and high quality reporting. Developing a data mart or data warehouse offers a number of benefits:
- Improved performance of transactional systems
- Integrating information across business process silos
- Isolation from changing transactional systems
- Better analytics performance
- Retain the history that source systems cannot
- Capturing state changes
- One source of the truth
- One source for complex calculations
- Slowly changing dimensions
- Integrating hierarchies
- Simpler, less expensive report and analytics development
Improved performance of transactional systems
A data mart allows reports and analysis to take place outside of the transactional source system(s.) This can be critical in preventing a slow-running query or report from taking up all of the horsepower of the transactional system and resulting in delays in entering new transactions and conducting routine business.
Integrating information across business process silos
In many organizations different business processes may have distinct software modules or even distinct systems. For example, there may be a Customer Relationship Management (CRM) module used by Sales to track prospects, a work order system used by Operations to fill orders and an invoicing module used by Finance.
A data mart provides a way to integrate this data and thereby integrate different business processes. An example is being able to report across modules to see not just the sales that were made but the financial profitability of each of those sales.
Isolation from changing transactional systems
Over time the transactional systems used by companies undergo change. A new financial system may be implemented or a significant upgrade to a manufacturing system may be installed. When these changes occur the organization may temporarily lose visibility of business performance as new reports are developed and tested. A data mart provides a layer of isolation and translation so that data from the new system can be integrated into the data mart and emerge in the same reports and analysis provided before the new system was put in place.
Better analytics performance
Because a data mart uses a data modeling technique called Dimensional Modeling it provides a high performance platform for analytics. We’ve frequently seen that dimensionally modeled data marts can provide queries and reports that run one or two orders of magnitude (10x or 100x) faster than transactional source data. This performance becomes more critical as the volume of data increases. Data Marts can utilize powerful techniques such as materialized views and partitioned tables to provide performance over very large sets of data. It is also the case that most modern business intelligence applications are designed to expect a dimensionally modeled data source. New database technologies such as columnar stores are also dimensionally focused.
Retain the history that source systems cannot
Transactional source systems frequently use closing or archiving processes to summarize or reduce the amount of detail they store in their databases. This can result in an inability to provide historical reporting or analysis of past data. A data mart allows this data to be captured at a detailed level in the data mart for historical analysis while allowing the data to be archived or purged from the transactional source system for better performance. We see this often with financial systems where transaction detail is kept in subsidiary ledgers and only summarized batches are recorded in the General Ledger. When the subsidiary ledgers are archived that history is no longer easily accessible in the source system.
Capturing state changes
Often source systems track on the current state of business entities. For example, an ERP system may track only the current state of something. For example, a sales order may be Placed, In-Process or Complete. A data mart can retain each of the state transitions and when they occurred so that processes can be examined and improved.
One source of the truth
A data mart can combine data in a consistent and clearly modeled manner so that there is one authoritative source of the integrated truth. The complexities of the data models that underlie transactional source systems can make combining data complex. For example, one system may track sales at the Product level while another records marketing spend at the Product Group level. The Dimensional Modeling used in a data mart resolves these complexities. It also flattens many to many relationships to ensure that all reports and analysis provided from the data mart provide a single authoritative source of accurate information.
One source for complex calculations
Rather than each person who performs an analysis creating their own calculations commonly used calculations can be defined once in the data mart and reused everywhere for consistency. A simple example is weighting sales to forecast a value for the end of the current month.
Slowly changing dimensions
When information changes in transactional systems the prior state of that information is frequently lost. A data mart can retain this prior state through a concept called Slowly Changing Dimensions (SCD.) An example is a salesperson that is reassigned from the North Region to the South Region. In most transactional systems such a re-assignment means that the association between the sales of that salesperson and the North Region is lost. A data mart can allow sales they made while assigned to the North region to still be associated with that salesperson.
Often source systems don’t provide sufficient flexibility in grouping things for analysis. Data marts allow specific grouping hierarchies to be defined and maintained by the business and then integrated into the data mart for analysis. An example is when a business wants to categorize products by multiple different hierarchies. Perhaps you’d like to categorize all your products by a standard breakdown of price such as Low, Medium and High. In addition a business might want to categorize all the products by the complexity of their assembly. Once these hierarchies and categories are created in the data mart all products are automatically categorized consistently and can be analyzed with these new hierarchies.
Simpler, less expensive report and analysis development
A dimensional data mart makes report development easier and requires less data modeling skills on the part of the developer. It does this in a number of ways:
- Eliminating outer joins so that developers don’t have to worry about double-counting.
- Eliminating optional (null) columns so that developers don’t have to explicitly test for null values.
- Using fact and dimension tables so that all measures (dollars, units, etc.) are always fully additive with no special logic required to prevent over or under counting.
- Using conformed dimension tables so that descriptive attributes like customer name, product description, etc. are stored in a consistent manner in a single table (even when multiple source systems are used.)
- Incorporating complex calculations and time state rollups into the data mart so that each developer doesn’t have to do this on their own. An example might be calculating Weeks of Supply for a retailer or a 30 day average balance for a financial institution.
All of these improvements mean that staff with a narrower range of skills can be effective as report and analytics developers.