wiki-data-vault-modeling

https://en.wikipedia.org/wiki/Data_vault_modeling

High-level takeaways

Details

wiki-data-vault-modeling#good-vs-bad-data1 2 Data vault modeling makes no distinction between good and bad data ("bad" meaning not conforming to business rules). wiki-data-vault-modeling#good-vs-bad-data1 2
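
One way to picture this is a load step that keeps every row and a separate downstream step that applies business rules. The sketch below is only an illustration: `load_raw`, `conforms`, the `record_source` field, and the "amount must be non-negative" rule are all hypothetical names and assumptions, not part of the article.

```python
def load_raw(vault, rows, record_source):
    """Append every incoming row unchanged; no business rule is checked here."""
    for row in rows:
        vault.append({**row, "record_source": record_source})


def conforms(row):
    """Hypothetical business rule: an order amount must be present and non-negative."""
    return row["amount"] is not None and row["amount"] >= 0


vault = []
incoming = [
    {"order_id": "A1", "amount": 50},
    {"order_id": "A2", "amount": -5},    # violates the rule, still stored
    {"order_id": "A3", "amount": None},  # violates the rule, still stored
]
load_raw(vault, incoming, record_source="crm")

# The rule is applied only when building a downstream, cleansed view.
mart = [row for row in vault if conforms(row)]
```

Because the rule lives outside the load step, changing it later does not require reloading the stored rows.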

Data vault is designed to enable parallel loading as much as possible,[4] so that very large implementations can scale out without the need for major redesign.
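
As a rough sketch of what parallel loading can look like, the example below loads several hubs concurrently, on the assumption that each hub depends only on its own staged business keys. The function `load_hub` and the staged data are hypothetical, and a thread pool stands in for whatever parallel mechanism a real loader would use.

```python
from concurrent.futures import ThreadPoolExecutor


def load_hub(name, business_keys):
    """Hypothetical hub load: deduplicate the staged business keys for one hub."""
    return name, sorted(set(business_keys))


# Staged business keys per hub (made-up sample data).
staged = {
    "customer": ["C2", "C1", "C2"],
    "product": ["P1"],
    "order": ["O9", "O3"],
}

# Each hub load touches only its own keys, so the loads can run side by side.
with ThreadPoolExecutor() as pool:
    futures = [pool.submit(load_hub, name, keys) for name, keys in staged.items()]
    hubs = dict(future.result() for future in futures)
```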

wiki-data-vault-modeling#conformed-dimensions1 Conformed dimensions also require you to cleanse the data (to conform it), which is undesirable in a number of cases because cleansing inevitably loses information. wiki-data-vault-modeling#conformed-dimensions1

wiki-data-vault-modeling#data-vault-model1 The Data Vault Model is a detail-oriented, historical tracking and uniquely linked set of normalized tables that support one or more functional areas of business. It is a hybrid approach encompassing the best of breed between 3rd normal form (3NF) and star schema. The design is flexible, scalable, consistent and adaptable to the needs of the enterprise.[5] wiki-data-vault-modeling#data-vault-model1

wiki-data-vault-modeling#traceability-and-auditability1 2 3 4 The focus of any data vault implementation is complete traceability and auditability of all information. wiki-data-vault-modeling#traceability-and-auditability1 2 3 4

wiki-data-vault-modeling#business-keys1 One of the main axioms of the data vault method is that real business keys only change when the business changes and are therefore the most stable elements from which to derive the structure of a historical database. If you use these keys as the backbone of a data warehouse, you can organize the rest of the data around them. wiki-data-vault-modeling#business-keys1

wiki-data-vault-modeling#surrogate-key-vs-business-key1 A hub contains a surrogate key, used to connect the other structures to this table, and a business key, the driver for this hub. The business key can consist of multiple fields. wiki-data-vault-modeling#surrogate-key-vs-business-key1
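
The relationship between the two keys can be sketched as a small in-memory structure. This is a minimal illustration, not a reference design: the `Hub` class, its `key_for` method, the sequence-based surrogate key, and the sample customer keys are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class Hub:
    """In-memory stand-in for a hub table (hypothetical, for illustration only)."""
    name: str
    _rows: dict = field(default_factory=dict)                 # business key -> surrogate key
    _seq: count = field(default_factory=lambda: count(1))     # assumed sequence generator

    def key_for(self, *business_key):
        """Return the surrogate key for a business key, registering it on first sight."""
        if business_key not in self._rows:
            self._rows[business_key] = next(self._seq)
        return self._rows[business_key]


customer_hub = Hub("customer")
k1 = customer_hub.key_for("ACME", "DE")  # business key made of two fields
k2 = customer_hub.key_for("ACME", "US")
k3 = customer_hub.key_for("ACME", "DE")  # same business key, same surrogate key
```

Other structures would then reference only the surrogate key, while the business key stays in the hub.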

wiki-data-vault-modeling#splitting-based-on-rate-of-change1 2 3 Usually the attributes are grouped in satellites by source system. However, descriptive attributes such as size, cost, speed, amount or color can change at different rates, so you can also split these attributes into different satellites based on their rate of change. wiki-data-vault-modeling#splitting-based-on-rate-of-change1 2 3
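
The effect of splitting by rate of change can be sketched as follows. The satellite names, the attribute grouping, and the `load_satellites` function are hypothetical; the sketch assumes a new satellite row is written only when that satellite's own attributes differ from its latest row.

```python
# Assumed grouping: slow-changing attributes in one satellite, fast-changing in another.
SATELLITE_ATTRS = {
    "sat_product_static": ("size", "color"),
    "sat_product_volatile": ("cost",),
}


def load_satellites(store, hub_key, record, load_ts):
    """Append a row to each satellite whose attributes differ from its latest row."""
    for sat_name, attrs in SATELLITE_ATTRS.items():
        payload = {attr: record[attr] for attr in attrs}
        history = store.setdefault(sat_name, {}).setdefault(hub_key, [])
        if not history or history[-1]["payload"] != payload:
            history.append({"load_ts": load_ts, "payload": payload})


store = {}
load_satellites(store, 1, {"size": "L", "color": "red", "cost": 10.0}, "2024-01-01")
load_satellites(store, 1, {"size": "L", "color": "red", "cost": 12.5}, "2024-02-01")

static_rows = len(store["sat_product_static"][1])      # unchanged attributes, one row
volatile_rows = len(store["sat_product_volatile"][1])  # cost changed, two rows
```

With this split, a change in cost adds a row only to the volatile satellite and leaves the size and color history untouched.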

wiki-data-vault-modeling#never-deleted1 2 Data are never deleted from the data vault, unless you have a technical error while loading data. wiki-data-vault-modeling#never-deleted1 2
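
A practical consequence of never deleting is that earlier states stay reconstructable. The sketch below assumes an append-only history ordered by load timestamp, with ISO date strings so that string comparison matches date order; `append_version`, `as_of`, and the sample rows are hypothetical.

```python
from bisect import bisect_right

# Append-only history for one key, ordered by load timestamp (made-up sample data).
history = [
    ("2024-01-01", {"cost": 10.0}),
    ("2024-02-01", {"cost": 12.5}),
]


def append_version(history, load_ts, payload):
    """Record a change by appending; existing rows are never updated or removed."""
    history.append((load_ts, payload))


def as_of(history, ts):
    """Return the payload that was current at timestamp ts, or None if none existed yet."""
    idx = bisect_right([load_ts for load_ts, _ in history], ts)
    return history[idx - 1][1] if idx else None


append_version(history, "2024-03-01", {"cost": 11.0})
mid_january = as_of(history, "2024-01-15")
before_first_load = as_of(history, "2023-12-31")
latest = as_of(history, "2024-03-05")
```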

wiki-data-vault-modeling#store-data-not-query-optimized1 2 The data vault modelled layer is normally used to store data. It is not optimized for query performance, nor is it easy to query with well-known query tools. wiki-data-vault-modeling#store-data-not-query-optimized1 2

wiki-data-vault-modeling#easy-to-move1 Note that while it is relatively straightforward to move data from a data vault model to a (cleansed) dimensional model, the reverse is not as easy. wiki-data-vault-modeling#easy-to-move1
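
The asymmetry can be illustrated with a small sketch that derives a cleansed dimension from a hub and its satellite history. Everything here is an assumption for the example: the `hub` and `sat` structures, the `COUNTRY_CODES` lookup, and the `build_dimension` function.

```python
# Hypothetical hub: surrogate key -> business key.
hub = {1: "C-001", 2: "C-002"}

# Hypothetical satellite: surrogate key -> history of (load timestamp, raw attributes).
sat = {
    1: [("2024-01-01", {"country": "germany "}), ("2024-03-01", {"country": "DE"})],
    2: [("2024-02-01", {"country": "us"})],
}

# Assumed cleansing rule: map raw spellings to one conformed code.
COUNTRY_CODES = {"germany": "DE", "de": "DE", "us": "US"}


def build_dimension(hub, sat):
    """Take the latest satellite row per hub key and conform its country value."""
    dimension = []
    for surrogate_key, business_key in hub.items():
        _, latest = sat[surrogate_key][-1]
        raw = latest["country"].strip().lower()
        dimension.append({
            "customer_id": business_key,
            "country": COUNTRY_CODES.get(raw, "UNKNOWN"),
        })
    return dimension


dim = build_dimension(hub, sat)
```

The dimension keeps neither the earlier rows nor the raw spellings, so the satellite history cannot be rebuilt from it, which is one way to see why the reverse direction is harder.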

Referring Pages

data-architecture-glossary