Wednesday, November 27, 2013

What is Data Warehousing


As with warehousing in the ordinary sense, data warehousing is the storing of data generated by businesses or computers. Wikipedia defines it as follows:

"A data warehouse or enterprise data warehouse (DW, DWH, or EDW) is a database used for reporting and data analysis. It is a central repository of data which is created by integrating data from one or more disparate sources. Data warehouses store current as well as historical data and are used for creating trending reports for senior management reporting such as annual and quarterly comparisons."

"A data mart is a small data warehouse focused on a specific area of interest. Data warehouses can be subdivided into data marts for improved performance and ease of use within that area. Alternatively, an organization can create one or more data marts as first steps towards a larger and more complex enterprise data warehouse."

Data warehousing is not just about the storage of data but also about its integration, accumulation, distribution and analysis. Each of these is a specialization in its own right: data integration / ETL, data modeling, data storage / databases, business intelligence (decision support systems) and data visualization.
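To make the integration step a bit more concrete, here is a minimal ETL sketch in Python. It is only an illustration under my own assumptions: the two sources (`CRM_CSV` and `WEB_ORDERS`), the `orders` table and all function names are hypothetical, and SQLite stands in for a real warehouse database.

```python
import csv
import io
import sqlite3

# Hypothetical source 1: a CSV export from a sales system.
CRM_CSV = "customer,amount\nacme,100.50\nGlobex,20\n"
# Hypothetical source 2: records already held as dicts (e.g. from a web shop).
WEB_ORDERS = [{"cust": "ACME ", "total": "9.50"}]


def extract():
    """Read raw (customer, amount) pairs from both disparate sources."""
    for row in csv.DictReader(io.StringIO(CRM_CSV)):
        yield row["customer"], row["amount"]
    for row in WEB_ORDERS:
        yield row["cust"], row["total"]


def transform(records):
    """Normalise customer names and convert amounts to integer cents."""
    for customer, amount in records:
        yield customer.strip().upper(), round(float(amount) * 100)


def load(conn, rows):
    """Write the cleaned rows into one central table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders (customer TEXT, amount_cents INTEGER)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?)", rows)


def run_etl(conn):
    """Extract, transform and load in one pass."""
    load(conn, transform(extract()))
```

Once the data is loaded, the analysis side is an ordinary query against the central table, for example `SELECT customer, SUM(amount_cents) FROM orders GROUP BY customer`.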

The two fathers of data warehousing are Bill Inmon and Ralph Kimball.

| Ralph Kimball | Bill Inmon |
| --- | --- |
| Build small, business-process-oriented data marts that are joined to each other using dimensions common to the business processes. | Build one centralized, enterprise-wide data warehouse, then build data marts as needed for a specific department or process. |
| Known as the bottom-up approach. | Known as the top-down approach. |
| Data marts should be built using a dimensional modelling approach. | The central data warehouse should follow an ER modelling approach. |
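The Kimball column above can be sketched as two tiny data marts that share one common dimension. This is a minimal illustration under my own assumptions, not a reference design: the table names (`dim_product`, `fact_sales`, `fact_inventory`), the sample rows and the function names are all hypothetical, and SQLite is used only because it ships with Python.

```python
import sqlite3


def build_marts(conn):
    """Create two small data marts that share one common dimension."""
    conn.executescript("""
        CREATE TABLE dim_product (
            product_key  INTEGER PRIMARY KEY,
            product_name TEXT NOT NULL
        );
        -- Sales mart: one fact table keyed by the shared dimension.
        CREATE TABLE fact_sales (
            product_key INTEGER REFERENCES dim_product(product_key),
            units_sold  INTEGER NOT NULL
        );
        -- Inventory mart: a second business process, same dimension.
        CREATE TABLE fact_inventory (
            product_key   INTEGER REFERENCES dim_product(product_key),
            units_on_hand INTEGER NOT NULL
        );
    """)
    conn.executemany(
        "INSERT INTO dim_product VALUES (?, ?)", [(1, "Widget"), (2, "Gadget")]
    )
    conn.executemany(
        "INSERT INTO fact_sales VALUES (?, ?)", [(1, 10), (1, 5), (2, 7)]
    )
    conn.executemany(
        "INSERT INTO fact_inventory VALUES (?, ?)", [(1, 40), (2, 3)]
    )


def sales_vs_inventory(conn):
    """Join both marts through the shared product dimension."""
    return conn.execute("""
        SELECT p.product_name, s.units_sold, i.units_on_hand
        FROM dim_product AS p
        JOIN (SELECT product_key, SUM(units_sold) AS units_sold
              FROM fact_sales GROUP BY product_key) AS s
          ON s.product_key = p.product_key
        JOIN (SELECT product_key, SUM(units_on_hand) AS units_on_hand
              FROM fact_inventory GROUP BY product_key) AS i
          ON i.product_key = p.product_key
        ORDER BY p.product_name
    """).fetchall()
```

Each fact table is aggregated on its own before the join so that rows from one mart do not multiply rows from the other. In an Inmon-style design the same data would first land in a normalized central warehouse, and marts like these would be derived from it.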


It is recommended to know how both approaches work and to design a solution for the business using the one that best suits its needs. I find merit in both men's philosophies and have used them in many DW solutions. Industry-standard models usually follow Inmon's methodology, whereas Kimball's works best for Business Intelligence reporting.
