Wednesday, June 30, 2010

Process of Preparing Dimensional Model

  • Preparing ER Data Model
  • ER à split into discrete business process
  • Model Each Process separately
  • Select Many to many relationship in the ER Model which contains numeric and additive nonkey facts, convert them to fact table
  • De-normalize remaining table in to flat table with single part keys which will join directly to the fact table. These table become dimension table

Dimension Tables

  • It is integral companion to the fact table
  • It contain descriptive information of the business
  • Each dimension is defined by its single primary key
  • It is Entry point of the fact table
  • Dimension attribute deliver robust analytic slicing and dicing capability
  • Dimension implements the UI to the DWH

Bus Architecture:

  • Conformed dimension are the bus architecture linking different data mart
  • It is a dimensional model with common dimension linking to different fact table.
  • This is belongs to fact constellation schema
  • Distributed DWH systems
  • DWH developed in Decentralized way

What is Fact Table?

  • It is a primary table of a dimension model
  • It contains numerical information of the business
  • It contains aggregate values of duplicating records
  • It represents business measure
  • Fact table built with list of dimension along with measures
  • Measurement is taken at the intersection of all the dimension
Types of Facts:
Additive Facts: - Values are summed up combination of all the dimensions
Semi Additive Facts: - It can be summed up only some of the dimension, not with others
Non Additive Facts: - It can not be summed up with any specific dimension
Types of Fact Tables:
Cumulative Fact Table:- It contains mostly additive facts
Snapshot Fact Table:- it includes more semi additive and non additive facts
Factless Fact Table: - It contains no measures or facts, only key relations (FK)

Metadata for Data Warehousing

  • All information about the data warehouse environment
  • It is not actual data
  • It is similar to databank
  • It is similar to Encyclopedia

Sunday, June 27, 2010

Extract-Transformation-Load (ETL):

  • Extraction is an initial step to bring the needed data to staging area for further process
  • Transformation process involves converts the data to accurate, resolving duplicate, dealing with missing elements, eliminates duplicates and assign data warehouse keys, normalize the data structure etc.,
  • Loading takes the form of presenting the dimension table to the bulk loading facilities of each data mart

Data Staging Area:

  • Storage area + ETL
  • It is everything between Transaction and data presentation area
  • It cannot be accessed by the users directly, merely for developer
  • It can be considered as QA and developer environment
  • It is off-limit to business users, it cannot provide any serves to users like presenting and Querying, etc.,

Responsibilities of Data warehouse Manager:-

  • Understand your users by business area & Job responsibility
  • Determine the decisions the business users want to make with the help of DWH
  • Identify the user who really need DWH to support their effective decision
  • Ensure that data is accurate & trusted
  • Ensure that user interface & applications are simple & user friendly
  • Continuously monitor the data for accuracy
  • Find new sources for DWH
  • Publish the data regularly
  • Adapt the Data warehouse depends on new data & new requirement
  • Makes the business users happy by providing proper information for decision making & makes their work easy

Saturday, June 26, 2010

Components of Data warehouse

The following 4 distinct components to be considered as DWH Component

  • Operational Source system
  • Data staging
  • Data presentation
  • Data access tools
Diagram Source: The Data warehouse Toolkit - 2nd Edition By Ralph Kimball

    Purposes of a Data Warehouse

    • Explore our past information for current decision making
    • Slice & dice our data, in the way we required
    • It supports people for more fact based decision making
    • Make the data easy for business people to gather information directly
    • It supports forecasting
    • It makes the information easily accessible
    • It is only business user not developer
    • Data in DWH must be credible, accurate & reliable
    • Adaptive and flexible to change
    • It must provide security to our confidential information.
    • It must support for improved decision making