Wednesday, December 29, 2010

Data modelling Vs Dimensional modelling

Data modelling:
Modelling data storage optimized for Transactional Processing is Data Modelling

Dimensional modelling:
Modelling data storage optimized for Analytical Processing is Dimension Modelling..

SSAS: What is Named Query?

1. Named Query is a SQL Statements and it act as a another table

2. It is formed by selecting rows and columns from single tables or relating multiple tables

3. It helps to make the data source view with new table structure without modifying the underlying data source

SSAS: What is Linked Dimension ?

In SQL Server Analysis Services (SSAS), Linked Dimension is a dimension which is existed in other analysis services DB. It may exist in same or different server. it helps the user to keep the common dimension in multiple database.

Tuesday, December 28, 2010

What is Conformed dimension?

Conformed dimension is a dimension that has a single meaning or content throught datawarehouse.we can use it accross the fact tables


Example: Time, Product, Staff Dimensions are Conformed dimension, it is shared across multiple fact tables.


What is Role Playing dimension?

Role Playing dimension is a dimension, it can play different roles in a fact table depending on the context.

For example: Date dimension

Date dimension can be used for ordered date, shipping date, shipment date, and invoice date, etc., in an order line fact.

In the data warehouse, you will have a single dimension table for the dates. You will have multiple warehouse foreign key from the fact table to teh same dimension.

What is junk dimension?

A junk dimension is a collection of random transcational codes flags and text attributes that are unrelated to any particular dimension.The junk dimension is simply a structure that provides the convienent place to store the junk dimension.

Monday, November 29, 2010

Operational Data Store Vs Data warehouse

Operational Data Store(ODS)

Data warehouse

ODS is an Online integration of data

Typically it won't provide online Information.  It provides offline  data.

ODS is used, when a system's data is required to access from some other systems immediately. Ex: records updated from various section, data is lying in different DB, of Hospital is required while billing or some other activities

It is used while making decisions from past and present data(bit old, may be data taken from hours or days back)

ODS is centralized repository of data for online operational use

This is Non-Operational use


What is Data warehouse?

Data warehouse is repository of data, which are collected from many sources like transactional database or old backup data, etc., and then filtered, cleansed, transformed the data for the purpose of analyzing, decision making and reporting.

Monday, November 22, 2010

Types Of Dimensions:

Dimensions are classified as followings:


1. Slowly Changing Dimensions or SCD (Type0,Type1,Type2,Type3,Type4,Type6)
2. Slowly Growing Dimensions or SGD (Type1,Type2,Type3 from SCD)


And Other Types are...


1. Conformed Dimension
2. Junk Dimension
3. Degenerated Dimension
4. Role Playing Dimension

Monday, July 5, 2010

Conceptual Data Model or Conceptual Data Schema:-

Conceptual Data Model Identifies the highest level of relationship between important entities of Data Model. It contains only important entities, no attribute, and no key specification. It contains only entity name and entity relationship as shown below

Friday, July 2, 2010

Degenerate Dimension (DD)

It is a dimension which is derived from fact table. DD doesn’t have separate dimension table. It is mostly used when we planned use transactional level information on the fact table. The main purpose of DD is to provide the info directly without referring other table.

Primary key of a fact table is subset of table's foreign key. In some cases, we do not need the foreign key in the fact guarantee the uniqueness, in that scenario, DD plays the role.


Dimension keys without corresponding dimension tables is known as Degenerated Dimensions

Wednesday, June 30, 2010

Process of Preparing Dimensional Model

  • Preparing ER Data Model
  • ER à split into discrete business process
  • Model Each Process separately
  • Select Many to many relationship in the ER Model which contains numeric and additive nonkey facts, convert them to fact table
  • De-normalize remaining table in to flat table with single part keys which will join directly to the fact table. These table become dimension table

Dimension Tables

  • It is integral companion to the fact table
  • It contain descriptive information of the business
  • Each dimension is defined by its single primary key
  • It is Entry point of the fact table
  • Dimension attribute deliver robust analytic slicing and dicing capability
  • Dimension implements the UI to the DWH

Bus Architecture:

  • Conformed dimension are the bus architecture linking different data mart
  • It is a dimensional model with common dimension linking to different fact table.
  • This is belongs to fact constellation schema
  • Distributed DWH systems
  • DWH developed in Decentralized way

What is Fact Table?

  • It is a primary table of a dimension model
  • It contains numerical information of the business
  • It contains aggregate values of duplicating records
  • It represents business measure
  • Fact table built with list of dimension along with measures
  • Measurement is taken at the intersection of all the dimension
Types of Facts:
Additive Facts: - Values are summed up combination of all the dimensions
Semi Additive Facts: - It can be summed up only some of the dimension, not with others
Non Additive Facts: - It can not be summed up with any specific dimension
Types of Fact Tables:
Cumulative Fact Table:- It contains mostly additive facts
Snapshot Fact Table:- it includes more semi additive and non additive facts
Factless Fact Table: - It contains no measures or facts, only key relations (FK)

Metadata for Data Warehousing

  • All information about the data warehouse environment
  • It is not actual data
  • It is similar to databank
  • It is similar to Encyclopedia

Sunday, June 27, 2010

Extract-Transformation-Load (ETL):

  • Extraction is an initial step to bring the needed data to staging area for further process
  • Transformation process involves converts the data to accurate, resolving duplicate, dealing with missing elements, eliminates duplicates and assign data warehouse keys, normalize the data structure etc.,
  • Loading takes the form of presenting the dimension table to the bulk loading facilities of each data mart

Data Staging Area:

  • Storage area + ETL
  • It is everything between Transaction and data presentation area
  • It cannot be accessed by the users directly, merely for developer
  • It can be considered as QA and developer environment
  • It is off-limit to business users, it cannot provide any serves to users like presenting and Querying, etc.,

Responsibilities of Data warehouse Manager:-

  • Understand your users by business area & Job responsibility
  • Determine the decisions the business users want to make with the help of DWH
  • Identify the user who really need DWH to support their effective decision
  • Ensure that data is accurate & trusted
  • Ensure that user interface & applications are simple & user friendly
  • Continuously monitor the data for accuracy
  • Find new sources for DWH
  • Publish the data regularly
  • Adapt the Data warehouse depends on new data & new requirement
  • Makes the business users happy by providing proper information for decision making & makes their work easy

Saturday, June 26, 2010

Components of Data warehouse

The following 4 distinct components to be considered as DWH Component

  • Operational Source system
  • Data staging
  • Data presentation
  • Data access tools
Diagram Source: The Data warehouse Toolkit - 2nd Edition By Ralph Kimball

    Purposes of a Data Warehouse

    • Explore our past information for current decision making
    • Slice & dice our data, in the way we required
    • It supports people for more fact based decision making
    • Make the data easy for business people to gather information directly
    • It supports forecasting
    • It makes the information easily accessible
    • It is only business user not developer
    • Data in DWH must be credible, accurate & reliable
    • Adaptive and flexible to change
    • It must provide security to our confidential information.
    • It must support for improved decision making