A Normalized Data Modeling Approach
The snowflake schema is an extension of the star schema and is widely used in data warehousing and business intelligence. While the star schema is simple and denormalized, the snowflake schema introduces normalization by breaking down dimension tables into smaller, related tables. This results in a structure that resembles a snowflake , hence the name.
Fact Table:
Example:
Fact_Sales
: Txn_ID
, Date_ID
, Product_ID
, Customer_ID
, Store_ID
, Quantity_Sold
, Total_Amount
.Dimension Tables:
Examples:
Dim_Date
: Date_ID
, Date
, Month
, Quarter
, Year
, Day_of_Week
.Dim_Product
: Product_ID
, Product_Name
, Category_ID
, Brand_ID
, Price
.Dim_Category
: Category_ID
, Category_Name
.Dim_Brand
: Brand_ID
, Brand_Name
.Dim_Customer
: Customer_ID
, Customer_Name
, City_ID
, Phone_Number
.Dim_City
: City_ID
, City
, State
.Dim_Store
: Store_ID
, Store_Name
, City_ID
, Manager_Name
.Snowflake Schema
Fact_Sales
Txn_ID | Date_ID | Product_ID | Customer_ID | Store_ID | Qty_Sold | Total_Amount |
---|---|---|---|---|---|---|
101 | 20250101 | 1 | 1001 | 501 | 2 | 5000 |
102 | 20250102 | 2 | 1002 | 502 | 1 | 1500 |
103 | 20250103 | 3 | 1003 | 503 | 5 | 1000 |
Dim_Date
:
Date_ID | Date | Month | Quarter | Year | Day_of_Week |
---|---|---|---|---|---|
20250101 | 2025-01-01 | January | Q1 | 2025 | Wednesday |
20250102 | 2025-01-02 | January | Q1 | 2025 | Thursday |
20250103 | 2025-01-03 | January | Q1 | 2025 | Friday |
Dim_Product
:
Product_ID | Product_Name | Category_ID | Brand_ID | Price |
---|---|---|---|---|
1 | Saree | 1 | 1 | 2500 |
2 | Kurta-Pajama | 2 | 2 | 1500 |
3 | Turmeric Powder | 3 | 3 | 200 |
Dim_Category
:
Category_ID | Category_Name |
---|---|
1 | Sarees |
2 | Ethnic Wear |
3 | Spices |
Dim_Brand
:
Brand_ID | Brand_Name |
---|---|
1 | Nalli |
2 | Fashion India |
3 | ABC |
Dim_Customer
:
Customer_ID | Customer_Name | City_ID | Phone_Number |
---|---|---|---|
1001 | Priya Sharma | 1 | 9876543210 |
1002 | Rajesh Patel | 2 | 8765432109 |
1003 | Anjali Singh | 3 | 7654321098 |
Dim_City
:
City_ID | City | State |
---|---|---|
1 | Mumbai | Maharashtra |
2 | Ahmedabad | Gujarat |
3 | Delhi | Delhi |
Dim_Store
:
Store_ID | Store_Name | City_ID | Manager_Name |
---|---|---|---|
501 | Nalli Silks | 4 | Ramesh Kumar |
502 | Fashion India | 5 | Sunita Reddy |
503 | ABC Spices | 3 | Amit Sharma |
Querying Data:
Fact_Sales
table with multiple dimension tables using their respective keys.Benefits:
The snowflake schema is a powerful data modeling technique that introduces normalization to reduce redundancy and improve data integrity. While it is complex than the star schema, it is well-suited for scenarios requiring detailed and structured data relationships.