Skip to content

ana-Analyses/sql-data-warehouse-project

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

73 Commits
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Data Warehouse & Analytics Project

Welcome to the Data Warehouse & Analytics Project repository! Design, as part of my Portfolio Projects, showcases industry best practices in data engineering and data analytics. This Project demonstrates a comprehensive Data Warehousing and Analytics solution, building a data warehouse to generate actionable insights.

Data Architecture


The data architecture for this project follows Medallion Architecture Bronze, Silver, and Gold layers:

Data Ar Diagram drawio

Bronze Layer:

Stores raw data as-is from the source systems. Data is ingested from CSV Files into SQL Server Database.

Silver Layer:

This layer includes data cleansing, standardisation, and normalisation processes to prepare data for analysis.

Gold Layer:

Houses business-ready data modelled into a star schema required for reporting and analytics.

Project Overview

This project involves:

-Data Architecture:

--Designing a Modern Data Warehouse Using Medallion Architecture Bronze, Silver, and Gold layers.

-ETL Pipelines:

--Extracting, transforming, and loading data from source systems into the warehouse.

-Data Modelling:

--Developing fact and dimension tables optimised for analytical queries.

-Analytics & Reporting:

--Creating SQL-based reports and dashboards for actionable insights.

Project Requirements

Building the Data Warehouse (Data Engineering)

Develop a modern Data Warehouse using SQL Server to consolidate sales data, enabling analytical reporting and informed decision-making.

Specification

-Data Sources:

--Import data from two source systems (ERP and CRM) provided as CSV files.

-Data Quality:

--Cleanse and resolve data quality issues prior to analysis.

-Integration:

--Combine both sources into a single, user-friendly data model designed for analytical queries.

-Scope:

--Focus on the latest dataset only; historization of data is not required.

-Documentation:

--Provide clear documentation of the data model to support both business stakeholders and analytics teams.

BI: Analytics & Reporting (Data Analysis)

Objective

Develop SQL-based analytics to deliver detailed insights into: -Customer Behaviour -Product Performance -Sales Trends

These insights empower stakeholders with key business metrics, enabling strategic decision-making.


Repository structure

data-warehouse-project/

datasets/

  • Raw datasets used for the project (ERP and CRM data)

docs/

  • Project documentation and architecture details
    • etl.drawio -- Draw.io file shows all different techniques and methods of ETL
    • data_architecture.drawio -- Draw.io file shows the project's architecture
    • data_catalog.md -- Catalogue of datasets, including field descriptions and metadata
    • data_flow.drawio -- Draw.io file for the data flow diagram
    • data_models.drawio -- Draw.io file for data models (star schema)
    • naming-conventions.md -- Consistent naming guidelines for tables, columns, and files

scripts/

  • SQL scripts for ETL and transformations

bronze/

  • Scripts for extracting and loading raw data

silver/

  • Scripts for cleaning and transforming data

gold/

  • Scripts for creating analytical models

tests/
-Test scripts and quality files

README.md/

  • Project overview and instructions

LICENSE/

  • License information for the repository

gitignore/

  • Files and directories to be ignored by Git

requirements.txt/

  • Dependencies and requirements for the project

About

Data Warehouse with SQL Server, including ETL processes, Data Modelling, and Analytics.

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages