This repository serves as my official Learning Journal and portfolio for the Level 5 Data Engineer Apprenticeship.
It documents my practical experience, the knowledge I have gained, project work, and personal reflections, aligned with the apprenticeship curriculum and the Data Engineer (Level 5) standard.
The journal is organized into modules/folders reflecting the core areas of data engineering practice, ensuring all competencies required for the End-Point Assessment (EPA) are logged and demonstrable.
- ./01_Core_Concepts/: Notes, definitions, and foundational knowledge (e.g., Data Architecture, Ethics, Governance).
- ./02_SQL_Data_Modelling/: SQL scripts, data modelling diagrams, and database practice.
- ./03_Python_ETL/: Python scripts for data manipulation (pandas), scripting, and basic ETL/ELT processes.
- ./04_Data_Pipelines_Orchestration/: Code and configurations for building, automating, and monitoring data workflows (e.g., Airflow, Azure Data Factory).
- ./05_Cloud_Infrastructure/: Notes and setup scripts (IaC) related to cloud platforms (AWS/Azure/GCP) for data solutions.
- ./06_Capstone_Project/: The final, significant project used for EPA preparation (e.g., building a complete, end-to-end data platform).
- ./Documentation/: Technical documentation, requirements gathering, and professional discussion preparation.
This journal documents hands-on mastery in the following core areas:
- Python: Advanced scripting, data manipulation with pandas/NumPy, and software development best practices (testing, version control); a minimal ETL sketch follows this list.
- SQL: Complex queries, stored procedures, performance tuning, and schema and data management (DDL/DML).
- PySpark/Scala (Optional): Working with distributed computing frameworks for Big Data processing.
- Data Modelling: Relational (3NF/dimensional) and NoSQL modelling for different use cases.
- Data Warehousing: Concepts of Data Lakes, Data Lakehouses, and Data Marts (e.g., Snowflake, Microsoft Fabric).
- Cloud Platforms: Implementation of data solutions using [AWS / Azure / GCP] services (e.g., S3/Blob Storage, EC2/VMs, RDS/Managed Databases).
- Pipeline Tools: Building and managing robust data pipelines using orchestrators such as Apache Airflow or Azure Data Factory (see the Airflow sketch after this list).
- Streaming: Experience with batch, micro-batch, and real-time streaming concepts (e.g., Kafka, Azure Stream Analytics).
- Version Control: Professional usage of Git and GitHub for collaborative development.
- Containerization: Introduction to Docker for dependency management and reproducible environments.
- CI/CD: Implementing basic Continuous Integration and Continuous Deployment for data pipelines (e.g., GitHub Actions).
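To illustrate the kind of work kept in ./03_Python_ETL/, here is a minimal sketch of an extract-transform-load script. The file names, column handling, and target path are illustrative assumptions rather than code from the journal itself.

```python
# Minimal ETL sketch: read a raw CSV, tidy it with pandas, and write a
# curated Parquet file. Paths and transformations are hypothetical.
import pandas as pd


def run_etl(source_csv: str, target_parquet: str) -> None:
    # Extract: read the raw CSV export
    df = pd.read_csv(source_csv)

    # Transform: drop incomplete rows and standardise column names
    df = df.dropna()
    df.columns = [col.strip().lower().replace(" ", "_") for col in df.columns]

    # Load: write the cleaned data to a columnar format for downstream use
    df.to_parquet(target_parquet, index=False)


if __name__ == "__main__":
    run_etl("raw/sales_export.csv", "curated/sales.parquet")
```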
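And as a companion to the Pipeline Tools entry above, a minimal orchestration sketch, assuming Apache Airflow 2.4+ as the scheduler. The DAG id, schedule, and task callables are illustrative assumptions, not a pipeline from this repository.

```python
# Minimal Airflow DAG sketch: two placeholder tasks run daily, with the
# load step depending on the extract step.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def extract():
    # Placeholder for the extract step (e.g., pull a file from blob storage)
    pass


def load():
    # Placeholder for the load step (e.g., write curated data to a warehouse)
    pass


with DAG(
    dag_id="example_daily_etl",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    extract_task = PythonOperator(task_id="extract", python_callable=extract)
    load_task = PythonOperator(task_id="load", python_callable=load)

    # Run load only after extract succeeds
    extract_task >> load_task
```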