hsjoi0214/README.md

PJ · Embedded Software → Data/AI Engineer

E-mobility software engineer for embedded systems, now transitioning to data engineering and ML systems out of strong interest and curiosity.


Professional Background

Experienced Embedded Software Engineer with a strong foundation in automation, data-driven systems, and scalable software architectures.
Currently transitioning into Data Engineering and Applied Machine Learning, leveraging a deep understanding of system design, data flows, and distributed computation.

Technical Alignment with Data Engineering

  • badge
    Engineered complex automation and control systems using PA-Base/Script, an object-oriented scripting environment conceptually similar to Python/C++. This work built strong foundations in modular software design, data manipulation, and process automation.

  • badge

    1. Designed and deployed automated data acquisition and transformation pipelines for large-scale battery testing, analogous to modern ETL (Extract, Transform, Load) workflows in data engineering.
    2. Implemented process control flows via DAG-based orchestration (PA-Graph), mirroring dependency management in tools like Apache Airflow.
  • badge

    1. Developed structured and distributed databases for managing cell, pack, and end-of-line test data, conceptually aligned with PostgreSQL, AWS RDS, and DynamoDB architectures.
    2. Implemented cloud-based data synchronization for global test environments, paralleling AWS S3 and Azure Data Lake solutions.
  • badge

    1. Analyzed large-scale battery performance data to detect trends and anomalies using statistical and algorithmic reasoning, laying the groundwork for machine learning workflows.
    2. Built user-facing dashboards (PA-Design) for visualization and reporting, comparable to frameworks like Streamlit or Plotly.
  • badge

    1. Built real-time monitoring solutions for distributed test systems, providing insight into data quality, system health, and performance, conceptually aligned with Prometheus, Grafana, and AWS CloudWatch.
    2. Defined alerting and metric-tracking logic for anomaly detection and proactive maintenance.
  • badge
    Automated deployment and testing pipelines for hardware-software integration, extending continuous integration and delivery (CI/CD) concepts into data and MLOps workflows.

  • badge
    Led global customer training sessions across Europe, the USA, and China, and authored internal documentation and user guides to standardize testing and data workflows.
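
As a small illustration of the DAG-style dependency management mentioned above, here is a minimal Python sketch using the standard-library `graphlib` module. It is not PA-Graph or Airflow code; the step names and the `run_order` helper are hypothetical and only show how a dependency graph yields a valid execution order.

```python
from graphlib import TopologicalSorter

# Hypothetical pipeline steps: each key maps to the set of steps it depends on.
pipeline = {
    "acquire": set(),
    "archive_raw": {"acquire"},
    "validate": {"acquire"},
    "transform": {"validate"},
    "load": {"transform"},
    "report": {"load"},
}


def run_order(dag):
    """Return one valid execution order in which every step follows its dependencies."""
    return list(TopologicalSorter(dag).static_order())


order = run_order(pipeline)
print(order)
```

Independent steps such as `archive_raw` and `validate` may appear in either order; an orchestrator could run them in parallel once `acquire` has finished.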

Broader Experience

  • Developed full-stack applications and data science / ML projects, demonstrating proficiency across both software engineering and data infrastructure layers.
  • Familiar with AWS Cloud, Python, SQL, Databricks, Terraform, Docker, and CI/CD pipelines.

My experience in embedded systems taught me to build reliable, data-centric automation in distributed environments, skills that map directly to modern data engineering and cloud computing.

Email Website Medium


Skills & Transition Path

  • focus
    Transitioning to production-grade data engineering, data science, and applied ML work.

  • skills
    AWS Cloud Solutions: Glue, Lambda, API Gateway, S3, IaC (Terraform, CloudFormation), Simple Data Lake, CloudWatch, Cost Explorer, RDS, DynamoDB, IAM, VPC Security, Databricks, Jenkins (CI/CD), Airflow (DAGs).

  • technical
    AWS (Cloud): Lambda, S3, API Gateway, RDS, DynamoDB, IAM, Service Catalog, Terraform (IaC), CloudWatch, Cost Explorer, EKS, SQS, Glue, Athena, VPC, and others.

    Programming & Tools: Python, SQL, Unix Shell Scripting, PySpark, ETL.

    DevOps & Automation: CI/CD, Git, Jenkins, Airflow, Terraform (IaC), Kafka (Basic), Containerisation (EKS, Docker).

    Design & Architecture: System Design, Client-Server Architecture, Microservices, Serverless Architecture, Event-Driven Architecture, Data Modeling, Database Design.

    Observability & Monitoring: OpenTelemetry (OTel), Jaeger, Databricks, Prometheus, Grafana, custom DIY Monitoring & Observability Panel.


Featured Projects (learning + build)

RAGbot
RAG chatbot for Crime and Punishment — IR + LLM via Streamlit.

Housing Price Prediction
Feature-engineered XGBoost pipeline; Streamlit app; Kaggle RMSE 0.12033.

Brazil Market Expansion
SQL + Tableau dashboards on an artificial Brazil market dataset; structured insights & schema design.

Eniac Discount Analysis
Discount strategy & product segmentation analysis on €7.8M revenue; seasonal demand & margin impact.

Weather App
Minimalist JS + OpenWeather: essentials + outfit suggestions.

Movie Night
CLI scraper curating top 50 films of 2023; filters + GCS/Heroku.


Current Work & Learning

  • working

    1. Knowledge app integrating multiple APIs, Supabase (PostgreSQL), a hosting environment, and a recommendation system (private repo, permission-based access). Status: done.
    2. Medium article explaining the detailed workflow of the Knowledge app. Status: done.
    3. Personal blogging website built from scratch — roadmap includes adding a text-to-speech model (private repo).
    4. Medium article explaining the workings of Recurrent Neural Networks (RNNs) and Convolutional Neural Networks (CNNs) in depth.
  • learning

    1. Agentic knowledge graph construction
    2. Building AI Agents and Agentic Workflows

Journey & Achievements

Moving closer to downstream data roles through projects, certifications, and writing:


Collaboration & Contact

  • collaborate
    Open to and excited about collaborating on end-to-end data engineering, data science, and applied ML projects — from small builds to production-grade pipelines.

  • askme
    From embedded systems to end-to-end data workflows: engineering pipelines, applied ML, RAG and deep learning — deployed with DataOps/DevOps practices (CI/CD, IaC, automation, monitoring, Docker/Kubernetes).

  • contact

  • funfact

Pinned

  1. RAGbot (Public)

    Retrieval-Augmented Generation (RAG) chatbot for exploring Crime and Punishment — combines document retrieval with LLMs to deliver context-aware, literary insights via a Streamlit app.

    Python

  2. housing-price-prediction (Public)

    End-to-end Kaggle house price predictor with domain-driven feature engineering, 2-stage feature filtering, and XGBoost — deployed as an interactive Streamlit app.

    Python

  3. brazil-market-expansion (Public)

    Data storytelling project analyzing an artificial Brazil market dataset — SQL + Tableau dashboards with structured insights and schema design.

  4. eniac-discount-analysis (Public)

    Data analysis project exploring discount strategies and product segmentation on €7.8M revenue dataset — uncovering insights on seasonal demand and margin impact.

    Python

  5. movie-night (Public)

    Command-line app for discovering and selecting the top 50 movies of 2023 — uses Python web scraping, filtering, and Google Cloud storage to simplify group movie night decisions.

    Python

  6. weather-app (Public)

    Minimalist weather app built with JavaScript + OpenWeather API — delivers essentials (temperature, highs/lows, wind, visibility) and offers simple outfit recommendations based on daily forecasts.

    JavaScript