GoldSight AI is a production-grade MLOps platform designed to predict, track, and monitor the price of gold. It uses a serverless architecture on AWS to automate the machine learning lifecycle, from daily data ingestion to model serving and drift detection.
- Automated ETL Pipeline: Daily extraction from Yahoo Finance, technical indicator calculation, and storage in S3 as Parquet.
- Versioned Model Registry: Full experiment tracking and model versioning integrated with MLflow and DagsHub.
- Predictive Inference: Serverless model execution via AWS Lambda providing daily "Next Day" forecasts.
- Real-time Monitoring: Automated data drift detection using Evidently AI to ensure model reliability.
- Unified Dashboard: High-fidelity Gradio interface for market visualization and prediction insights.
- Developer API: FastAPI service offering REST endpoints for on-demand predictions.
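The technical indicator step of the ETL pipeline can be illustrated with a minimal sketch. The source does not say which indicators are computed, so the simple moving average and daily return here are assumptions chosen for illustration. The function names and sample prices are hypothetical. The project itself lists Pandas in its stack; this sketch uses plain Python only to stay dependency-free.

```python
from statistics import mean


def simple_moving_average(prices, window):
    """Return the SMA series; positions without a full window are None."""
    if window <= 0:
        raise ValueError("window must be positive")
    out = []
    for i in range(len(prices)):
        if i + 1 < window:
            out.append(None)  # not enough history yet
        else:
            out.append(mean(prices[i + 1 - window : i + 1]))
    return out


def daily_returns(prices):
    """Return simple day-over-day returns; the first entry is None."""
    if not prices:
        return []
    out = [None]  # no previous day for the first observation
    for prev, curr in zip(prices, prices[1:]):
        out.append((curr - prev) / prev)
    return out


# Hypothetical closing prices, for illustration only (not real market data).
closes = [100.0, 102.0, 101.0, 105.0, 107.0]
sma_3 = simple_moving_average(closes, 3)
returns = daily_returns(closes)
```

In a Pandas-based pipeline the same columns would typically be derived with rolling-window and percentage-change operations before the frame is written to Parquet.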
The system is built on a modern MLOps stack for scalability and cost-efficiency:
- Ingestion & Processing (ETL): AWS Lambda triggered by EventBridge fetches data and saves processed Parquet files to Amazon S3.
- Training Pipeline: Retrains candidate models, including a classical Random Forest and potentially LSTM/ARIMA, and registers the best performer as the "Champion" in MLflow.
- Inference Layer: Runs daily to generate forecasts. It performs drift analysis between the training distribution and incoming live data.
- Serving Layer: A Dockerized container running on AWS App Runner that hosts both the Gradio Dashboard and the FastAPI backend.
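To make the drift analysis in the inference layer concrete, here is a minimal sketch of one common drift statistic, the Population Stability Index (PSI), comparing a training sample with a live sample. This is not Evidently AI's API and not necessarily the metric the project uses; it is a stand-in to show the idea. The function name, bin count, and sample values are assumptions.

```python
import math


def population_stability_index(reference, current, bins=10):
    """PSI between two numeric samples, using equal-width bins from the reference."""
    if not reference or not current:
        raise ValueError("both samples must be non-empty")
    lo, hi = min(reference), max(reference)
    if hi == lo:
        raise ValueError("reference sample has no spread")
    width = (hi - lo) / bins

    def bin_shares(values):
        counts = [0] * bins
        for v in values:
            # Clamp values outside the reference range into the edge bins.
            idx = int((v - lo) / width)
            idx = min(max(idx, 0), bins - 1)
            counts[idx] += 1
        # A small floor avoids log(0) for empty bins.
        return [max(c / len(values), 1e-6) for c in counts]

    ref_shares = bin_shares(reference)
    cur_shares = bin_shares(current)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref_shares, cur_shares))


# Hypothetical feature values, for illustration only.
training_sample = [i / 100 for i in range(100)]
live_sample = [x + 0.5 for x in training_sample]

psi_same = population_stability_index(training_sample, training_sample)
psi_shifted = population_stability_index(training_sample, live_sample)
```

A commonly cited rule of thumb treats a PSI above roughly 0.25 as a significant shift, though any alerting threshold is a project-specific choice.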
- Core: Python 3.12 (Pandas, Scikit-learn, MLflow, FastAPI)
- UI/UX: Gradio, Plotly
- Cloud (AWS): Lambda, EventBridge, S3, ECR, App Runner, IAM
- Tooling: Docker, Evidently AI, DagsHub
├── api/ # FastAPI application code
├── dashboard/ # Gradio & Streamlit dashboard implementations
├── etl/ # Data extraction & transformation logic (Lambda)
├── inference/ # Daily prediction & drift detection scripts
├── training/ # Model training & registry logic
├── utils/ # Shared helper functions (S3, MLflow)
├── Dockerfile # Container config for App Runner (Dashboard/API)
├── Dockerfile.lambda # Container config for AWS Lambda functions
├── start.sh # Production startup script
└── requirements.txt # Project dependencies
- Clone the repository:

      git clone https://github.com/martijnooo/prediction-project.git
      cd prediction-project

- Create Virtual Environment:

      python -m venv venv
      source venv/bin/activate  # Windows: venv\Scripts\activate
      pip install -r requirements.txt

- Configure Environment: Create a `.env` file with your AWS and DagsHub credentials.

- Run Dashboard Locally:

      python dashboard/app_gradio.py
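Before starting the dashboard, it can help to confirm that the credentials from the `.env` file are visible to the process. The sketch below assumes the file has already been loaded into the environment (for example by a tool such as python-dotenv). The variable names are assumptions: they are the standard names read by the AWS SDK and MLflow, but this repository may expect different ones. The helper `missing_env_vars` is hypothetical.

```python
import os

# Assumed variable names: standard AWS SDK and MLflow settings,
# not confirmed by this repository.
REQUIRED_VARS = [
    "AWS_ACCESS_KEY_ID",
    "AWS_SECRET_ACCESS_KEY",
    "AWS_DEFAULT_REGION",
    "MLFLOW_TRACKING_URI",
    "MLFLOW_TRACKING_USERNAME",
    "MLFLOW_TRACKING_PASSWORD",
]


def missing_env_vars(required=REQUIRED_VARS, environ=None):
    """Return the names in `required` that are unset or empty."""
    env = os.environ if environ is None else environ
    return [name for name in required if not env.get(name)]


missing = missing_env_vars()
if missing:
    print("Missing environment variables:", ", ".join(missing))
```

Passing a plain dictionary as `environ` makes the check easy to test without touching the real environment.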
- Lambdas: Use `Dockerfile.lambda` to build the image and push to ECR. Update Lambda functions to use the latest image.
- App Runner: Use `Dockerfile` to build the image and push to ECR. Create an App Runner service pointing to this image on port 8080.
Developed as part of a modern financial MLOps implementation.