Skip to content

LKEthridge/Time_Series

Folders and files

NameName
Last commit message
Last commit date

Latest commit

Β 

History

3 Commits
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 

Repository files navigation

Time_Series

This was a Time Series project for TripleTen. πŸ‘©πŸ½β€πŸ’»

This project builds a time series forecasting model to help Sweet Lift Taxi optimize driver allocation by predicting the number of ride orders for the next hour. The primary objective was to develop a model with an RMSE ≀ 48 on the test set, ensuring forecasts are accurate enough to support staffing decisions during peak periods.

The raw data, provided at 10-minute intervals, was resampled to hourly frequency to capture relevant patterns. Exploratory analysis revealed a slight upward trend, daily seasonality, and spikes in late April and August, likely influenced by seasonal tourism.

Several machine learning models were evaluated, including Random Forest, XGBoost, LightGBM, and CatBoost. While Random Forest performed well on training data (RMSE = 8.32), XGBoost proved the most robust on unseen data, achieving an RMSE of 39.10 on the test setβ€”well within the acceptable range.

The final model, XGBoost, offers accurate, generalizable predictions and serves as a reliable tool to guide driver deployment during high-demand periods.

Skills Highlighted

πŸ‚ Trends and Seasonality πŸ›‘ Stationary Series πŸ”„ Resampling & Feature Engineering πŸ›Ό Rolling Mean & Autoregressive Moving Average πŸ“† Time Series Differences & Forecasting πŸ’― Model Training, Selection, Autoregression, & Forecast Accuracy

Installation & Usage

  • This project uses pandas, numpy, matplotlib.pyplot, seaborn, time, lightgbm, catboost, xgboost, sklearn, statsmodels.tsa.seasonal, and pmdarima. It requires python 3.11.7.