Transportation Data Science Project
Overview
This study analyzes how vehicle characteristics (age, make, and model) relate to collision involvement in New York City. Using the New York Police Department’s Motor Vehicle Collisions dataset, it explores patterns and correlations between vehicle types and their involvement in crashes.
Methodology
Dataset: NYPD Motor Vehicle Collisions
Tools & Libraries: Python, Pandas, Matplotlib, Seaborn, Folium
Analysis Techniques: Descriptive statistics, data visualizations (bar charts, scatter plots, maps)
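As a rough illustration of the descriptive-statistics step, the sketch below counts crash involvements per make and model with Pandas. The column names (`VEHICLE_MAKE`, `VEHICLE_MODEL`, `VEHICLE_YEAR`) are assumptions about the dataset's layout, and the rows are made-up placeholders, not real records or results from this project.

```python
import pandas as pd

# Placeholder rows for illustration only; NOT real collision records.
# Column names are assumed and may differ from the actual export.
records = [
    {"VEHICLE_MAKE": "toyota ", "VEHICLE_MODEL": "Camry", "VEHICLE_YEAR": 2015},
    {"VEHICLE_MAKE": "TOYOTA", "VEHICLE_MODEL": "camry", "VEHICLE_YEAR": 2012},
    {"VEHICLE_MAKE": "Toyota", "VEHICLE_MODEL": "CAMRY", "VEHICLE_YEAR": 2018},
    {"VEHICLE_MAKE": "Honda", "VEHICLE_MODEL": "Accord", "VEHICLE_YEAR": 2016},
    {"VEHICLE_MAKE": "HONDA", "VEHICLE_MODEL": "accord ", "VEHICLE_YEAR": 2010},
    {"VEHICLE_MAKE": "Ford", "VEHICLE_MODEL": "Mustang", "VEHICLE_YEAR": 2019},
]
vehicles = pd.DataFrame(records)

# Free-text fields are often inconsistent, so normalize case and whitespace
# before grouping.
for col in ["VEHICLE_MAKE", "VEHICLE_MODEL"]:
    vehicles[col] = vehicles[col].str.strip().str.upper()

# Summary statistics (count, mean, min, max, ...) for the model year.
year_summary = vehicles["VEHICLE_YEAR"].describe()

# Number of crash involvements per (make, model), most frequent first.
model_counts = (
    vehicles.groupby(["VEHICLE_MAKE", "VEHICLE_MODEL"])
    .size()
    .sort_values(ascending=False)
)
top_make, top_model = model_counts.index[0]

print(year_summary)
print(model_counts)
```

A Series like `model_counts` can be passed to `model_counts.head(10).plot(kind="bar")` to produce a bar chart with Matplotlib.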
Key Findings
- Vehicle Make & Model: Sedans and sports cars were involved in the most crashes. The Toyota Camry was the most frequently involved model, possibly because it is so common in NYC. Sports cars, designed for speed, may be at higher risk because of riskier driving behavior.
- Vehicle Age: Newer vehicles were involved in fewer crashes during their first years after release. Collision rates dropped in 2019, which the study tentatively attributes to reduced road traffic during the COVID-19 pandemic. Vehicles with advanced safety features appear to have lower crash rates, although the dataset does not record safety features directly.
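The vehicle-age findings above rest on comparing each vehicle's model year with the year of the crash. A minimal sketch of that calculation follows, assuming columns named `CRASH_DATE` and `VEHICLE_YEAR` and using made-up placeholder rows; the rule of dropping negative ages is an illustrative choice, not necessarily the cleaning step used in the project.

```python
import pandas as pd

# Placeholder rows for illustration only; NOT real collision records.
records = [
    {"CRASH_DATE": "2018-06-01", "VEHICLE_YEAR": 2018},
    {"CRASH_DATE": "2018-09-12", "VEHICLE_YEAR": 2010},
    {"CRASH_DATE": "2019-01-20", "VEHICLE_YEAR": 2012},
    {"CRASH_DATE": "2019-07-04", "VEHICLE_YEAR": 2019},
    {"CRASH_DATE": "2020-02-11", "VEHICLE_YEAR": 2012},
    {"CRASH_DATE": "2020-05-30", "VEHICLE_YEAR": 2099},  # implausible entry
]
vehicles = pd.DataFrame(records)

# Derive the crash year, then the vehicle's age at the time of the crash.
vehicles["CRASH_YEAR"] = pd.to_datetime(vehicles["CRASH_DATE"]).dt.year
vehicles["VEHICLE_AGE"] = vehicles["CRASH_YEAR"] - vehicles["VEHICLE_YEAR"]

# Assumed cleaning rule: discard rows whose model year lies after the crash.
valid = vehicles[vehicles["VEHICLE_AGE"] >= 0]

# Crash involvements by vehicle age and by calendar year.
crashes_by_age = valid["VEHICLE_AGE"].value_counts().sort_index()
crashes_by_year = valid.groupby("CRASH_YEAR").size()

print(crashes_by_age)
print(crashes_by_year)
```

These are raw counts of involvements. Turning them into rates would require the number of vehicles of each age on the road, which the collisions dataset does not contain.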
Limitations & Future Research
The dataset lacks information on vehicle safety features and on vehicles not involved in crashes, so it supports comparisons of crash counts but not true per-vehicle crash rates. Future studies should include safety feature data and explore how those features influence accident rates. Understanding these patterns is crucial for improving urban road safety.
Acknowledgments
Special thanks to the Northeast Big Data Innovation Hub, National Student Data Corps, and the U.S. Department of Transportation Federal Highway Administration for their support.