This project analyzes customer shopping behavior to uncover purchasing patterns, revenue drivers, and customer segments. It demonstrates an end-to-end data analytics workflow using Python, SQL, and Power BI, transforming raw data into actionable business insights.
The business lacked a clear understanding of which customer segments and purchasing behaviors were driving revenue growth.
Despite offering subscriptions, discounts, and multiple shipping options, there was limited visibility into how these strategies influenced customer spending, loyalty, and retention.
As a result, marketing and promotional efforts were not fully optimized, leading to missed opportunities to increase subscription adoption, strengthen customer loyalty, and improve overall profitability.
This project aims to use data-driven analysis to identify high-value customer segments, evaluate the impact of discounts and shipping preferences, and support more effective business decision-making.
- Python: Pandas, SQLAlchemy, PyMySQL
- SQL: MySQL
- Power BI: Interactive dashboards & KPIs
Data Source β Python Cleaning β MySQL Analysis β Power BI Visualization
Python was used to clean, transform, and prepare the dataset for analysis.
Key steps included:
- Loading and exploring the dataset
- Handling missing values using median imputation
- Standardizing column names
- Creating age group segments using quantile binning
- Removing redundant columns
- Uploading cleaned data into MySQL using SQLAlchemy
π Code location: python/customer_shopping_behavior.ipynb
SQL queries were written to answer real business questions such as:
- What is the revenue split by gender?
- Do subscribed customers spend more than non-subscribers?
- Which products receive the highest customer ratings?
- How do discounts affect purchasing behavior?
- How are customers segmented (New, Returning, Loyal)?
- Which age groups contribute the most revenue?
π Queries available in: sql/customer_shopping_behavior.sql
An interactive Power BI dashboard was developed to visualize key metrics and trends, including:
- Total customers, average purchase amount, and average review rating
- Revenue and sales by product category
- Subscription status distribution
- Revenue and sales by age group
- Dynamic filtering by gender, category, subscription status, and shipping type
π Dashboard screenshots available in: power_bi
- Boost Subscriptions β Promote exclusive benefits for subscribers
- Customer Loyalty Programs β Reward repeat buyers to move them into the βLoyalβ segment.
- Review Discount Policy β Balance sales boosts with margin control
- Product Positioning β Highlight top-rated and best-selling products in campaigns.
- Targeted Marketing β Focus efforts on high-revenue age groups and express-shipping users
Tshedza Tshipuke
Aspiring Data Analyst
- GitHub: https://github.com/Itspukke
- LinkedIn: www.linkedin.com/in/tshedza-tshipuke-468516119





