Daam Track BD is an open-source research project dedicated to tracking and visualizing commodity prices in Bangladesh over long periods. The goal is to build a transparent, historical record of daily necessities—such as rice, vegetables, fish, and other essentials—to analyze inflation trends and market volatility.
This project uses a modern, "serverless" architecture designed for high performance and zero maintenance costs.
- React: UI library for building a responsive, interactive dashboard.
- Vite: Next-generation build tool for fast development.
- Tailwind CSS: Utility-first CSS framework for rapid, consistent styling.
- Apache ECharts: Powerful visualization library capable of rendering tens of thousands of data points smoothly (essential for 10+ year timelines).
- DuckDB WASM: An in-browser SQL OLAP database. It allows the frontend to query large, compressed Parquet files directly from the static server, enabling powerful analytics without a backend API.
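To make that concrete, here is a minimal sketch of the kind of aggregation the dashboard can run. It uses DuckDB's Python API for brevity; duckdb-wasm accepts the same SQL in the browser. The file glob and column names are illustrative assumptions, not the project's actual schema.

```python
# Minimal sketch: the kind of aggregation the dashboard runs client-side.
# Shown with DuckDB's Python API; duckdb-wasm accepts the same SQL in the
# browser. File names and columns below are assumptions for illustration.
import duckdb

con = duckdb.connect()
monthly = con.execute("""
    SELECT date_trunc('month', "date") AS month,
           avg(price)                  AS avg_price
    FROM read_parquet('public/data/prices_*.parquet')  -- year-partitioned files
    WHERE product = 'rice'
    GROUP BY month
    ORDER BY month
""").fetchdf()

print(monthly.head())
```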
- Source: Currently scraping Chaldal.com once daily.
- Playwright: Headless browser automation for robust scraping of dynamic e-commerce sites.
- Pandas: Data manipulation and cleaning.
- Parquet: The critical storage format. Data is saved in highly compressed, columnar Parquet files partitioned by year.
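As a rough sketch of that clean-and-save step (column names, units, and the per-year file naming are assumptions, not the real scraper's schema):

```python
# Rough sketch of the clean-and-save step. Column names, units, and the
# prices_<year>.parquet naming are illustrative assumptions.
import pandas as pd

scraped = pd.DataFrame({
    "date":    pd.to_datetime(["2024-01-15", "2024-01-15"]),
    "product": ["Miniket Rice", "Potato"],
    "price":   [82.0, 45.0],  # BDT per kg
})

# Basic cleaning: drop duplicates and obviously bad rows.
scraped = scraped.drop_duplicates().query("price > 0")

# Write one compressed, columnar file per year ("partitioned by year").
for year, chunk in scraped.groupby(scraped["date"].dt.year):
    chunk.to_parquet(f"public/data/prices_{year}.parquet",
                     compression="zstd", index=False)
```

Columnar compression keeps these files small enough to serve from static hosting and query directly with DuckDB.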
To prevent the main application repository from becoming bloated with daily data files, we use a Dual-Branch Strategy:
- `main` branch: Contains the application code (React, Vite, scraper logic).
- `database` branch: Contains ONLY the result data (Parquet files, JSON metadata, images).
How it works (Smart Merging): During the automated daily scrape (via GitHub Actions):
- The workflow checks out the code from `main`.
- It then checks out the `database` branch into the `/public` folder.
- The scraper runs and writes new data into `/public/data`.
- Since `/public` is tracking the `database` branch, the workflow simply commits and pushes these changes back to the `database` branch.
- The `main` branch remains clean, while the app fetches data via raw GitHub URLs pointing to the `database` branch.
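The same choreography can be reproduced locally. The sketch below mirrors the workflow's steps with plain git calls from Python; the real job is GitHub Actions YAML, and the repo URL here is a placeholder.

```python
# Local re-creation of the daily workflow's branch dance. The real job runs
# in GitHub Actions; the repo URL and exact commands are illustrative.
import subprocess

def run(cmd, cwd="."):
    """Run one command and fail loudly, like a CI step."""
    subprocess.run(cmd, cwd=cwd, check=True)

# `main` is assumed to be checked out already (as on the CI runner).
# Check out the `database` branch into /public so data commits land there.
run(["git", "clone", "--branch", "database", "--single-branch",
     "https://github.com/yourusername/daam-track-bd.git", "public"])

# The scraper writes fresh Parquet files into /public/data.
run(["python", "main.py"], cwd="scraper")

# Commit and push only the data; `main` never sees these files.
run(["git", "add", "data"], cwd="public")
run(["git", "commit", "-m", "chore: daily price data"], cwd="public")
run(["git", "push", "origin", "database"], cwd="public")
```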
The scraper relies on a `categories.json` file to know which URLs to visit.
- `fetch_categories.py`: This script navigates the specific structure of the target site (currently Chaldal) to discover all available product categories and generate the JSON mapping.
  - Note: This logic is site-specific and will need adjustment if the target site changes.
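For illustration, a stripped-down version of that discovery step might look like the sketch below; the selector and the output schema are assumptions, while the real script follows Chaldal's actual DOM.

```python
# Stripped-down sketch of category discovery. The selector and the JSON
# schema are assumptions; the real script targets Chaldal's actual DOM.
import json
from playwright.sync_api import sync_playwright

with sync_playwright() as p:
    browser = p.chromium.launch()
    page = browser.new_page()
    page.goto("https://chaldal.com", wait_until="networkidle")
    # Collect candidate category links (illustrative selector).
    categories = page.eval_on_selector_all(
        "nav a[href]",
        "els => els.map(e => ({name: e.textContent.trim(), url: e.href}))",
    )
    browser.close()

with open("categories.json", "w", encoding="utf-8") as f:
    json.dump([c for c in categories if c["name"]], f,
              indent=2, ensure_ascii=False)
```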
- Production: The app is hardcoded to fetch data from the remote `database` branch.
- Development:
  - By default, the app looks for local data in `/public/data` (which is gitignored in `main`).
  - Fake Data Generator: For local development, we use `generate_fake_data.py`. This script populates `/public/data` with 10 years of realistic synthetic data, allowing you to stress-test the charts without needing the massive real dataset (see the sketch below).
  - Data Toggle: In dev mode, a "Use Remote Data" toggle appears in the UI. This allows checking the real production data without building the app.
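A minimal version of that generator could look like this; the product list, base prices, and drift parameters are invented for illustration.

```python
# Minimal sketch of a fake-data generator: a daily random walk per product
# over ten years. Products, base prices, and drift are invented numbers.
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)
dates = pd.date_range("2014-01-01", "2023-12-31", freq="D")

frames = []
for product, base_price in [("rice", 60.0), ("potato", 30.0), ("hilsa", 900.0)]:
    # Multiplicative random walk: prices stay positive, with mild upward drift.
    walk = np.exp(np.cumsum(rng.normal(loc=0.0002, scale=0.01, size=len(dates))))
    frames.append(pd.DataFrame({
        "date": dates,
        "product": product,
        "price": (base_price * walk).round(2),
    }))

fake = pd.concat(frames, ignore_index=True)
# Split per year, mirroring the real dataset's layout.
for year, chunk in fake.groupby(fake["date"].dt.year):
    chunk.to_parquet(f"public/data/prices_{year}.parquet", index=False)
```

Ten years of daily rows across several products is enough volume to exercise ECharts' large-series rendering in dev mode.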
- `/src`: React frontend application.
- `/scraper`: Python scripts using Playwright.
- `/public/data`: (Gitignored in `main`) Location for local/fake data.
- `.github/workflows`: Actions for daily scraping and site deployment.
- Node.js (v18+)
- pnpm (v8+)
- Python (v3.9+)
- Clone the repo

  ```bash
  git clone https://github.com/yourusername/daam-track-bd.git
  cd daam-track-bd
  ```

- Frontend Setup

  ```bash
  pnpm install
  pnpm dev
  ```

- Generating Fake Data (For Dev)

  Since the real data lives on the separate `database` branch, you need local data to see the charts work in dev mode.

  ```bash
  # Generates 10 years of synthetic history in /public/data
  python generate_fake_data.py
  ```

- Running the Scraper (Optional)

  ```bash
  cd scraper
  pip install -r requirements.txt
  playwright install chromium

  # 1. Update Category List (Optional)
  python fetch_categories.py

  # 2. Scrape Prices
  python main.py
  ```
Contributions are welcome, whether that's adding new data sources, improving the categorization algorithm, or enhancing the chart visualizations.
MIT License - Free to use, modify, and distribute for any purpose.
