This project is a backend FastAPI application that demonstrates how to scrape data from the football section of the website livescore.in. It is strictly for instructional and educational purposes. It is not intended for production web scraping or for storing live score data.
Scraping content from websites without permission can violate their terms of service and legal policies. This project:
- Implements rate limiting to reduce requests.
- Simulates human behavior through a wait mechanism to avoid overloading the target site.
- Does not store any scraped data.
The primary goal is to provide developers with an educational example of how to structure a web scraping backend with FastAPI.
livescore-api
├── app
│ ├── routers
│ │ ├── archive.py
│ │ ├── country.py
│ │ ├── league.py
│ │ ├── match.py
│ ├── services
│ │ ├── models
│ │ │ ├── archive_schemas.py
│ │ │ ├── country_schemas.py
│ │ │ ├── league_schemas.py
│ │ │ ├── match_schemas.py
│ │ │ ├── utils.py
│ │ ├── scraper
│ │ │ ├── archive_scraper.py
│ │ │ ├── country_scraper.py
│ │ │ ├── leagues_scraper.py
│ │ │ ├── match_scraper.py
│ │ │ ├── scraper.py
│ │ │ ├── utils.py
├── tests
│ ├── main.py
├── logger
├── config.py
├── .gitignore
├── README.md
├── requirements.txt
- app/routers: Contains API endpoints for the different scraping functionalities.
- app/services/models: Contains schemas for validation and utility functions for data processing.
- app/services/scraper: Core logic for scraping the livescore football page and individual match data.
- config.py: Variables for configuration.
- logger: Custom logging configuration for monitoring application behavior.
- tests: Contains test cases to validate the scraping logic and API functionality.
The project uses a config.py file to manage configuration. Below are the available variables:
DEBUG=False
URL_LIVESPORT=https://www.livescore.in/football/
URL_LIVESPORT_MATCH=https://www.livescore.in/match/{MATCH_ID}/#/match-summary/match-statistics/0
TIMEOUT=30
LIMIT=10
RATE_LIMITING_FREQUENCY=2/1minute
RATE_LIMITING_ENABLE=True
SIMULATE_WAITING_HUMAN_BEING=10
- DEBUG: Toggles debug mode (default: False).
- URL_LIVESPORT: Base URL for scraping football scores.
- URL_LIVESPORT_MATCH: URL template for scraping match-specific statistics.
- TIMEOUT: Timeout in seconds for each request to Livesport.
- LIMIT: Maximum number of clicks on 'show-more' buttons on Livesport.
- RATE_LIMITING_FREQUENCY: Limits the number of requests per time window (e.g., 2/1minute allows 2 requests per minute).
- RATE_LIMITING_ENABLE: Enables or disables rate limiting.
- SIMULATE_WAITING_HUMAN_BEING: Delay in seconds inserted to mimic human user behavior.
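The variables above could be collected into a typed settings object along the following lines. This is a minimal sketch, not the project's actual config.py: the `Settings` class, the `load_settings` function, and the idea of letting environment variables override the listed defaults are all assumptions made for illustration.

```python
import os
from dataclasses import dataclass


def _as_bool(value: str) -> bool:
    """Interpret common truthy spellings; anything else counts as False."""
    return value.strip().lower() in {"1", "true", "yes", "on"}


@dataclass(frozen=True)
class Settings:
    """Hypothetical container for the variables documented above."""
    debug: bool
    url_livesport: str
    url_livesport_match: str
    timeout: int
    limit: int
    rate_limiting_frequency: str
    rate_limiting_enable: bool
    simulate_waiting_human_being: int


def load_settings(env=None) -> Settings:
    """Build Settings from a mapping (os.environ by default).

    The fallback values are the ones listed in this README.
    """
    env = os.environ if env is None else env
    return Settings(
        debug=_as_bool(env.get("DEBUG", "False")),
        url_livesport=env.get(
            "URL_LIVESPORT", "https://www.livescore.in/football/"
        ),
        url_livesport_match=env.get(
            "URL_LIVESPORT_MATCH",
            "https://www.livescore.in/match/{MATCH_ID}/"
            "#/match-summary/match-statistics/0",
        ),
        timeout=int(env.get("TIMEOUT", "30")),
        limit=int(env.get("LIMIT", "10")),
        rate_limiting_frequency=env.get("RATE_LIMITING_FREQUENCY", "2/1minute"),
        rate_limiting_enable=_as_bool(env.get("RATE_LIMITING_ENABLE", "True")),
        simulate_waiting_human_being=int(
            env.get("SIMULATE_WAITING_HUMAN_BEING", "10")
        ),
    )
```

Passing an explicit mapping, for example `load_settings({"DEBUG": "true", "LIMIT": "3"})`, makes it easy to exercise different configurations in tests without touching the real environment.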
- Python 3.9+
- Virtual Environment (optional but recommended)
- Clone the repository:
    git clone <repository-url>
    cd livescore-api
- Create and activate a virtual environment:
    python -m venv venv
    source venv/bin/activate  # On Windows: venv\Scripts\activate
- Install the dependencies:
    pip install -r requirements.txt
- Set up the environment variables: Create a .env file in the root directory and populate it with the variables listed above.
- Run the application:
    uvicorn app.main:app --reload
This project can also be built and run using Docker.
- Build the Docker image:
    docker build -t livescore-api .
- Run the Docker container:
    docker run -p 8000:8000 livescore-api
The application will be available at http://localhost:8000.
- Archive Data (/archive): Retrieve historical data for football matches.
- Country Information (/country): Scrape data related to football leagues by country.
- League Data (/league): Fetch details of specific leagues.
- Match Data (/match): Scrape and return statistics of a specific match using the MATCH_ID.
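To illustrate how a MATCH_ID might be turned into the page URL that the match scraper visits, the sketch below fills the URL_LIVESPORT_MATCH template from the configuration section. The `build_match_url` helper and the sample ID `abc123` are hypothetical; the project's own code may build the URL differently.

```python
from urllib.parse import quote

# Template copied from the configuration section of this README.
URL_LIVESPORT_MATCH = (
    "https://www.livescore.in/match/{MATCH_ID}/"
    "#/match-summary/match-statistics/0"
)


def build_match_url(match_id: str, template: str = URL_LIVESPORT_MATCH) -> str:
    """Return the statistics page URL for one match (hypothetical helper)."""
    cleaned = match_id.strip() if match_id else ""
    if not cleaned:
        raise ValueError("match_id must be a non-empty string")
    # Percent-encode the ID so it cannot inject extra path segments.
    return template.format(MATCH_ID=quote(cleaned, safe=""))


if __name__ == "__main__":
    # "abc123" is a made-up ID used only for illustration.
    print(build_match_url("abc123"))
```

Encoding the ID before substitution is a defensive choice: an ID containing `/` or `#` would otherwise change the structure of the resulting URL.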
This project is designed for educational purposes only. By running or using this project, you agree:
- Not to use it for unauthorized or illegal scraping.
- To comply with all applicable laws and the target website's terms of service.
Rate limiting and human-like delay mechanisms are implemented to minimize server impact and simulate realistic behavior. This project does not promote or endorse unethical practices.
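The two mechanisms can be sketched with the standard library alone. This is an illustrative sketch under stated assumptions, not the project's implementation: `parse_frequency`, `SlidingWindowLimiter`, and `human_pause` are hypothetical names, and the sliding-window strategy is one of several ways to enforce a limit such as 2/1minute.

```python
import re
import time
from collections import deque

_UNIT_SECONDS = {"second": 1, "minute": 60, "hour": 3600, "day": 86400}


def parse_frequency(spec: str):
    """Parse a string like '2/1minute' into (max_requests, window_seconds)."""
    match = re.fullmatch(
        r"(\d+)\s*/\s*(\d*)\s*(second|minute|hour|day)s?", spec.strip().lower()
    )
    if match is None:
        raise ValueError(f"unrecognized frequency: {spec!r}")
    count = int(match.group(1))
    multiple = int(match.group(2) or 1)  # '2/minute' means one minute
    return count, multiple * _UNIT_SECONDS[match.group(3)]


class SlidingWindowLimiter:
    """Allow at most max_requests calls within any window_seconds span."""

    def __init__(self, max_requests, window_seconds, clock=time.monotonic):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._clock = clock
        self._hits = deque()  # timestamps of accepted requests

    def allow(self) -> bool:
        now = self._clock()
        # Forget requests that have aged out of the window.
        while self._hits and now - self._hits[0] >= self.window_seconds:
            self._hits.popleft()
        if len(self._hits) < self.max_requests:
            self._hits.append(now)
            return True
        return False


def human_pause(seconds, sleep=time.sleep):
    """Wait a fixed number of seconds, as SIMULATE_WAITING_HUMAN_BEING suggests."""
    sleep(seconds)
```

A request handler would call `allow()` before scraping and reject the request when it returns False, then call `human_pause(...)` between page interactions. The injectable `clock` and `sleep` arguments exist only so the behavior can be tested without real waiting.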
This project is licensed under the MIT License. See the LICENSE file for details.
This project is for educational purposes only. Unauthorized scraping may result in legal action. Use responsibly and at your own risk.