zegron/webscraper

🕸️ Python Web Scraper with CSV/Excel Export

A flexible Python web scraper that lets you:

  • Input any URL at runtime
  • Preview available HTML tags and select which elements to scrape
  • Export scraped data to CSV or Excel
  • Choose tags dynamically (no hardcoded tag list)
  • Confirm before final scraping
  • Recover from invalid input without the script crashing
  • See where the output file was saved once scraping finishes
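As a rough illustration of the tag preview step, the sketch below counts every tag name in an HTML string. It is not taken from scraper.py: the names `TagCounter` and `list_tags` are hypothetical, and it uses the standard library's `html.parser` instead of BeautifulSoup so that it runs without installing anything.

```python
from collections import Counter
from html.parser import HTMLParser


class TagCounter(HTMLParser):
    """Counts every opening tag seen while parsing an HTML document."""

    def __init__(self):
        super().__init__()
        self.counts = Counter()

    def handle_starttag(self, tag, attrs):
        # html.parser lowercases tag names before calling this hook,
        # and also routes self-closing tags such as <br/> through it.
        self.counts[tag] += 1


def list_tags(html):
    """Return a Counter mapping tag name -> number of occurrences."""
    parser = TagCounter()
    parser.feed(html)
    parser.close()
    return parser.counts


# Hypothetical sample page; the real script would fetch this with requests.
sample = "<html><body><h1>Title</h1><p>One</p><p>Two</p></body></html>"
for tag, count in sorted(list_tags(sample).items()):
    print(f"{tag}: {count}")
```

A preview like this only sees tags present in the raw HTML response, which is why JavaScript-rendered content would not show up.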

🚨 Important Note

This scraper does not currently support JavaScript-rendered pages.
Support for JS-rendered pages (via Selenium or Playwright) is planned for a future release.


🚀 Features

✔ Dynamic Tag Detection – Pre-scrapes the page and lists all available HTML tags
✔ User-Controlled Scraping – Select which tags you want to scrape
✔ Multiple Export Options – Save as CSV or Excel
✔ Error Handling – Handles invalid choices without crashing
✔ Clear Exit Options – Press q anytime to quit
✔ File Path Confirmation – Confirms where your files were saved
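One way the error handling and the q-to-quit behaviour could fit together is sketched below. The function `parse_tag_choice` and its return convention are assumptions made for illustration, not the script's actual interface: it returns `None` when the user asks to quit, and otherwise separates the requested tags into those found on the page and those that were not.

```python
def parse_tag_choice(raw, available):
    """Interpret one line of user input (hypothetical helper).

    Returns None if the user typed q, otherwise a tuple
    (selected, rejected) of tag-name lists.
    """
    text = raw.strip().lower()
    if text == "q":
        return None
    # Accept a comma-separated list and ignore empty entries.
    requested = [part.strip() for part in text.split(",") if part.strip()]
    selected, rejected = [], []
    for tag in requested:
        target = selected if tag in available else rejected
        if tag not in target:  # skip duplicates, keep first-seen order
            target.append(tag)
    return selected, rejected


# Assumed set of tags reported by the preview step.
available = {"h1", "h2", "p", "a"}
print(parse_tag_choice("p, H1, table", available))
print(parse_tag_choice("q", available))
```

Returning the rejected names instead of raising lets the caller print a message and prompt again, which matches the "without crashing" behaviour described above.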


🛠️ Requirements

  • Python 3.8+
  • The following Python libraries (see requirements.txt):
    • requests
    • beautifulsoup4
    • pandas

📥 Installation

Clone the repository:

git clone https://github.com/YOUR_USERNAME/python-web-scraper.git
cd python-web-scraper

Install dependencies:

pip install -r requirements.txt

▢️ Usage

Run the scraper:

python scraper.py

The script then guides you through these steps:

  1. Enter the URL you want to scrape
  2. The script previews all HTML tags found
  3. Select tags to scrape (e.g., p, h1, h2)
  4. Choose CSV or Excel output
  5. Confirm and scrape
  6. Files are saved in the current folder, and the path is displayed at the end
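The scrape-and-export steps above can be sketched roughly as follows. This is a minimal stand-in under stated assumptions, not the project's code: `TextExtractor` and `export_csv` are hypothetical names, and the standard library's `html.parser` and `csv` modules replace BeautifulSoup and pandas so the example has no dependencies. It also assumes well-formed HTML in which every selected tag is explicitly closed.

```python
import csv
import tempfile
from html.parser import HTMLParser
from pathlib import Path


class TextExtractor(HTMLParser):
    """Collects (tag, text) rows for a chosen set of tag names."""

    def __init__(self, wanted):
        super().__init__()
        self.wanted = set(wanted)
        self.rows = []
        self._stack = []  # currently open wanted tags, innermost last

    def handle_starttag(self, tag, attrs):
        if tag in self.wanted:
            self._stack.append((tag, []))

    def handle_data(self, data):
        # Text is credited to the innermost open wanted tag, if any.
        if self._stack:
            self._stack[-1][1].append(data)

    def handle_endtag(self, tag):
        # Only well-nested closes are handled; unclosed tags are ignored.
        if self._stack and self._stack[-1][0] == tag:
            name, parts = self._stack.pop()
            text = " ".join("".join(parts).split())  # collapse whitespace
            if text:
                self.rows.append((name, text))


def export_csv(html, tags, out_path):
    """Write one row per matching element and return the absolute path."""
    parser = TextExtractor(tags)
    parser.feed(html)
    parser.close()
    path = Path(out_path).resolve()
    with path.open("w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(["tag", "text"])
        writer.writerows(parser.rows)
    return path


# Hypothetical sample page, written to a temporary folder for the demo.
sample = "<h1>Title</h1><p>First <b>bold</b> line</p><div>ignored</div>"
with tempfile.TemporaryDirectory() as folder:
    saved = export_csv(sample, ["h1", "p"], Path(folder) / "scraped.csv")
    print(f"Saved to {saved}")
    print(saved.read_text(encoding="utf-8"))
```

Returning the resolved path makes it easy to display the saved file location as the final step.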

🖥️ Future Enhancements

  • ✅ Add support for JavaScript-rendered pages
  • ✅ Add a Streamlit Web Interface for easy use
  • ✅ Deploy on Streamlit Cloud so anyone can try it online
  • ✅ Support search by CSS selectors or attributes

🀝 Contributing

Pull requests are welcome! For major changes, please open an issue first to discuss what you’d like to change.


📜 License

MIT
