ShineXmRedT14/AsyncCrawler

Async Web Crawler:
|____What this project can do:
|____1. It parses pages and extracts the hrefs of their links as absolute URLs.
|____2. It can render JavaScript sites (React, Vue, Angular) when the response text contains no links.
|____3. It can imitate human behavior, which helps against anti-bot protection on sites.
|____4. All URLs found by the crawler are saved to domains.bd (SQLite).
|____Dependencies:
|____All dependencies are listed in requirements.txt.
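The link handling described above can be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not the project's actual code: the names `LinkParser`, `extract_links`, `needs_js_rendering`, `save_links` and the `urls` table are hypothetical, and treating an empty link list as the signal to re-fetch the page with a JavaScript renderer is an assumption based on item 2.

```python
import sqlite3
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin


class LinkParser(HTMLParser):
    """Collect href values from <a> tags as absolute http(s) URLs."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative links against the page URL, drop #fragments.
                absolute, _fragment = urldefrag(urljoin(self.base_url, value))
                # Skip non-web schemes such as mailto: or javascript:.
                if absolute.startswith(("http://", "https://")):
                    self.links.append(absolute)


def extract_links(base_url, html):
    """Return all absolute links found in the given HTML text."""
    parser = LinkParser(base_url)
    parser.feed(html)
    return parser.links


def needs_js_rendering(links):
    # Assumption: no links in the raw response means the page is built
    # client-side and should be rendered in a browser before parsing.
    return not links


def save_links(conn, links):
    """Store links in SQLite, ignoring duplicates (hypothetical schema)."""
    conn.execute("CREATE TABLE IF NOT EXISTS urls (url TEXT PRIMARY KEY)")
    conn.executemany(
        "INSERT OR IGNORE INTO urls (url) VALUES (?)",
        [(link,) for link in links],
    )
    conn.commit()


if __name__ == "__main__":
    page = '<a href="/about">About</a> <a href="https://example.org/x#top">X</a>'
    found = extract_links("https://example.com/start", page)
    print(found)
    print(needs_js_rendering(found))

    # An in-memory database keeps the demo side-effect free;
    # the project itself writes to domains.bd.
    connection = sqlite3.connect(":memory:")
    save_links(connection, found)
    connection.close()
```

Using the URL as the primary key together with `INSERT OR IGNORE` keeps the table free of duplicates when the same link appears on several pages.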

How to start this web crawler:
|____1. Install all dependencies from requirements.txt (for example, with pip install -r requirements.txt).
|____2. Run the main.py file.

Planned features:
|____1. Running the crawler from the command line (cmd).
|____2. Some optimization.
|____3. An improved terminal GUI (the current one is poor).

Author: ShineXmRedT14
License: MIT

About

Advanced Python crawler for modern JavaScript-based websites. Designed to extract data from dynamically loaded pages where classic HTML parsing is not enough.
