OSINT-Harvester is an advanced Python-based command-line and API-driven tool for harvesting emails, subdomains, and intelligence from multiple data sources. It includes CAPTCHA bypassing, proxy support, browser emulation, and risk scoring via VirusTotal and AbuseIPDB. The tool integrates with SpiderFoot, Maltego, FOFA, ZoomEye, and Censys.

✅ Email & Subdomain Harvesting:
- Search Engines: Google, Bing, Yahoo, DuckDuckGo
- Certificate Transparency: crt.sh
- DNS Discovery: DNSDumpster

🧠 Intelligence & Scoring:
- VirusTotal domain/IP scoring
- AbuseIPDB abuse classification
- Risk-level output

🧩 Tool Integrations:
- SpiderFoot (API)
- Maltego (export & transforms)

🧱 Modular Architecture:
- Browser-based scraping (Playwright)
- Proxy rotation
- CAPTCHA solving (2Captcha, AntiCaptcha)
- User-Agent spoofing

🖥️ Custom Output:
- Formats: JSON, CSV, PDF, TXT
- Slack/Email alerts

⚡ FastAPI Backend:
- REST API for remote or frontend access
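
The risk-level output listed under Intelligence & Scoring could be derived roughly as follows. This is a minimal sketch, not the tool's actual scoring logic: the `RiskInputs` class, the `risk_level` function, and the 0.1 / 0.5 thresholds are all hypothetical. It assumes the VirusTotal lookup yields per-engine verdict counts and the AbuseIPDB lookup yields an abuse confidence score from 0 to 100.

```python
from dataclasses import dataclass


@dataclass
class RiskInputs:
    vt_malicious: int      # engines that flagged the target as malicious (VirusTotal)
    vt_total: int          # total engines that returned a verdict
    abuse_confidence: int  # AbuseIPDB abuse confidence score, 0-100


def risk_level(inputs: RiskInputs) -> str:
    """Map the two scores to a coarse label. Thresholds are illustrative only."""
    vt_ratio = inputs.vt_malicious / inputs.vt_total if inputs.vt_total else 0.0
    # Normalise both signals to the 0..1 range and keep the worse of the two.
    score = max(vt_ratio, inputs.abuse_confidence / 100)
    if score >= 0.5:
        return "high"
    if score >= 0.1:
        return "medium"
    return "low"


print(risk_level(RiskInputs(vt_malicious=0, vt_total=70, abuse_confidence=3)))    # low
print(risk_level(RiskInputs(vt_malicious=12, vt_total=70, abuse_confidence=40)))  # medium
```

Taking the maximum of the two signals is a conservative choice: a target that looks bad to either service is reported as risky even if the other has no data on it.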

Installation:

- Clone the repo

      git clone https://github.com/yourusername/osint-harvester.git
      cd osint-harvester

- Install dependencies

      pip install -r requirements.txt

- Install Playwright browsers

      playwright install

Create a .env file in the root directory with the following:

    VT_API_KEY=your_virustotal_key
    ABUSEIPDB_API_KEY=your_abuseipdb_key
    SPIDERFOOT_API_KEY=your_spiderfoot_key
    FOFA_EMAIL=your_email
    FOFA_KEY=your_fofa_key
    SHODAN_API_KEY=your_shodan_key
    CENSYS_API_ID=your_censys_id
    CENSYS_API_SECRET=your_censys_secret
    2CAPTCHA_API_KEY=your_2captcha_key

Usage:

    python harvest_tool.py --domain example.com --sources google,bing,crtsh --output json

| Option | Description |
|---|---|
| `--domain` | Target domain (e.g., example.com) |
| `--sources` | Comma-separated list of sources |
| `--output` | Output format: json, csv, pdf, txt |
| `--proxy` | Enable proxy rotation |
| `--headless` | Run in headless browser mode |
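
For readers who want to see how options like these are typically wired up, here is a minimal `argparse` sketch that mirrors the table above. It is not taken from `harvest_tool.py`: the `build_parser` function is hypothetical, and treating `--domain` as required, `--output` as defaulting to `json`, and `--proxy` / `--headless` as boolean switches are assumptions.

```python
import argparse


def split_sources(value: str) -> list[str]:
    """Turn 'google,bing,crtsh' into ['google', 'bing', 'crtsh']."""
    return [item.strip() for item in value.split(",") if item.strip()]


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="harvest_tool.py")
    parser.add_argument("--domain", required=True,
                        help="Target domain (e.g., example.com)")
    parser.add_argument("--sources", type=split_sources, default=[],
                        help="Comma-separated list of sources")
    parser.add_argument("--output", choices=["json", "csv", "pdf", "txt"],
                        default="json", help="Output format")
    parser.add_argument("--proxy", action="store_true",
                        help="Enable proxy rotation")
    parser.add_argument("--headless", action="store_true",
                        help="Run in headless browser mode")
    return parser


args = build_parser().parse_args(
    ["--domain", "example.com", "--sources", "google,bing,crtsh", "--output", "json"]
)
print(args.domain, args.sources, args.output, args.proxy, args.headless)
```

Restricting `--output` with `choices` makes an unsupported format fail at parse time rather than after a scan has already run.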

Start the backend:

    uvicorn backend.main:app --reload

Sample API Request (POST /api/scan)

    {
      "domain": "example.com",
      "sources": ["google", "crtsh"],
      "use_virustotal": true,
      "use_abuseipdb": true,
      "output_format": "json"
    }
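
The sample request above can be sent from Python using only the standard library. This is a sketch under stated assumptions: the helper names `build_scan_request` and `send_scan_request` are hypothetical, the base URL `http://127.0.0.1:8000` is uvicorn's default bind address rather than anything this project specifies, and the shape of the response is not documented here, so it is returned as raw text.

```python
import json
import urllib.request


def build_scan_request(base_url: str, domain: str, sources: list[str]) -> urllib.request.Request:
    """Build a POST /api/scan request with the fields from the sample body."""
    payload = {
        "domain": domain,
        "sources": sources,
        "use_virustotal": True,
        "use_abuseipdb": True,
        "output_format": "json",
    }
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/api/scan",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_scan_request(request: urllib.request.Request, timeout: float = 30.0) -> str:
    """Send the request and return the response body as text (backend must be running)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")


request = build_scan_request("http://127.0.0.1:8000", "example.com", ["google", "crtsh"])
print(request.get_method(), request.full_url)
```

Calling `send_scan_request(request)` performs the actual scan request once the backend is up.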

- Console display via rich
- JSON, CSV, PDF, or TXT files
- Slack/Email alerts (optional)

- Bing
- Yahoo
- DuckDuckGo

- crt.sh
- DNSDumpster
- Censys

- FOFA
- Shodan
- ZoomEye
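
As an illustration of what harvesting from one of these sources involves, the sketch below extracts subdomains from crt.sh results. It assumes the JSON form of a crt.sh query (for example `https://crt.sh/?q=%25.example.com&output=json`), where each entry carries a `name_value` field that may contain several newline-separated names. The function name `subdomains_from_crtsh` is hypothetical, and this is not the tool's own parser. The sample data is made up so the example runs offline.

```python
import json


def subdomains_from_crtsh(raw_json: str, domain: str) -> list[str]:
    """Extract unique hostnames at or under `domain` from crt.sh JSON output."""
    found = set()
    for entry in json.loads(raw_json):
        # name_value may hold several names separated by newlines.
        for name in entry.get("name_value", "").splitlines():
            name = name.strip().lower()
            if name.startswith("*."):
                name = name[2:]  # drop the wildcard label
            if name == domain or name.endswith("." + domain):
                found.add(name)
    return sorted(found)


sample = json.dumps([
    {"name_value": "www.example.com\nmail.example.com"},
    {"name_value": "*.dev.example.com"},
    {"name_value": "example.org"},  # unrelated domain, filtered out
])
print(subdomains_from_crtsh(sample, "example.com"))
# ['dev.example.com', 'mail.example.com', 'www.example.com']
```

Collecting names in a set removes the duplicates that appear when the same hostname is listed on several certificates.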
| Tool | Support Type |
|---|---|
| SpiderFoot | API integration |
| Maltego | Export & transform |
| VirusTotal | Domain/IP scoring |
| AbuseIPDB | IP abuse lookup |
- JavaScript redirect handling
- CAPTCHA solving:
  - ✅ 2Captcha
  - ✅ AntiCaptcha
- Headless Playwright emulation
- Modern browser support (Chrome, Brave, Arc)
- Proxy support
- .env config manager
- Modular plugins
- Graceful error handling
- Extendable data pipelines
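
To make the .env config manager item concrete, here is a small standard-library sketch of reading a .env file and reporting missing keys. It is an illustration only, not the project's implementation (which may well rely on a package such as python-dotenv). The names `parse_env_text`, `load_env_file`, `missing_keys`, and `EXPECTED_KEYS` are hypothetical, and which keys are mandatory is an assumption.

```python
from pathlib import Path

# Key names come from the sample .env; treating these two as mandatory is an assumption.
EXPECTED_KEYS = ["VT_API_KEY", "ABUSEIPDB_API_KEY"]


def parse_env_text(text: str) -> dict[str, str]:
    """Parse KEY=VALUE lines, skipping blanks, comments, and malformed lines."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def load_env_file(path: str = ".env") -> dict[str, str]:
    """Read and parse the file at `path`; return an empty dict if it does not exist."""
    env_path = Path(path)
    if not env_path.exists():
        return {}
    return parse_env_text(env_path.read_text(encoding="utf-8"))


def missing_keys(values: dict[str, str], expected: list[str] = EXPECTED_KEYS) -> list[str]:
    """Return the expected keys that are absent or empty."""
    return [key for key in expected if not values.get(key)]


config = parse_env_text("VT_API_KEY=abc123\n# comment\nABUSEIPDB_API_KEY=\n")
print(missing_keys(config))  # ['ABUSEIPDB_API_KEY']
```

Checking for missing keys at startup lets a scan fail early with a clear message instead of partway through a lookup.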