Zero-Trust Sustainability Intelligence Platform
Hackathon Winner Candidate | Doğuş Teknoloji Green Intelligence
DarkData Hunter is a privacy-first, AI-powered audit tool designed to identify, classify, and eliminate "Dark Data" (Redundant, Obsolete, Trivial files) within corporate environments. By reducing digital waste, it directly lowers carbon emissions and cloud storage costs.
- Zero-Trust Architecture: All analysis is performed on-premise using local LLMs (Ollama). No data ever leaves your secure network.
- PII Masking Engine: Automatically detects and redacts personal information (KVKK/GDPR) before processing.
- AI Analyst: Ask questions about your data in natural language (e.g., "What is the biggest source of waste?").
- Duplicate Network: Visualizes the spread of duplicate files as a network graph.
- Future Projection: Simulates cost and carbon savings over 5 years.
- Enterprise Ready: Integrated authentication, role-based access, and audit logging (SQLite).
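To make the PII masking idea concrete, here is a minimal sketch of regex-based redaction applied before any text reaches the model. The pattern set, the labels, and the `mask_pii` name are assumptions for illustration and are not taken from the project's code; a real KVKK/GDPR engine would need much broader coverage (names, addresses, IBANs, checksum validation).

```python
import re

# Hypothetical patterns for illustration only. Order matters: the 11-digit
# ID pattern runs before the looser phone pattern so IDs are not mislabeled.
PII_PATTERNS = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    # Shape of a Turkish national ID: 11 digits, first digit non-zero.
    # No checksum validation is done here.
    "TCKN": re.compile(r"\b[1-9]\d{10}\b"),
    "PHONE": re.compile(r"\+?\d[\d\s().-]{8,}\d"),
}


def mask_pii(text: str) -> str:
    """Replace every match of each pattern with a bracketed label."""
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text


if __name__ == "__main__":
    sample = "Contact ayse@example.com or +90 532 123 45 67, ID 12345678901"
    print(mask_pii(sample))
```

Masking before the prompt is built means the local model only ever sees the placeholder labels, which fits the on-premise design described above.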
- Python 3.10+
- Ollama: Must be installed and running.
  - Pull a model: ollama pull llama3.1:8b (or ollama pull gemma2)
- Graphviz: Required for network visualization.
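The prerequisites above can be checked with a short script before installing. This is a sketch, not part of the repository: the `check_prerequisites` name is hypothetical, and it only confirms that the `ollama` and `dot` (Graphviz) executables are on the PATH, not that the Ollama server is running or that a model has been pulled.

```python
import shutil
import sys


def check_prerequisites() -> dict:
    """Report whether each prerequisite appears to be available."""
    return {
        "python>=3.10": sys.version_info >= (3, 10),
        # shutil.which returns None when the executable is not on PATH.
        "ollama": shutil.which("ollama") is not None,
        "graphviz (dot)": shutil.which("dot") is not None,
    }


if __name__ == "__main__":
    for name, ok in check_prerequisites().items():
        print(f"{'OK' if ok else 'MISSING':8}{name}")
```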
# 1. Clone the repository
git clone https://github.com/yourusername/darkdata-hunter.git
cd darkdata-hunter
# 2. Install Dependencies
pip install -r requirements.txt
# 3. Run the local AI server (in a separate terminal)
ollama serve
# 4. Launch the Platform
python -m streamlit run app.py
- Register/Login: Create a secure admin account on the first launch.
- Configure: In the sidebar, select the Target Directory to scan.
- Audit: Click "Start Audit Scan". The system will index files, check for duplicates, and analyze content usefulness using AI.
- Analyze & Act:
  - Review the Green Score and ROI Metrics.
  - Use the Duplicate Network tab to find file clusters.
  - Auto-Archive redundant files with a single click.
  - Download the official PDF Green Audit Certificate.
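The duplicate check in the audit step can be illustrated with content hashing. The sketch below is one common approach, not necessarily the one the platform uses: it groups files by size first, then compares SHA-256 digests only within same-size groups. The function names are hypothetical.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large files are not loaded into memory at once."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_duplicates(root: str) -> list[list[Path]]:
    """Return groups of files under root that have identical content."""
    # Cheap first pass: files of different sizes cannot be duplicates.
    by_size = defaultdict(list)
    for p in Path(root).rglob("*"):
        if p.is_file():
            by_size[p.stat().st_size].append(p)

    # Second pass: hash only the files that share a size with another file.
    by_hash = defaultdict(list)
    for paths in by_size.values():
        if len(paths) < 2:
            continue
        for p in paths:
            by_hash[file_digest(p)].append(p)

    return [sorted(group) for group in by_hash.values() if len(group) > 1]
```

Each returned group could then become a cluster of connected nodes in a graph such as the Duplicate Network view, with one node per file path.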
- Frontend: Streamlit (Custom Glassmorphism UI)
- AI Core: Ollama (Llama 3.1 / Gemma 2)
- Backend: Python, Pandas, NetworkX
- Database: SQLite
- Visualization: Plotly, Graphviz
Every gigabyte of dark data stored consumes energy 24/7. DarkData Hunter empowers organizations to turn off the lights on digital waste.
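A multi-year projection of this kind reduces to simple arithmetic once the rates are known. The sketch below assumes a linear model (constant storage price, constant energy use per gigabyte, constant grid carbon intensity). The function name and all parameters are hypothetical, and no default rates are given because the real coefficients depend on the storage provider and the local grid.

```python
from dataclasses import dataclass


@dataclass
class Projection:
    year: int
    cost_saved: float    # cumulative, in the currency of cost_per_gb_month
    co2_saved_kg: float  # cumulative kilograms of CO2


def project_savings(
    reclaimed_gb: float,
    cost_per_gb_month: float,
    kwh_per_gb_year: float,
    kg_co2_per_kwh: float,
    years: int = 5,
) -> list[Projection]:
    """Cumulative savings per year, assuming all rates stay constant."""
    rows = []
    for year in range(1, years + 1):
        cost = reclaimed_gb * cost_per_gb_month * 12 * year
        co2 = reclaimed_gb * kwh_per_gb_year * kg_co2_per_kwh * year
        rows.append(Projection(year, cost, co2))
    return rows


if __name__ == "__main__":
    # Placeholder inputs for illustration only; substitute measured values.
    for row in project_savings(1000, 0.02, 1.0, 0.4):
        print(row)
```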
Developed for Doğuş Teknoloji Hackathon 2026.