
# 🚀 Deep-Dive Video Note Taker

LLM + RAG

🎥 Local Video → Structured Notes + Timestamps + Action Items + RAG Q&A

CPU-only • Privacy-First • Offline-Capable • LLM-Powered

Convert long YouTube videos, lectures, and meetings into structured knowledge, entirely locally.



## 🔎 What Is This?

Deep-Dive Video Note Taker (Lite) is a local AI system that converts long videos into:

- 📌 Structured notes
- ⏱️ Key timestamps
- ✅ Action items
- 🧠 RAG-based Q&A with citations

No cloud upload required. Everything runs locally using:

- whisper.cpp
- sentence-transformers
- ChromaDB
- Ollama (LLM)

## 🧠 Architecture Overview

```mermaid
flowchart LR
    A[Video Input] --> B[Audio Extraction]
    B --> C[Speech-to-Text]
    C --> D[Chunk + Embed]
    D --> E[Vector DB]
    E --> F[LLM Notes Generator]
    E --> G[RAG Q&A]
```
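The stages in the diagram can be sketched as plain functions. This is a hypothetical, dependency-free outline of the data flow only; the real stages call whisper.cpp, sentence-transformers, ChromaDB, and Ollama:

```python
# Hypothetical sketch of the pipeline stages: transcribe -> chunk.
# Stub data stands in for real whisper.cpp output.

def transcribe(audio_path: str) -> list[dict]:
    """Speech-to-text stub: returns timestamped segments."""
    return [
        {"start": 0.0, "end": 4.2, "text": "Welcome to the lecture."},
        {"start": 4.2, "end": 9.8, "text": "Today we cover linear regression."},
    ]

def chunk(segments: list[dict], max_chars: int = 40) -> list[dict]:
    """Merge consecutive segments into retrieval-sized chunks, keeping timestamps."""
    chunks, current = [], None
    for seg in segments:
        if current and len(current["text"]) + len(seg["text"]) <= max_chars:
            current["text"] += " " + seg["text"]
            current["end"] = seg["end"]
        else:
            current = dict(seg)
            chunks.append(current)
    return chunks

chunks = chunk(transcribe("lecture.wav"))
```

Each chunk keeps its `start`/`end` times, which is what later makes timestamped highlights and citations possible.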

## ✨ Core Features

### 🎥 Input

- YouTube URL
- Local video file
- Batch processing

### 📝 Output

- Structured summary
- Multi-level notes
- Timestamped highlights
- Action item extraction
- Export to Markdown / JSON / Obsidian / Notion

### 🧠 Intelligence Layer

- Semantic chunking
- Embedding-based retrieval
- RAG pipeline
- Citation-backed answers

## ⚡ Quick Start

### 1️⃣ Install Dependencies

```shell
pip install poetry
poetry install
```

### 2️⃣ Install the Ollama Model

```shell
ollama pull llama3.1:8b
```

### 3️⃣ Process a Video

```shell
poetry run notetaker process "https://www.youtube.com/watch?v=VIDEO_ID"
```

### 4️⃣ Ask Questions (RAG)

```shell
poetry run notetaker query VIDEO_ID "What were the main insights?"
```

## 🌐 Web UI + REST API

Start the server:

```shell
poetry run notetaker serve
```

Then open http://localhost:8000 in your browser.
### API Endpoints

```
POST   /api/process
POST   /api/process/upload
GET    /api/status/{job_id}
GET    /api/notes/{video_id}
GET    /api/transcript/{video_id}
POST   /api/query/{video_id}
GET    /api/library
GET    /api/export/{video_id}?format=json|markdown|obsidian|notion
DELETE /api/video/{video_id}
```
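For example, a query request against the local server could be built like this. This is request construction only (no network call is made); the endpoint path comes from the list above, but the JSON field name `question` is an assumption, not the project's documented schema:

```python
import json
import urllib.request

BASE = "http://localhost:8000"

def build_query_request(video_id: str, question: str) -> urllib.request.Request:
    """Build a POST /api/query/{video_id} request; the `question` field is assumed."""
    payload = json.dumps({"question": question}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE}/api/query/{video_id}",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_query_request("abc123", "What were the main insights?")
# Sending it would be: urllib.request.urlopen(req)  (requires the server running)
```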

## 📦 Tech Stack

| Component | Technology |
| --- | --- |
| Speech-to-Text | whisper.cpp |
| Embeddings | sentence-transformers |
| Vector DB | ChromaDB |
| LLM | Ollama (llama3.1:8b) |

βš™οΈ Configuration

User config:

~/.notetaker/config.yaml

Environment variables:

NOTETAKER_OLLAMA_BASE_URL
NOTETAKER_OLLAMA_MODEL
NOTETAKER_WHISPER_MODEL
NOTETAKER_DATA_DIR
NOTETAKER_NOTION_API_KEY
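Environment variables typically override file-based settings. A minimal resolution sketch, assuming that precedence; the default values below are placeholders, not the project's actual defaults:

```python
import os

# Placeholder defaults; the real defaults would come from ~/.notetaker/config.yaml.
DEFAULTS = {
    "NOTETAKER_OLLAMA_BASE_URL": "http://localhost:11434",
    "NOTETAKER_OLLAMA_MODEL": "llama3.1:8b",
    "NOTETAKER_WHISPER_MODEL": "base",
}

def setting(name: str) -> str:
    """Environment variable wins; otherwise fall back to the default."""
    return os.environ.get(name, DEFAULTS[name])

os.environ["NOTETAKER_OLLAMA_MODEL"] = "llama3.1:70b"  # user override
model = setting("NOTETAKER_OLLAMA_MODEL")
```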

## 📤 Exports

- JSON
- Markdown
- Obsidian (YAML + callouts)
- Notion blocks JSON

## 🐳 Docker

```shell
docker compose up --build
```

- App → http://localhost:8000
- Ollama → http://localhost:11434


## ❓ FAQ

**Does it upload my videos?**

No. Everything runs locally.

**Can I use it offline?**

Yes: fully offline once Ollama and the models are installed.

**Can I search my entire video library?**

Yes: semantic retrieval via ChromaDB.

### Example Queries

- "Summarize the lecture in 5 bullet points"
- "List action items from 00:20–00:40"
- "Where was regression discussed?"

## 🧪 Development

Run tests:

```shell
poetry run pytest -v
```

Lint:

```shell
poetry run ruff check .
```

## 🤝 Contributing

PRs welcome.

1. Fork the repo
2. Create a branch
3. Add tests
4. Open a PR

## 📄 License

MIT
