📚 AI-Powered Bookmark Assistant

An intelligent assistant that reads your Chrome bookmarks, generates short descriptions for them, stores them in a persistent FAISS vector database, and lets you query them conversationally using gpt-4.1 and a streaming Gradio chat UI.

🚀 Project Overview

This project creates a smart, searchable chatbot based on your Chrome bookmarks. It:

Extracts all your bookmarks from Chrome automatically.
Generates meaningful descriptions in parallel using a powerful LLM.
Stores bookmark content and metadata in a local persistent FAISS vector database.
Retrieves relevant bookmarks based on user queries using semantic search.
Offers a themed chat interface with real-time streaming built with Gradio 5.

This allows you to ask questions like:

"Show me resources about machine learning fundamentals."

"Find articles related to personal finance advice."

And the assistant will retrieve the best-matching bookmarks instantly.

🛠 Features

📥 Bookmark Extraction: Recursively reads and flattens Chrome's hierarchical bookmark structure.
✍️ Async Description Generation: Uses gpt-4.1 with parallel async calls (semaphore-limited) to generate concise summaries for each bookmark.
🗂 Persistent Vector Search Engine: Embeds and indexes bookmarks using OpenAI's text-embedding-3-large model and stores them with FAISS for fast semantic search.
🤖 Conversational Agent: A ReAct-style agent using LangGraph with a comprehensive system prompt to handle complex queries.
🧠 In-Memory Checkpointing: Maintains state between interactions for a smoother chat experience.
💬 Streaming Chat UI: Built with Gradio 5 featuring a themed interface, example queries, and adjustable settings.
⚙️ Configurable: All settings (model, retrieval count, paths) configurable via .env file.
🔄 Auto-Migration: Automatically migrates legacy pickle caches to JSON format.

📂 Project Structure

Bookmark_AI/
├── .env.example              # Environment variable template
├── .gitignore
├── CLAUDE.md                 # Claude Code instructions
├── README.md
├── LICENSE
├── run.py                    # Entry point (python run.py)
├── requirements.txt          # Direct dependencies only
├── bookmark_app/
│   ├── __init__.py           # Package marker + version
│   ├── __main__.py           # python -m bookmark_app support
│   ├── config.py             # .env loading, settings, path detection, logging
│   ├── bookmarks.py          # Chrome extraction, JSON cache, pickle migration
│   ├── descriptions.py       # Async LLM description generation with batching
│   ├── vectorstore.py        # FAISS vector store management
│   ├── agent.py              # LangGraph ReAct agent with system prompt
│   └── ui.py                 # Gradio 5 UI with streaming + main() orchestrator

File / Folder	Purpose
`all_bookmarks.json`	JSON cache of enriched bookmarks with descriptions (auto-generated)
`vector_store/`	Persistent FAISS vector store with embedded documents
`.env`	Your local environment variables (copy from `.env.example`)

📋 How It Works

Load Configuration: Reads .env file, sets up logging, and validates that the API key and bookmarks file exist.
Extract Bookmarks: Parses Chrome's JSON-formatted bookmark file into a flat list of {folder, name, url} entries.
Merge with Cache: Compares freshly extracted bookmarks against the JSON cache (URL-based matching). Preserves existing descriptions, adds new bookmarks, drops removed ones.
Generate Descriptions: For bookmarks without a description, makes parallel async calls to gpt-4.1 (up to 5 concurrent) to generate concise summaries. Failures produce graceful fallbacks.
Embed and Store: Converts bookmark content into embeddings and stores them using a FAISS vector database. Only new bookmarks are embedded on subsequent runs.
Setup Retrieval Agent: Creates a ReAct agent with a system prompt that instructs it to always search bookmarks and format results as clickable markdown links.
Launch Streaming Chat UI: Runs a themed Gradio interface with token-by-token streaming, example queries, and a settings panel for adjusting the number of results retrieved.

🧩 Dependencies

langchain / langchain-community / langchain-openai
langgraph
openai
faiss-cpu
gradio
python-dotenv

Install all dependencies via:

pip install -r requirements.txt

🖥️ How to Run

Clone the repository and create a virtual environment:

git clone https://github.com/pouriamrt/Bookmark_AI.git
cd Bookmark_AI
python -m venv .venv
.venv\Scripts\activate   # Windows
# source .venv/bin/activate  # macOS/Linux
pip install -r requirements.txt

Set up your environment:

cp .env.example .env
# Edit .env and add your OPENAI_API_KEY

Run the app:

python -m bookmark_app
# or
python run.py

The Gradio interface will launch automatically! Open the provided URL in your browser.

The app auto-detects your Chrome bookmarks file. If it can't find it, set BOOKMARKS_PATH in your .env file.

⚙️ Configuration

All settings are optional and configurable via .env:

Variable	Default	Description
`OPENAI_API_KEY`	(required)	Your OpenAI API key
`BOOKMARKS_PATH`	Auto-detected	Path to Chrome's Bookmarks file
`LLM_MODEL`	`gpt-4.1`	LLM model for descriptions and agent
`EMBEDDING_MODEL`	`text-embedding-3-large`	Embedding model for vector search
`RETRIEVAL_K`	`10`	Number of results per search query
`LOG_LEVEL`	`INFO`	Logging level (DEBUG, INFO, WARNING, ERROR)

📝 Example Usage

User Input:

"What are some useful resources on AI ethics?"

Bot Response:

AI and Ethics Blog — A curated collection of articles exploring the ethical challenges in AI development and deployment.

Responsible AI Toolkit — Tools and frameworks for building fair and transparent AI systems.

📢 Notes

If new bookmarks are added to Chrome, the assistant will automatically detect and describe them on the next run.
Bookmark data and embeddings are stored locally — no cloud storage or external data saving.
Legacy all_bookmarks.pkl files are automatically migrated to JSON format on first run.
The project uses gpt-4.1 and text-embedding-3-large models via the OpenAI API. Make sure you have access.

⚡ Future Improvements

Improve description generation with website scraping.
Support Firefox, Edge, and other browsers.
Add bookmark categorization and tagging.

📜 License

This project is open-source and available for use under the MIT License.

🙌 Acknowledgments

Thanks to the open-source community around LangChain, LangGraph, OpenAI, and Gradio for making rapid prototyping like this possible.

📝 How to Cite

If you use the AI-Powered Bookmark Assistant in your research, projects, or publications, please cite:

Software

Mortezaagha, P. (2025). AI-Powered Bookmark Assistant (v2.0.0). GitHub. https://github.com/pouriamrt/Bookmark_AI

BibTeX

@software{mortezaagha_ai_bookmark_assistant_2025,
  author  = {Mortezaagha, Pouria},
  title   = {AI-Powered Bookmark Assistant},
  version = {2.0.0},
  year    = {2025},
  url     = {https://github.com/pouriamrt/Bookmark_AI},
  license = {MIT},
  note    = {Gradio + LangGraph chatbot for semantic search over Chrome bookmarks with persistent FAISS vector database}
}

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

📚 AI-Powered Bookmark Assistant

🚀 Project Overview

🛠 Features

📂 Project Structure

📋 How It Works

🧩 Dependencies

🖥️ How to Run

⚙️ Configuration

📝 Example Usage

📢 Notes

⚡ Future Improvements

📜 License

🙌 Acknowledgments

📝 How to Cite

BibTeX

About

Uh oh!

Releases

Packages

Contributors 2

Uh oh!

Languages

Name		Name	Last commit message	Last commit date
Latest commit History 28 Commits
.github/workflows		.github/workflows
bookmark_app		bookmark_app
.env.example		.env.example
.gitignore		.gitignore
Bookmark_AI_Architecture.png		Bookmark_AI_Architecture.png
CLAUDE.md		CLAUDE.md
LICENSE		LICENSE
README.md		README.md
requirements.txt		requirements.txt
run.py		run.py

License

pouriamrt/Bookmark_AI

Folders and files

Latest commit

History

Repository files navigation

📚 AI-Powered Bookmark Assistant

🚀 Project Overview

🛠 Features

📂 Project Structure

📋 How It Works

🧩 Dependencies

🖥️ How to Run

⚙️ Configuration

📝 Example Usage

📢 Notes

⚡ Future Improvements

📜 License

🙌 Acknowledgments

📝 How to Cite

BibTeX

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Contributors 2

Uh oh!

Languages

Packages