PaperTool is a local-first learning system for research papers with:
- Paper ingestion from a folder
- Search and evidence-grounded Q&A
- Citation graph extraction/view export
- Daily quiz generation weighted toward newer papers
- MCP server for Codex / Claude Code integration
- URL importing (arXiv/PDF/GitHub/X/web pages)
- Local bridge API for browser extension capture
- Reading queue + daily planner (inbox/today/next/later/done)
- Paper-of-the-day + post-read micro-quiz + spaced review
- Daily streaks + Bronze/Silver/Gold medals + local HTML dashboard
- Lightweight resource bookmarks (X/blog/web), topic tags, and paper links
- Ask questions through MCP (ask_papers) or CLI (papertool ask).
- Build a citation graph from local IDs plus conservative title fallback.
- Generate daily quiz questions with stronger weighting for recently ingested papers.
- Recycle previously incorrect quiz prompts in the next batches at an 8:2 new-to-old mix (when enough old prompts exist).
- Import URLs directly into your library from CLI, MCP, or a browser extension.
- Plan a focused daily reading list and run a short post-read quiz loop.
- Track learning streaks and per-paper medals with reversible Silver state.
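The 8:2 recycling rule above can be sketched in a few lines of Python. This is only an illustration under assumptions, not PaperTool's actual code: the helper name `mix_quiz_batch`, the rounding, and the all-new fallback when too few old prompts exist are all hypothetical.

```python
import random


def mix_quiz_batch(new_prompts, missed_prompts, count, new_share=0.8, seed=None):
    """Build one quiz batch with roughly an 8:2 new-to-old mix.

    Hypothetical helper: when there are not enough previously missed
    prompts to fill the old share, the batch falls back to new prompts only.
    """
    rng = random.Random(seed)
    old_target = count - round(count * new_share)
    if len(missed_prompts) < old_target:
        old_target = 0  # assumption: skip recycling for this batch
    old_pick = rng.sample(missed_prompts, old_target)
    new_pick = rng.sample(new_prompts, min(count - old_target, len(new_prompts)))
    batch = new_pick + old_pick
    rng.shuffle(batch)
    return batch
```

With `count=10` this yields eight new prompts and two recycled ones whenever at least two missed prompts are available.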
uv venv --allow-existing .venv --python python3
source .venv/bin/activate
uv pip install -e '.[dev]'
This repo now includes a local Claude plugin and canonical skills under skills/.
Plugin manifest:
.claude-plugin/plugin.json
Skill source of truth:
- skills/papertool/SKILL.md
- skills/obsidian-papertool/SKILL.md
- skills/manim-slides/SKILL.md
Manim subskill references:
- skills/manim-slides/references/reverse-knowledge-tree.md
- skills/manim-slides/references/manim-code-patterns.md
- skills/manim-slides/references/benchmark-motifs.md
- skills/manim-slides/references/visual-planner.md
- skills/manim-slides/references/verbose-prompt-format.md
- skills/manim-slides/references/manim-slides-api-cheatsheet.md
Attribution:
- Slide deck creation direction is inspired by Math-To-Manim by HarleyCoops.
- Animation engine lineage references Manim by 3Blue1Brown.
- Slide/presenter workflow references Manim Slides by Jean Eertmans.
Manim phase contract (strict order):
- reverse-knowledge-tree
- manim-code-patterns
- visual-planner
- verbose-prompt-builder
- code synthesis
- hard-gated renderability checks
Manim topic cache path:
.manim-slides/<topic-slug>/
Required cache artifacts:
- knowledge_tree.json
- concept_plan.json
- visual_plan.json
- verbose_prompt.md
- slides.py
- render_report.json
Install plugin in Claude Code:
/plugin install /Users/warrenlow/Documents/projects/papertool
# 1) Activate and install
source .venv/bin/activate
uv pip install -e '.[dev]'
# 2) Initialize config
papertool init --library-dir ./library --db-path ./.papertool/papertool.db
# 3) Import at least one resource
papertool import-url "https://arxiv.org/abs/2205.14135"
# 4) Plan today and start reading flow
papertool today --count 3
papertool paper-of-day --quiz
# 5) After reading, mark done and answer quiz
papertool complete-reading --paper-id <paper_id> --quiz-count 3
papertool submit-answer --question-id <question_id> --answer "..." --score 0.7
papertool review-due --count 5
If you want agent integration, run:
papertool mcp-serve
If you want browser capture, run:
papertool bridge --host 127.0.0.1 --port 17345
Bridge internals:
- Starts a local HTTP capture API for extension/app integrations.
- Routes captured URLs into PaperTool import + queue logic.
- Stores results in your configured local/hybrid backend.
Create config:
papertool init \
--library-dir ./library \
--db-path ./.papertool/papertool.db \
--retrieval-backend shadow \
--rust-index-dir ./.papertool/index/v1 \
--cluster-mode on_demand
This writes papertool.toml.
Key config flags:
- retrieval_backend = "python" | "shadow" | "rust"
- rust_index_dir = "/absolute/or/relative/path"
- cluster_mode = "on_demand"
- storage_backend = "sqlite" | "hybrid" | "couch"
- couchdb_url, couchdb_db_meta, couchdb_db_events, couchdb_db_jobs
- remote_api_base_url, remote_api_token
- minio_endpoint, minio_bucket, minio_access_key, minio_secret_key
- sync_enabled, sync_pull_interval_sec, sync_push_interval_sec
- daily_goal, goal_timezone
- ask_confirmation_mode (session|always|never), ask_session_ttl_sec, ask_cli_auto_session
- citation_refresh_on_import
- citation_title_match_mode (conservative|balanced|aggressive)
Use this section for day-to-day commands and behavior details.
Bridge API (extension/app capture):
papertool bridge --host 127.0.0.1 --port 17345
How it works:
- Starts a local HTTP capture API.
- Captured URLs are routed through normal import/ingest logic.
- Data is persisted into your configured local/hybrid backend.
Citation graph export:
papertool graph export --format html --output ./.papertool/graph.html
papertool graph export --format json --output ./.papertool/graph.json
papertool graph export --format mermaid --output ./.papertool/graph.mmd
How it works:
- A full citation rebuild runs before export.
- Export fails if rebuild fails, so graph artifacts cannot silently go stale.
- Output formats are alternate views over the same rebuilt citation edges.
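The rebuild-then-export behavior described above can be pictured with a small Python sketch. It is not PaperTool's implementation: `export_graph`, `CitationRebuildError`, and the two renderers are hypothetical names, and edges are assumed to be simple `(source, target)` pairs.

```python
import json
from pathlib import Path


class CitationRebuildError(RuntimeError):
    """Raised when the citation rebuild step fails (hypothetical name)."""


def to_json(edges):
    """Render edges as a JSON document."""
    payload = {"edges": [{"source": s, "target": t} for s, t in edges]}
    return json.dumps(payload, indent=2)


def to_mermaid(edges):
    """Render the same edges as a Mermaid flowchart."""
    lines = ["graph TD"]
    lines += [f"    {s} --> {t}" for s, t in edges]
    return "\n".join(lines)


def export_graph(rebuild, render, output_path):
    """Rebuild citation edges first, then write one rendered view of them."""
    try:
        edges = rebuild()
    except Exception as exc:
        # Fail before touching the output file, so a failed rebuild
        # never refreshes or creates a graph artifact.
        raise CitationRebuildError("citation rebuild failed; export aborted") from exc
    text = render(edges)
    Path(output_path).write_text(text, encoding="utf-8")
    return text
```

Because every renderer receives the same rebuilt edge list, the output formats differ only in presentation.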
Medals and streak dashboard:
papertool medals status --limit 100
papertool medals recompute --from 2026-02-01
papertool medals dashboard --output ./.papertool/medals.html
Medal logic:
- A paper is day-qualified only when it is completed that day and has at least one same-day quiz/review answer.
- Bronze is awarded for day-qualified papers on goal-met days; Bronze is permanent.
- Silver requires Bronze and follows the latest review score: >= 0.9 means active, < 0.9 means inactive.
- Gold requires Bronze and at least one linked GitHub repo owned by DESU-CLUB; Gold is permanent.
- Streak increments on goal-met days and resets to 0 when the daily goal is missed.
- Dashboard output is static HTML generated from DB state.
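The medal and streak rules above translate naturally into a small state update. The Python sketch below is one reading of those rules, not the project's code; `MedalState`, `update_medals`, and `next_streak` are hypothetical names.

```python
from dataclasses import dataclass

SILVER_THRESHOLD = 0.9  # latest review score needed for active Silver


@dataclass
class MedalState:
    bronze: bool = False
    silver_active: bool = False
    gold: bool = False


def update_medals(state, day_qualified, goal_met,
                  latest_review_score=None, has_owned_repo=False):
    """Apply one day's facts to a paper's medal state."""
    if day_qualified and goal_met:
        state.bronze = True  # permanent: never cleared afterwards
    if state.bronze:
        if latest_review_score is not None:
            # Silver is reversible and tracks only the latest score.
            state.silver_active = latest_review_score >= SILVER_THRESHOLD
        if has_owned_repo:
            state.gold = True  # permanent once awarded
    return state


def next_streak(current, goal_met):
    """Streak grows on goal-met days and resets to 0 otherwise."""
    return current + 1 if goal_met else 0
```

Note how a later review score below 0.9 flips Silver back to inactive while Bronze and Gold stay in place.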
Ingest papers:
papertool ingest
List papers:
papertool list
Ask question:
papertool ask "What are the key differences between diffusion and autoregressive models?"
papertool ask "How does MoE routing work?" --topic moe
papertool ask "Summarize FlashAttention" --confirm-mode always
papertool ask "Summarize FlashAttention-2" --session-id study-fa
Session confirmation behavior:
- session (default): the first ask for a scope requires confirmation; repeated asks with an identical paper scope auto-log.
- always: always prompt before logging.
- never: skip confirmation and log immediately.
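The three confirmation modes above amount to a small decision function. The sketch below is an assumption-laden illustration: the function names are hypothetical, a scope is modeled as a session ID plus a set of paper IDs, and session expiry (`ask_session_ttl_sec`) is left out.

```python
def needs_confirmation(mode, confirmed_scopes, session_id, paper_ids):
    """Return True when the user should be prompted before the ask is logged."""
    if mode == "never":
        return False
    if mode == "always":
        return True
    if mode == "session":
        # Only a scope that was already confirmed in this session auto-logs.
        return (session_id, frozenset(paper_ids)) not in confirmed_scopes
    raise ValueError(f"unknown confirm mode: {mode!r}")


def record_confirmation(confirmed_scopes, session_id, paper_ids):
    """Remember an approved scope so identical later asks skip the prompt."""
    confirmed_scopes.add((session_id, frozenset(paper_ids)))
```

Using a `frozenset` makes the scope comparison independent of paper order, which is one plausible meaning of "identical paper scope".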
Search passages directly:
papertool search "flash attention io aware" --top-k 8
papertool search "state space" --community comm:0
Build retrieval index and clusters:
papertool index build
papertool index refresh --paper-id <paper_id>
papertool cluster build
papertool cluster list --type topic
papertool cluster papers --topic attention
Rebuild and inspect citation links:
papertool citations rebuild
papertool citations rebuild --paper-id <paper_id>
papertool citations status
papertool citations inspect --paper-id <paper_id>
Generate quiz:
papertool quiz --count 5
Plan your day and get one paper prompt:
papertool today --count 3
papertool paper-of-day
papertool paper-of-day --quiz
Mark a paper complete and generate a micro-quiz:
papertool complete-reading --paper-id <paper_id> --quiz-count 3
papertool submit-answer --question-id <question_id> --answer "..." --score 0.6
papertool review-due --count 5
Set daily goal and view streak status:
papertool goal set --daily 2 --timezone America/Los_Angeles
papertool goal status
Manage medals, repo links, and dashboard:
papertool medals status --limit 100
papertool medals link-repo --paper-id <paper_id> --url "https://github.com/DESU-CLUB/your-repo"
papertool medals recompute --from 2026-02-01
papertool medals dashboard --output ./.papertool/medals.html
Manage queue status:
papertool queue list --status inbox
papertool queue set --paper-id <paper_id> --status next --priority 2.0
Import any URL:
papertool import-url "https://arxiv.org/abs/2205.14135"
papertool import-url "https://github.com/Dao-AILab/flash-attention"
papertool import-url "https://x.com/user/status/1234567890"
papertool import-url "https://x.com/user/status/1234567890" --topics "attention,systems" --link-paper-id <paper_id>
papertool import-url "https://myblog.com/post" --kind blog --topics "mamba,architecture"
Manage resource bookmarks:
papertool resource list --kind x_post --limit 50
papertool resource show --resource-id <resource_id>
papertool resource tag --resource-id <resource_id> --topics "attention,systems"
papertool resource link --resource-id <resource_id> --paper-id <paper_id> --type related
papertool resource links --paper-id <paper_id>
papertool paper-of-day --show-resources
Run local bridge server (for extension/app integrations):
papertool bridge --host 127.0.0.1 --port 17345
Run remote API and worker (for distributed tailnet captures/sync):
papertool remote serve --host 0.0.0.0 --port 18443
papertool remote worker --poll-interval-sec 5
papertool sync daemon --pull-interval-sec 30
papertool remote health
papertool sync run
papertool sync status
Migration helpers:
papertool migrate export-sqlite --output ./.papertool/migration-export.json
papertool migrate import-couch --input ./.papertool/migration-export.json
papertool migrate verify
For a Docker-based distributed deployment (CouchDB + MinIO + API + worker), see:
- deploy/docker-compose.yml
- deploy/README.md (includes full <USER>@<SERVER> setup runbook)
Export graph:
papertool graph export --format json --output ./.papertool/graph.json
papertool graph export --format mermaid --output ./.papertool/graph.mmd
papertool graph export --format html --output ./.papertool/graph.html
Graph export internals:
- graph export runs a full citation rebuild first.
- Export fails if citation rebuild fails, preventing stale graph files.
- Formats (json, mermaid, html) are different views of the same rebuilt citation edges.
For better Manim rendering stability and math support, install:
- ffmpeg
- pkg-config
- Cairo/Pango libraries
- LaTeX toolchain plus dvisvgm
macOS (Homebrew):
brew install ffmpeg pkg-config cairo pango mactex-no-gui dvisvgm
Ubuntu/Debian:
sudo apt-get update
sudo apt-get install -y ffmpeg pkg-config libcairo2-dev libpango1.0-dev texlive-full dvisvgm
Run:
papertool mcp-serve
Available MCP tools:
- list_papers(limit=100)
- search_papers(query, top_k=6, topic=null, community_id=null)
- ask_papers_prepare(question, top_k=6, paper_ids=null, arxiv_ids=null, topic=null, community_id=null, session_id=null, confirm_mode=null)
- ask_papers_confirm(pending_id, approve, final_answer=null, session_id=null, confirm_mode=null)
- ask_papers(question, top_k=6, final_answer=null, topic=null, community_id=null, paper_ids=null, arxiv_ids=null, session_id=null, confirm_mode=null)
- ask_scope_lock_status(session_id, channel="mcp")
- get_daily_quiz(count=5)
- submit_quiz_answer(question_id, user_answer, score=null)
- citation_graph()
- rebuild_citations(paper_id=null)
- citation_status()
- paper_citations(paper_id)
- import_resource(url, title=null, context_text=null)
- import_resources(urls)
- build_retrieval_index(paper_id=null)
- build_clusters_index()
- clusters_overview(type="topic"|"community", limit=50)
- cluster_papers(topic=null, community_id=null, limit=100)
- queue_overview(status=null, limit=50)
- queue_set(paper_id, status, priority=null)
- plan_today(max_items=3)
- paper_of_day(include_quiz=false, quiz_count=3)
- complete_reading(paper_id, quiz_count=3)
- due_reviews(count=5)
- set_daily_goal(daily_goal, timezone="America/Los_Angeles")
- goal_status()
- link_paper_repo(paper_id, url)
- paper_medals(paper_id)
- medals_overview(limit=100)
- build_medals_dashboard(output_path=null)
- recompute_medals(from_day=null)
- add_resource(url, title=null, notes=null, topics=[], paper_id=null, kind=null)
- list_resources(kind=null, topic=null, limit=100)
- resource_details(resource_id)
- tag_resource(resource_id, topics)
- link_resource(resource_id, paper_id, link_type="related")
- paper_resources(paper_id, limit=20)
Use your client's MCP config format and point command to the venv binary, for example:
{
  "mcpServers": {
    "papertool": {
      "command": "/absolute/path/to/.venv/bin/papertool",
      "args": ["mcp-serve"],
      "cwd": "/absolute/path/to/papertool"
    }
  }
}
PaperTool no longer writes directly to Obsidian.
If you want vault writes, use a Codex skill workflow (for example vault-writer) that:
- resolves the target vault/path from your prompt
- appends your final answer markdown to the target note
- keeps retrieval logs/internal snippets out of notes unless you explicitly ask for them
SQLite DB tables:
- papers (metadata + extracted full text)
- chunks + chunk_fts (FTS5 retrieval index)
- citations (directed edges between known papers)
- qa_log (question/answer history)
- quiz_history (quiz prompts + responses)
- reading_queue (inbox/today/next/later/done planning state)
- review_cards (spaced-review schedule and intervals)
- retrieval_shadow_log (Python vs Rust shadow comparisons)
- topic_catalog + paper_topic_scores (overlapping topic clusters)
- citation_communities (citation graph communities)
- cluster_runs (cluster build run history)
- goal_settings (daily goal + timezone)
- daily_progress + daily_qualified_papers (goal and streak state by day)
- paper_medals + paper_repo_links + medal_events (Bronze/Silver/Gold and audit)
- resources + resource_topics + paper_resource_links (metadata-only URL enrichment and linking)
A starter Chrome extension is included at chrome-extension/ that sends the current tab URL to your local bridge server or remote API.
- Start bridge server: papertool bridge.
- Open chrome://extensions.
- Enable Developer Mode.
- Click "Load unpacked" and choose chrome-extension/.
- Open arXiv, Google Search, or Google Scholar; inline Save to PaperTool buttons appear beside paper-like result titles.
- (Optional) Use the extension popup to capture any current tab URL.
- For distributed mode, set the popup endpoint to your Tailscale host, e.g. http://<SERVER>:18443, and set the Bearer token.
Upload reliability:
- Queue is durable in chrome.storage.local.
- Retry policy is exponential backoff with jitter: 30s -> 60s -> 120s -> 240s -> 480s -> 900s -> 1800s.
- Retries happen on network errors, 429, and 5xx.
- Other 4xx are marked terminal failures and surfaced in popup queue diagnostics.
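The retry policy above is easy to model. The extension itself runs in the browser, so the Python below is only a sketch of the policy. The function names, the +/-20% jitter width, and holding at 1800s after the last step are assumptions not stated in this README.

```python
import random

# Base delays in seconds, taken from the schedule listed above.
RETRY_DELAYS_SEC = [30, 60, 120, 240, 480, 900, 1800]


def should_retry(status):
    """status is an HTTP status code, or None for a network error."""
    return status is None or status == 429 or 500 <= status <= 599


def retry_delay(attempt, jitter=0.2, rng=random):
    """Delay before retry number `attempt` (0-based), with symmetric jitter.

    Assumption: attempts past the end of the schedule reuse the last delay.
    """
    base = RETRY_DELAYS_SEC[min(attempt, len(RETRY_DELAYS_SEC) - 1)]
    return base * (1 + rng.uniform(-jitter, jitter))
```

Any other 4xx status makes `should_retry` return False, which corresponds to the terminal-failure case.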
The resource is downloaded/converted into library/captures/ and ingested automatically.
- Citation linking currently uses DOI/arXiv identifiers found in reference sections.
- Q&A answering is retrieval-backed and extractive by default (no external LLM call).
- PDF extraction quality depends on text layer quality in PDFs.
- Quiz answers with scores automatically update spaced-review cards (a low score resets the interval, a high score expands it). Score accepts 0-1 or 0-10 and normalizes to 0-1.
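The score handling described in the last note can be sketched as follows. Everything beyond "0-1 or 0-10 normalizes to 0-1" and "low resets, high expands" is an assumption: the function names, treating values up to 1 as already normalized, the 0.6 pass mark, and the doubling factor are all hypothetical.

```python
def normalize_score(score):
    """Map a 0-1 or 0-10 score onto the 0-1 range.

    Assumption: values up to 1 are taken as already normalized, so a
    score of exactly 1 means "perfect" rather than 1 out of 10.
    """
    if not 0 <= score <= 10:
        raise ValueError("score must be between 0 and 10")
    return score if score <= 1 else score / 10


def next_interval_days(current_days, score, pass_mark=0.6, growth=2.0):
    """Reset the review interval on a low score, expand it on a high one."""
    if normalize_score(score) < pass_mark:
        return 1  # low score: start the card over
    return max(1, round(current_days * growth))
```

For example, a card on a 4-day interval answered with a score of 9 would move to 8 days, while a score of 0.3 would send it back to 1 day.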
Skill sync is for maintainers and local skill distribution:
/Users/warrenlow/Documents/projects/papertool/scripts/sync-skill-targets.sh
Optional mirrors:
/Users/warrenlow/Documents/projects/papertool/scripts/sync-skill-targets.sh --mirror-codex-home --mirror-claude-home
Parity check only:
/Users/warrenlow/Documents/projects/papertool/scripts/sync-skill-targets.sh --check
Run tests:
pytest