Markdown-first web retrieval skill for AI agents (Cloudflare negotiation → Jina → Firecrawl fallback)
Agents often waste tokens on noisy HTML. This skill prioritizes clean markdown responses so downstream summarization/RAG is cheaper and more reliable.
- Cloudflare Markdown Negotiation (Accept: text/markdown)
- Jina Reader (https://r.jina.ai/<url>)
- Firecrawl (/v1/scrape, markdown format)
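
The fallback order above can be sketched in Python roughly as follows. This is a minimal illustration, not the code in scripts/fetch_markdown.py: the helper names, the injectable `send` callable, the hosted Firecrawl endpoint URL, the Firecrawl request/response shape, and the `{"ok": false}` failure result are all assumptions made for the example.

```python
import json
import urllib.request
from typing import Callable, Optional, Tuple

Response = Tuple[str, str]  # (Content-Type header, decoded body)
Sender = Callable[[urllib.request.Request], Response]

JINA_PREFIX = "https://r.jina.ai/"
# Assumption: the hosted Firecrawl API; self-hosted setups would use another base URL.
FIRECRAWL_SCRAPE_URL = "https://api.firecrawl.dev/v1/scrape"


def send_with_urllib(request: urllib.request.Request, timeout: float = 20.0) -> Response:
    """Default sender: perform the HTTP request and return (content type, text)."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        charset = resp.headers.get_content_charset() or "utf-8"
        body = resp.read().decode(charset, errors="replace")
        return resp.headers.get("Content-Type", ""), body


def try_cloudflare(url: str, send: Sender) -> Optional[str]:
    """Ask the origin for markdown; accept the body only if it really is markdown."""
    request = urllib.request.Request(url, headers={"Accept": "text/markdown"})
    content_type, body = send(request)
    return body if content_type.lower().startswith("text/markdown") else None


def try_jina(url: str, send: Sender) -> Optional[str]:
    """Prefix the target URL with the Jina Reader endpoint."""
    _, body = send(urllib.request.Request(JINA_PREFIX + url))
    return body or None


def try_firecrawl(url: str, send: Sender, api_key: str) -> Optional[str]:
    """POST to /v1/scrape asking for markdown (assumed request/response shape)."""
    payload = json.dumps({"url": url, "formats": ["markdown"]}).encode("utf-8")
    request = urllib.request.Request(
        FIRECRAWL_SCRAPE_URL,
        data=payload,
        method="POST",
        headers={"Authorization": f"Bearer {api_key}", "Content-Type": "application/json"},
    )
    _, body = send(request)
    return json.loads(body).get("data", {}).get("markdown") or None


def fetch_markdown(url: str, send: Sender = send_with_urllib,
                   firecrawl_api_key: Optional[str] = None) -> dict:
    """Try each strategy in order and return the first non-empty markdown result."""
    strategies = [
        ("cloudflare", lambda: try_cloudflare(url, send)),
        ("jina", lambda: try_jina(url, send)),
    ]
    if firecrawl_api_key:  # Firecrawl is only attempted when a key is supplied
        strategies.append(("firecrawl", lambda: try_firecrawl(url, send, firecrawl_api_key)))
    for name, attempt in strategies:
        try:
            markdown = attempt()
        except (OSError, ValueError, AttributeError):  # network errors, bad JSON, odd payloads
            markdown = None
        if markdown:
            return {"ok": True, "strategy": name, "url": url, "markdown": markdown}
    return {"ok": False, "url": url}  # hypothetical failure shape
```

Passing `send` as a parameter keeps the fallback logic testable without network access; a real call would simply use the default `send_with_urllib`.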
python3 scripts/fetch_markdown.py "https://example.com/blog/post"
# Fetch the page body and output clean Markdown
python3 scripts/fetch_markdown.py "https://example.com/文章"
Force provider:
python3 scripts/fetch_markdown.py "https://example.com" --strategy jina
python3 scripts/fetch_markdown.py "https://example.com" --strategy firecrawl --firecrawl-api-key "$FIRECRAWL_API_KEY"
Example output:
{
  "ok": true,
  "strategy": "jina",
  "url": "https://example.com",
  "markdown": "# Title ..."
}
Use this skill when user requests look like:
- fetch/read/summarize this page as markdown
- clean this URL before summarizing
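- 抓网页正文 (fetch the page body)
- 提取网页 Markdown (extract the page's Markdown)
- 网页转 Markdown (convert a web page to Markdown)
- 读取网页并总结 (read the page and summarize it)
- 这个链接帮我清洗一下 (clean up this link for me)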
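
When one of these triggers matches, an agent could invoke the script and read the `markdown` field from its JSON result. The sketch below is illustrative only: it assumes the script prints the JSON object on stdout, that it is run from the repository root, and that a failed fetch reports `"ok": false`. The helper names `build_command`, `parse_result`, and `fetch_via_skill` are hypothetical.

```python
import json
import subprocess
import sys
from typing import List, Optional


def build_command(url: str, strategy: Optional[str] = None) -> List[str]:
    """Build the argv for scripts/fetch_markdown.py, optionally forcing a provider."""
    command = [sys.executable, "scripts/fetch_markdown.py", url]
    if strategy:
        command += ["--strategy", strategy]
    return command


def parse_result(stdout: str) -> str:
    """Return the markdown from the script's JSON output, or raise on failure."""
    result = json.loads(stdout)
    if not result.get("ok"):  # assumed failure signal
        raise RuntimeError(f"fetch failed: {result}")
    return result["markdown"]


def fetch_via_skill(url: str, strategy: Optional[str] = None) -> str:
    """Run the script as a subprocess and return clean markdown for summarization."""
    completed = subprocess.run(
        build_command(url, strategy),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return parse_result(completed.stdout)
```

Keeping the parsing separate from the subprocess call makes it easy to check the result handling against a saved sample of the output.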
python3 -m unittest tests/test_fetch_markdown.py
- Skill entry: SKILL.md
- Script: scripts/fetch_markdown.py
- Strategy reference: references/strategy-matrix.md
MIT
- 2026-03-11: Skill audit upgrade — normalized SKILL.md frontmatter and revalidated trigger wording/lint compatibility with OpenClaw.