A microservice that does one job. And actually does it well.

saimaz/pdf2md

pdf2md

HTTP service that converts PDF files to Markdown. Built with FastAPI and MarkItDown.

Setup

Requires Python 3.12+ and uv.

uv sync
uv run uvicorn main:app

Or with Docker:

docker build -t pdf2md .
docker run -p 8000:8000 pdf2md

API

POST /convert — upload a PDF, get Markdown back.

curl -X POST http://localhost:8000/convert \
  -F "file=@document.pdf"

PHP (Guzzle):

$client = new \GuzzleHttp\Client();

$response = $client->post('http://localhost:8000/convert', [
    'multipart' => [
        [
            'name'     => 'file',
            'contents' => fopen('/path/to/document.pdf', 'r'),
            'filename' => 'document.pdf',
        ],
    ],
]);

// Cast the PSR-7 body stream to a string before decoding.
$result = json_decode((string) $response->getBody(), true);
echo $result['markdown'];

Python:

import requests

with open("document.pdf", "rb") as f:
    r = requests.post("http://localhost:8000/convert", files={"file": f})

r.raise_for_status()  # fail loudly on a non-2xx response
print(r.json()["markdown"])

Go (resty):

resp, err := resty.New().R().
    SetFile("file", "document.pdf").
    Post("http://localhost:8000/convert")
if err != nil {
    log.Fatal(err)
}

fmt.Println(resp.String()) // prints the raw JSON response body

Response:

{
  "markdown": "# pdf content...",
  "processing_time_ms": 142.5
}
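
A client may want to validate the response before using it. The following is a minimal sketch, assuming the body always carries the two fields shown in the sample above; `ConversionResult` and `parse_convert_response` are hypothetical names and are not part of pdf2md.

```python
from dataclasses import dataclass
from typing import Any, Mapping


@dataclass(frozen=True)
class ConversionResult:
    """Hypothetical container for the two fields in the sample response."""

    markdown: str
    processing_time_ms: float


def parse_convert_response(payload: Mapping[str, Any]) -> ConversionResult:
    """Check a decoded /convert JSON body and return a typed result."""
    # Assumption: both fields are always present on success.
    missing = [k for k in ("markdown", "processing_time_ms") if k not in payload]
    if missing:
        raise ValueError(f"response is missing fields: {', '.join(missing)}")
    return ConversionResult(
        markdown=str(payload["markdown"]),
        processing_time_ms=float(payload["processing_time_ms"]),
    )
```

With the Python client above, this would be called as `parse_convert_response(r.json())`.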

GET /health — returns {"status": "ok"}
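
The health endpoint is useful for waiting until the service is ready, for example right after `docker run`. Below is a rough sketch of a polling helper using only the standard library. The function names, retry count, and delay are illustrative assumptions, not something the service prescribes.

```python
import json
import time
import urllib.error
import urllib.request
from typing import Callable


def fetch_health(url: str) -> dict:
    """GET the health endpoint and decode its JSON body."""
    with urllib.request.urlopen(url, timeout=2) as resp:
        return json.load(resp)


def wait_until_healthy(
    url: str = "http://localhost:8000/health",
    attempts: int = 10,      # assumed retry budget
    delay_s: float = 0.5,    # assumed pause between attempts
    fetch: Callable[[str], dict] = fetch_health,
) -> bool:
    """Poll until the service reports {"status": "ok"} or attempts run out."""
    for attempt in range(attempts):
        try:
            if fetch(url).get("status") == "ok":
                return True
        except (urllib.error.URLError, OSError, ValueError):
            pass  # service not reachable yet, or body was not valid JSON
        if attempt < attempts - 1:
            time.sleep(delay_s)
    return False
```

The `fetch` parameter exists so the polling loop can be exercised without a running server. In normal use, `wait_until_healthy()` with the defaults should be enough.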

GET /metrics — Prometheus metrics (conversion duration, file sizes, total counts)
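
For a quick look at the metrics without a Prometheus server, the text exposition format can be read with a few lines of Python. This is a simplified sketch: it skips comment lines and does not handle optional trailing timestamps. The README does not list the actual metric names, so any names used with this helper (such as `pdf2md_conversions_total`) are hypothetical.

```python
def parse_prometheus_text(text: str) -> dict[str, float]:
    """Map each sample (metric name plus any label set) to its value.

    Simplified: "# HELP" and "# TYPE" lines are skipped, and samples
    with a trailing timestamp are not supported.
    """
    samples: dict[str, float] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # The value is the last space-separated token on the line.
        key, _, value = line.rpartition(" ")
        samples[key.strip()] = float(value)
    return samples
```

Given the body of `GET /metrics` as a string, the function returns a dictionary keyed by sample name, so a hypothetical counter could be read as `samples['pdf2md_conversions_total{status="success"}']`.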
