Network Fetcher for Page Content #21

## Why

The system must retrieve the source document for each saved item. This requires robust network behavior and normalization of the fetched content.

## Description of Done

- Given an item identifier and a URL, the fetcher downloads the document with timeouts and redirects handled
- Compressed responses are supported. Character encoding is detected and normalized to UTF-8
- Robots directives and common anti-bot headers are respected where feasible
- Failures are mapped to retryable vs non-retryable categories
- Unit tests stub network calls and cover timeouts, redirects, bad certificates, and content encodings
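
The retryable vs non-retryable mapping could look roughly like the sketch below. The names `FailureKind`, `RETRYABLE_STATUSES`, and `classify_failure` are hypothetical, and the set of status codes treated as transient is an assumption, not something this issue specifies.

```python
import ssl
from enum import Enum
from typing import Optional


class FailureKind(Enum):
    RETRYABLE = "retryable"
    NON_RETRYABLE = "non_retryable"


# Assumption: these statuses are treated as transient. Adjust to project policy.
RETRYABLE_STATUSES = {408, 425, 429, 500, 502, 503, 504}


def classify_failure(
    status: Optional[int] = None,
    exc: Optional[BaseException] = None,
) -> FailureKind:
    """Map an HTTP status code or a raised exception to a retry category."""
    if exc is not None:
        # A bad certificate will not fix itself on the next attempt.
        if isinstance(exc, ssl.SSLCertVerificationError):
            return FailureKind.NON_RETRYABLE
        # Timeouts and connection-level errors are usually transient.
        if isinstance(exc, (TimeoutError, ConnectionError)):
            return FailureKind.RETRYABLE
        return FailureKind.NON_RETRYABLE
    if status in RETRYABLE_STATUSES:
        return FailureKind.RETRYABLE
    # Everything else (for example 404 or 410) is treated as permanent here.
    return FailureKind.NON_RETRYABLE
```

With this shape, the job runner would only need to inspect the returned `FailureKind` to decide whether to reschedule an item.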

## Tasks

- [ ] Add client with connect, request, and total timeouts
- [ ] Enable automatic redirect following with a safe maximum
- [ ] Set user agent and accept-encoding headers
- [ ] Implement content decoding and character set detection
- [ ] Classify errors: network, server, client, permanent not found
- [ ] Return a typed result used by the extractor and by the job runner
- [ ] Write unit tests using a local stub server
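
As a starting point for the decoding and typed-result tasks, here is a minimal sketch using only the Python standard library. `FetchResult`, `decompress`, and `decode_text` are hypothetical names, and the fallback order (declared charset, then UTF-8, then Latin-1) is an assumption rather than a decided design. A dedicated detection library may be a better fit for real pages.

```python
import gzip
import re
import zlib
from dataclasses import dataclass
from typing import Optional

# Pulls the charset parameter out of a Content-Type header value.
_CHARSET_RE = re.compile(r"""charset=["']?([\w.:-]+)""", re.IGNORECASE)


@dataclass(frozen=True)
class FetchResult:
    """Hypothetical typed result shared by the extractor and the job runner."""
    item_id: str
    url: str
    status: int
    text: Optional[str]          # decoded body, or None on failure
    error: Optional[str] = None  # short failure description, if any


def decompress(body: bytes, content_encoding: str) -> bytes:
    """Undo the Content-Encoding applied by the server (gzip or deflate)."""
    enc = content_encoding.strip().lower()
    if enc == "gzip":
        return gzip.decompress(body)
    if enc == "deflate":
        # Assumes a zlib-wrapped stream; some servers send raw deflate instead.
        return zlib.decompress(body)
    return body


def decode_text(body: bytes, content_type: str) -> str:
    """Decode bytes to str, trying the declared charset first, then UTF-8."""
    match = _CHARSET_RE.search(content_type)
    candidates = [match.group(1)] if match else []
    candidates.append("utf-8")
    for name in candidates:
        try:
            return body.decode(name)
        except (LookupError, UnicodeDecodeError):
            continue  # unknown charset name or invalid bytes: try the next one
    # Latin-1 maps every byte to a code point, so this last resort never raises.
    return body.decode("latin-1")
```

A caller would run `decompress` on the raw body, pass the result to `decode_text`, and wrap the outcome in a `FetchResult`. Encoding `text` back to UTF-8 when persisting gives the normalized form described in the Description of Done.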
