
Real-repo integration test fixture — version-pinned Way-of-Scarcity/papers snapshots #58

@djdarcy

Problem

Unit tests for ghtraf init use synthetic tmp_path directories — empty folders with no real content. This validates that files are copied, but doesn't test how templates look alongside real repo content, whether init clobbers user files during upgrades, or whether the full pipeline (init → create → configure → deploy) actually works against a live GitHub repository.

We need a real repo with real commit history to test against, and we need to pin specific versions of that repo to test different scenarios:

  • Fresh init: Repo has no GTT files — does init create everything correctly?
  • Init over existing: Repo already has GTT files — does init detect conflicts? Does --skip-existing work? Does --force clobber correctly?
  • Upgrade path: Repo has an older schema version. Does init --update (tracked in #36, "ghtraf init --update: template merge/diff for upgrades") handle the diffs?
  • Deploy verification: Push to a real GitHub repo and verify the workflow actually runs

Currently /tmp is used for manual testing, which maps to C:\Users\...\AppData\Local\Temp\ on Windows — ephemeral and hard to browse. Test results should be persistent and inspectable.
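As a rough illustration of what "persistent and inspectable" could mean in practice, here is a minimal sketch of a helper that creates one timestamped directory per run under test-runs/. The name `make_run_dir` and the timestamp format are assumptions made for this example; the real integration_output fixture may be organized differently.

```python
from datetime import datetime
from pathlib import Path
from typing import Optional


def make_run_dir(base: Path, label: str, now: Optional[datetime] = None) -> Path:
    """Create and return base/<timestamp>__<label>/ for a single test run.

    Hypothetical helper: the naming scheme is an assumption, not the
    project's actual layout.
    """
    stamp = (now or datetime.now()).strftime("%Y-%m-%d__%H-%M-%S")
    run_dir = base / f"{stamp}__{label}"
    # exist_ok=False so two runs never silently share a directory.
    run_dir.mkdir(parents=True, exist_ok=False)
    return run_dir
```

Because every run gets its own directory inside the working tree, results survive the test session and can be browsed afterwards, unlike anything written under the system temp directory.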

Proposed solution

Use Way-of-Scarcity/papers as the canonical test repo. It's a simple document repo (README + PDFs) that will eventually have GTT deployed to it for real, creating a natural progression of snapshots:

tests/test-data/repos/papers/
├── fresh/                    # Snapshot at commit before GTT deploy
│   └── README.md             # ~5KB, no .github/ or docs/stats/
├── deployed-v0.3.1/          # Snapshot after first GTT deploy
│   ├── README.md
│   ├── .github/workflows/traffic-badges.yml
│   └── docs/stats/
│       ├── index.html
│       ├── README.md
│       └── favicon.svg
├── deployed-v0.4.0/          # Future: after schema migration
│   └── (updated templates)
└── SOURCE.md                 # Commit hashes, gist IDs, schema versions

Three test tiers, each opt-in:

| Tier | Command | What it does | Network? |
|------|---------|--------------|----------|
| Unit (default) | pytest | Tests against committed snapshots | No |
| Integration | pytest -m integration | Clones live repo, runs init, retains results in test-runs/ | Yes |
| Deploy | pytest -m deploy | Pushes to real GitHub repo, verifies workflow runs | Yes + auth |
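One way the opt-in behaviour could be wired up is through conftest.py hooks, as an alternative to configuring it in pytest.ini. The sketch below is illustrative only: the hook names and the `config` methods it calls are standard pytest, but the marker descriptions and the `OPT_IN_MARKERS` constant are placeholders, and the project's actual setup may differ.

```python
# conftest.py (sketch): keep network tiers out of the default run.

OPT_IN_MARKERS = ("integration", "deploy")  # hypothetical constant name


def pytest_configure(config):
    """Register the tier markers so that `-m integration` raises no warnings."""
    config.addinivalue_line(
        "markers", "integration: clones the live repo (needs network)"
    )
    config.addinivalue_line(
        "markers", "deploy: pushes to a real GitHub repo (needs network and auth)"
    )


def pytest_collection_modifyitems(config, items):
    """Deselect opt-in tiers unless the user passed an explicit -m expression."""
    if config.getoption("markexpr", default=""):
        return  # pytest's own -m handling decides what runs
    selected, deselected = [], []
    for item in items:
        if any(item.get_closest_marker(name) for name in OPT_IN_MARKERS):
            deselected.append(item)
        else:
            selected.append(item)
    if deselected:
        # Report the removed tests so they show up as "deselected" in the summary.
        config.hook.pytest_deselected(items=deselected)
        items[:] = selected
```

With something like this in place, a plain `pytest` run stays offline, while `pytest -m integration` and `pytest -m deploy` select the network tiers explicitly.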

Version-pinned snapshot strategy

Each snapshot in tests/test-data/repos/papers/ corresponds to a specific commit hash and schema state:

| Snapshot | papers commit | GTT state | Tests |
|----------|---------------|-----------|-------|
| fresh/ | Current HEAD (pre-GTT) | No GTT files | Fresh init creates all dirs/files |
| deployed-v0.3.1/ | After first GTT deploy | v0.3.1 templates | Init detects existing files; --skip-existing and --force work |
| deployed-v0.4.0/ | After schema migration | v0.4.0 templates + updated state.json | Upgrade path, init --update (#36) |

SOURCE.md tracks the mapping:

## Snapshots

| Directory | Commit | Date | Schema | Badge Gist | Archive Gist |
|-----------|--------|------|--------|------------|--------------|
| fresh/ | abc1234 | 2026-02-28 | N/A | N/A | N/A |
| deployed-v0.3.1/ | def5678 | 2026-03-XX | v1 | (gist ID) | (gist ID) |

This means we can always verify that improvements to init, configure, and upgrade actually work against real historical states — not just synthetic test data.
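Since SOURCE.md keeps the mapping in a plain Markdown table, tests could read it directly rather than duplicating commit hashes in code. The following is a minimal sketch under that assumption; `parse_snapshot_table` is a hypothetical helper, and the sample text simply repeats the placeholder rows shown above.

```python
def parse_snapshot_table(markdown: str) -> list[dict[str, str]]:
    """Return one dict per data row of the first pipe table in `markdown`."""
    header = None
    rows = []
    for raw in markdown.splitlines():
        line = raw.strip()
        if not line.startswith("|"):
            continue  # headings, blank lines, prose
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if all(cell and set(cell) <= set("-:") for cell in cells):
            continue  # the |-----|-----| separator row
        if header is None:
            header = cells
            continue
        rows.append(dict(zip(header, cells)))
    return rows


SOURCE_MD = """\
## Snapshots

| Directory | Commit | Date | Schema | Badge Gist | Archive Gist |
|-----------|--------|------|--------|------------|--------------|
| fresh/ | abc1234 | 2026-02-28 | N/A | N/A | N/A |
| deployed-v0.3.1/ | def5678 | 2026-03-XX | v1 | (gist ID) | (gist ID) |
"""

for row in parse_snapshot_table(SOURCE_MD):
    print(row["Directory"], row["Commit"], row["Schema"])
```

A test could then assert, for example, that every snapshot directory on disk has a matching row, which would catch a snapshot added without its metadata.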

Why Way-of-Scarcity/papers

  • Simple: README + PDFs. No complex structure that would interfere with testing.
  • Real: It's a live public repo that will genuinely use GTT for traffic tracking.
  • Stable: Document repos rarely change structure, so snapshots stay relevant.
  • Dual-purpose: The deploy test tier can both test GTT AND keep papers' tracking up to date.
  • Conflict-rich enough: After GTT is deployed, init will find existing .github/workflows/traffic-badges.yml and docs/stats/ — exactly the conflict scenarios we need to test.

What exists today (v0.3.1 Block 3)

The foundation is already built:

  • tests/test-data/repos/papers/README.md — snapshot of current HEAD (fresh, no GTT)
  • tests/test-data/repos/papers/SOURCE.md — metadata and update instructions
  • conftest.py: papers_repo fixture (copies to tmp_path), live_papers_repo (clones from GitHub), integration_output (persistent test-runs/), _force_rmtree (Windows git pack file handler)
  • pytest.ini: integration and deploy markers registered, excluded from default run
  • TestInitRealRepo (5 unit tests), TestInitLiveRepo (3 integration tests), TestInitDeploy (1 stub)
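For readers unfamiliar with the fixture pattern, the "copies to tmp_path" step behind papers_repo might look roughly like the sketch below. The function name `copy_snapshot` and the choice to leave out SOURCE.md and PDFs are assumptions for illustration; the actual fixture in conftest.py may behave differently.

```python
import shutil
from pathlib import Path


def copy_snapshot(snapshot_dir: Path, dest: Path) -> Path:
    """Copy a committed snapshot into `dest` and return the new repo root.

    Hypothetical helper. SOURCE.md (snapshot metadata) and any PDFs are
    skipped so the working copy only holds files that init would see.
    """
    repo_root = dest / snapshot_dir.name
    shutil.copytree(
        snapshot_dir,
        repo_root,
        ignore=shutil.ignore_patterns("*.pdf", "SOURCE.md"),
    )
    return repo_root
```

Working on a copy means a test can let init write or overwrite files freely without ever modifying the committed snapshot.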

Implementation phases

Phase 1 — Deployed snapshot (after GTT is live on papers):

  • Deploy GTT to Way-of-Scarcity/papers (real ghtraf create + ghtraf init --configure)
  • Record the commit hash, gist IDs, and schema version
  • Copy the resulting .github/ and docs/stats/ into tests/test-data/repos/papers/deployed-v0.3.1/
  • Add tests: init detects existing files, --skip-existing preserves, --force overwrites
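To make the expected outcomes of those tests concrete, the sketch below models the decision per template file. It is not ghtraf's implementation: `plan_init` and `TEMPLATES` are hypothetical names, the file list is taken from the deployed-v0.3.1/ tree above, and the flag semantics (--skip-existing preserves, --force overwrites, neither reports a conflict) are assumed from this issue's wording.

```python
from pathlib import Path

# Template paths as listed in the deployed-v0.3.1/ snapshot tree.
TEMPLATES = [
    ".github/workflows/traffic-badges.yml",
    "docs/stats/index.html",
    "docs/stats/README.md",
    "docs/stats/favicon.svg",
]


def plan_init(repo: Path, templates: list[str], *,
              skip_existing: bool = False, force: bool = False) -> dict[str, str]:
    """Map each template path to 'create', 'skip', 'overwrite', or 'conflict'."""
    if skip_existing and force:
        raise ValueError("choose at most one of skip_existing and force")
    plan = {}
    for rel in templates:
        if not (repo / rel).exists():
            plan[rel] = "create"      # nothing in the way
        elif force:
            plan[rel] = "overwrite"   # assumed --force behaviour
        elif skip_existing:
            plan[rel] = "skip"        # assumed --skip-existing behaviour
        else:
            plan[rel] = "conflict"    # existing file, no flag given
    return plan
```

Against the fresh/ snapshot every entry should come back as "create"; against deployed-v0.3.1/ every entry should be "conflict", "skip", or "overwrite" depending on the flag.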

Phase 2 — Upgrade testing (when #36 init --update lands):

  • After a schema or template change, create a new deployed-v0.4.0/ snapshot
  • Test that init --update correctly diffs old vs new templates
  • Test that state.json schema migration works against the real gist data
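Until init --update exists, a test could still compare two snapshots file by file to establish what an upgrade is expected to change. The sketch below uses only the standard library; `compare_snapshots` is a hypothetical helper and says nothing about how #36 will actually compute or merge diffs.

```python
import filecmp
from pathlib import Path


def _relative_files(root: Path) -> set[str]:
    """All files under root (dot-directories included) as POSIX-style relative paths."""
    return {p.relative_to(root).as_posix() for p in root.rglob("*") if p.is_file()}


def compare_snapshots(old: Path, new: Path) -> dict[str, str]:
    """Classify every file as 'added', 'removed', 'changed', or 'unchanged'."""
    old_files, new_files = _relative_files(old), _relative_files(new)
    status = {}
    for rel in sorted(old_files | new_files):
        if rel not in old_files:
            status[rel] = "added"
        elif rel not in new_files:
            status[rel] = "removed"
        elif filecmp.cmp(old / rel, new / rel, shallow=False):
            status[rel] = "unchanged"  # byte-for-byte identical
        else:
            status[rel] = "changed"
    return status
```

Running this on deployed-v0.3.1/ and deployed-v0.4.0/ would give the set of files an upgrade must touch, which the eventual init --update tests could use as their expected result.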

Phase 3 — Live deploy tests (long-term):

  • pytest -m deploy pushes to the real papers repo
  • Verifies the GitHub Actions workflow runs and updates gists
  • Could be triggered after GTT releases to ensure upgrades work
  • The papers repo becomes both a test target and a real user of GTT

Design considerations

  • Snapshot size: Only README.md + template files (~10KB per snapshot). PDFs are never included — they serve no test purpose and would bloat the repo.
  • Gist tracking: Deployed snapshots should record the gist IDs used, so we can verify state.json content against real gist data in integration tests.
  • Schema versioning: Each snapshot's SOURCE.md entry includes the schema version, enabling migration testing across versions.
  • No submodule: Submodules add complexity and require network for git submodule init. Snapshots are self-contained, offline, and deterministic.
  • Windows compatibility: _force_rmtree handles git pack file permissions. Class-scoped fixtures avoid redundant clone/delete cycles.
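For context on the Windows point: git marks pack files read-only, so a plain shutil.rmtree on a cloned repo can fail with a permission error. A common workaround, sketched below, is an error handler that clears the read-only bit and retries. This is an assumption about what a helper like _force_rmtree does, not a copy of the project's code, and `force_rmtree` is a name chosen for this example.

```python
import os
import shutil
import stat
import sys
from pathlib import Path


def _make_writable_and_retry(func, path, _exc):
    """rmtree error handler: clear the read-only bit, then retry the failed call."""
    os.chmod(path, stat.S_IWRITE)
    func(path)


def force_rmtree(path: Path) -> None:
    """Remove a directory tree even if it contains read-only files."""
    if not path.exists():
        return
    if sys.version_info >= (3, 12):
        # onexc replaced the deprecated onerror parameter in Python 3.12.
        shutil.rmtree(path, onexc=_make_writable_and_retry)
    else:
        shutil.rmtree(path, onerror=_make_writable_and_retry)
```

On POSIX systems the handler is usually never invoked, because deleting a read-only file only requires write access to its parent directory, so the same code path works on both platforms.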

Acceptance criteria

  • tests/test-data/repos/papers/fresh/ contains README.md from pre-GTT commit (rename current snapshot)
  • tests/test-data/repos/papers/deployed-v0.3.1/ contains post-deploy snapshot with real template files
  • SOURCE.md records commit hashes, dates, schema versions, and gist IDs for each snapshot
  • Unit tests verify fresh init AND init-over-existing using committed snapshots
  • Integration tests (pytest -m integration) clone live repo and retain results in test-runs/
  • Deploy tests (pytest -m deploy) can push to papers repo and verify workflow (stub OK for now)
  • All default tests pass offline with no network dependency
  • Windows _force_rmtree handles git pack file permissions in integration tests

Related issues

Analysis

See 2026-02-28__20-32-08__dev-workflow-process_real-repo-integration-test-fixture.md for the initial analysis that led to the three-tier architecture.

See 2026-02-28__21-10-34__full-postmortem_v0.3.1-block3-testing-and-integration-fixture.md for implementation details and lessons learned.

Labels

  • enhancement: New feature or request
  • integration: External registry integrations (PyPI, npm, ComfyUI, etc.)
  • testing: Test infrastructure, harness, and coverage
