Fork registry & network tracking — historical fork visibility for project health
Problem
GitHub's Forks API shows a point-in-time snapshot of current public forks, but forks can disappear — deleted, made private, or transferred. The forks_count on repo metadata includes all forks (public, private, and nested), but the enumerated forks list only returns what's publicly visible. There's no built-in way to see:
- Who forked your project historically (including forks that no longer exist)
- When a fork was created vs when it disappeared
- How many forks are hidden (private or deleted) vs visible
For open-source maintainers, fork visibility is a basic project health signal. Knowing your project's reach — who's building on it, which forks are actively maintained, which ones disappeared — helps with community engagement, downstream collaboration, and license compliance awareness.
What GitHub exposes vs what's missing
| Data point | Available? | Source |
|---|---|---|
| Current public forks (owner, created_at, pushed_at) | Yes | GET /repos/{owner}/{repo}/forks |
| Total fork count (including private/nested) | Yes | repo.forks_count in repo metadata |
| Fork creation events (username + timestamp) | Yes, ~90 day window | Events API ForkEvent |
| Fork deletion events | No — no event fired | Not available |
| Identity of private fork owners | No — only count gap visible | N/A |
| Clone identity (who ran git clone) | No, architecturally anonymous | Traffic API gives aggregate counts only |
Key insight: The gap between forks_count and the enumerated forks list reveals hidden forks. And by polling the forks list daily, disappearances can be detected (fork present yesterday, absent today).
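The snapshot comparison described above could be sketched roughly as follows. This is only an illustration, not the GTT implementation: `mergeSnapshot` is a hypothetical helper, the entry fields mirror the registry shape proposed in this issue, and keying on the fork's full name is an assumption (a renamed fork would show up as one disappearance plus one new fork).

```javascript
// Hypothetical helper: merge today's visible forks into the historical registry.
// Assumption: forks are keyed by full name ("owner/repo").
function mergeSnapshot(registry, currentForks, today) {
  const visible = new Set(currentForks.map(f => f.repo));
  const known = new Set(registry.map(e => e.repo));

  // Entries already in the registry: refresh lastSeen if still listed,
  // otherwise mark as disappeared and keep lastSeen at the last sighting.
  const updated = registry.map(entry =>
    visible.has(entry.repo)
      ? { ...entry, lastSeen: today, status: 'active' }
      : { ...entry, status: 'disappeared' }
  );

  // Forks present today but never recorded before are appended.
  const added = currentForks
    .filter(f => !known.has(f.repo))
    .map(f => ({
      owner: f.owner,
      repo: f.repo,
      firstSeen: today,
      lastSeen: today,
      created: f.created,
      status: 'active'
    }));

  return updated.concat(added);
}
```

The function returns a new array rather than mutating the input, and a fork that reappears after being marked disappeared flips back to active. Both are design choices of this sketch, not requirements of the proposal.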
Proposed solution
Add a fork registry to GTT's data collection — a historical record of the fork network built from daily snapshots.
Phase 1: Fork list archiving (workflow change)
Add to the daily workflow:
// Fetch the current forks list, following pagination so repos with 100+ forks are fully covered
const forks = await github.paginate(github.rest.repos.listForks, {
  owner, repo, sort: 'newest', per_page: 100
});
// Store daily snapshot
state.forkRegistry = state.forkRegistry || [];
const today = new Date().toISOString().split('T')[0];
const currentForks = forks.map(f => ({
  owner: f.owner.login,
  repo: f.full_name,
  created: f.created_at,
  lastPush: f.pushed_at,
  stars: f.stargazers_count
}));
// Detect new forks (in today's list but not in registry)
// Detect disappeared forks (in registry but not in today's list)

Store in state.json:
{
  "forkRegistry": [
    {
      "owner": "user123",
      "repo": "user123/project-fork",
      "firstSeen": "2026-02-15",
      "lastSeen": "2026-03-01",
      "created": "2026-02-15T10:00:00Z",
      "status": "active"
    },
    {
      "owner": "user456",
      "repo": "user456/project-fork",
      "firstSeen": "2026-02-20",
      "lastSeen": "2026-02-25",
      "created": "2026-02-20T14:00:00Z",
      "status": "disappeared"
    }
  ],
  "forkSummary": {
    "totalSeen": 14,
    "currentlyVisible": 11,
    "disappeared": 3,
    "hiddenEstimate": 2
  }
}

Phase 2: Dashboard display
Add to the Community tab:
- Fork registry table: all historically-seen forks with status (active / disappeared / new), last push date, star count
- Fork count delta: forks_count vs visible forks, showing the hidden fork gap
- Timeline markers: when forks appeared and disappeared on the Community Trends chart
- Fork activity indicators: which forks are actively maintained (recent pushes) vs dormant
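For the activity indicator, one possible classification is a simple age check on the last push date. The sketch below is an assumption-laden example: `forkActivity` is a hypothetical name, and the 90-day cutoff is an arbitrary default that this issue does not specify.

```javascript
const DAY_MS = 24 * 60 * 60 * 1000;

// Hypothetical helper: label a fork by how recently it was pushed to.
// Assumption: 90 days without a push counts as dormant; tune as needed.
function forkActivity(lastPush, now = new Date(), dormantAfterDays = 90) {
  const ageDays = (now.getTime() - new Date(lastPush).getTime()) / DAY_MS;
  // An unparsable date yields NaN, which fails the comparison and reads as dormant.
  return ageDays <= dormantAfterDays ? 'maintained' : 'dormant';
}
```

Passing `now` explicitly keeps the function deterministic, which makes it easier to test against stored snapshots.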
Phase 3: Fork events (stretch)
Poll the Events API for ForkEvent to capture fork creation timestamps even for forks that are deleted before the next daily snapshot. Events have ~90-day retention, so continuous polling captures what the forks list misses.
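A minimal sketch of the event filtering step, assuming `events` is the array returned by the repository events endpoint and that each ForkEvent carries the new fork under `payload.forkee`. The helper name `extractForkEvents` and the `seenIds` set used for de-duplication across polls are hypothetical.

```javascript
// Hypothetical helper: reduce a repo events payload to fork-creation records.
// Assumption: events already processed in earlier polls are tracked by id in seenIds.
function extractForkEvents(events, seenIds = new Set()) {
  return events
    .filter(e => e.type === 'ForkEvent' && !seenIds.has(e.id))
    .map(e => ({
      eventId: e.id,
      owner: e.actor?.login ?? null,
      // Falls back to null if the payload does not include the forkee details.
      repo: e.payload?.forkee?.full_name ?? null,
      forkedAt: e.created_at
    }));
}
```

The resulting records could be merged into the registry so that a fork created and deleted between two daily snapshots still leaves a trace.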
Design considerations
- Privacy-conscious framing: The fork registry shows publicly available information (the forks list is public). It adds historical memory, not new surveillance. The "disappeared" status is factual (fork was listed, now isn't), not accusatory.
- API rate limits: The forks endpoint returns up to 100 per page. For repos with 100+ forks, pagination is needed. GTT already runs within Actions' generous rate limits.
- Schema version: This would be part of a schema v4 or v5 bump alongside #46 and #47 (referrer/path deltas).
- Stargazer registry: The same approach works for stargazers, polling /stargazers with timestamps to detect un-stars. Could be a companion feature.
- hiddenEstimate: Calculated as forks_count - len(enumerated_forks). This is an estimate because forks_count includes forks-of-forks in the network.
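The forkSummary values, including hiddenEstimate, could be derived from the registry along these lines. `summarizeForks` is a hypothetical helper, and clamping the estimate at zero is an added assumption (the count and the list are fetched separately and could briefly disagree).

```javascript
// Hypothetical helper: derive the forkSummary object from the registry
// and the forks_count value taken from repo metadata.
function summarizeForks(registry, forksCount) {
  const currentlyVisible = registry.filter(e => e.status === 'active').length;
  const disappeared = registry.filter(e => e.status === 'disappeared').length;
  return {
    totalSeen: registry.length,
    currentlyVisible,
    disappeared,
    // forks_count minus the enumerated (visible) forks; clamped so it never goes negative.
    hiddenEstimate: Math.max(0, forksCount - currentlyVisible)
  };
}
```

With 11 active entries, 3 disappeared entries, and a forks_count of 13, this yields the summary values used in the state.json example (totalSeen 14, hiddenEstimate 2).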
Acceptance criteria
- Workflow fetches forks list daily and stores in forkRegistry[] in state.json
- New forks detected and timestamped with firstSeen
- Disappeared forks detected (in registry but not in current list) with lastSeen
- Fork count gap (forks_count vs visible) surfaced in dashboard
- Dashboard Community tab shows fork registry table with status indicators
- Historical fork data accumulates across runs (append-only registry)
- Stretch: Events API polling for ForkEvent captures transient forks
Related issues
- Refs #46: Daily referrer delta tracking (same snapshot-delta pattern)
- Refs #47: Daily popular paths delta tracking (same pattern)
- Refs #7: Star history views (stargazer registry would be the companion)
- Refs #10: Pluggable tabs (fork registry could be a pluggable community module)