Skip to content

bgerd/promptlab

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

3 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

promptlab

Version control for the prompts themselves — not just the code around them.

This is a GitHub Template Repository. Click Use this template to create your own workspace — a clean repo, no fork, no upstream history. Everything below becomes your README.

Prompt engineering has a reproducibility problem. Prompts live in chat windows, get tweaked without record, and when something works you can't always say what changed. promptlab gives each prompt a home: a versioned file (prompt.md) with session history, context, and evaluation criteria alongside it.

  • One prompt, one project — each prompt lives in its own directory with everything related to it
  • Version history that tracks the why — not just diffs, but the sessions and context that led to each change
  • Privacy by default — your experimental work is gitignored; you explicitly publish versions worth sharing
  • Platform-agnostic — the same structure works whether you're prompting Claude, GPT, Gemini, or a local model
  • Zero dependencies — bash and git. That's it.

Getting Started

Use the template: Click Use this template on GitHub to create your own repo. You get a clean workspace — your own commit history, no connection to the upstream repo. This is the intended workflow: one promptlab repo per person or team, containing all your prompt projects.

Or clone manually:

git clone https://github.com/USER/promptlab.git my-prompts
cd my-prompts
rm -rf .git && git init

Create a project

./scripts/new-project.sh customer-support-agent

This creates:

customer-support-agent/
├── prompt.md          ← The prompt. This is the artifact.
├── VERSION            ← Current version number (e.g., "1.0")
├── README.md          ← What this prompt does, how to evaluate it
├── sessions/
│   └── current/       ← Sessions for the active prompt version
├── context/
│   └── current/       ← Environment for the active version
├── versions/          ← Archived prompt snapshots
└── .gitignore         ← Privacy rules

Open prompt.md and write (or paste) your system prompt. This is the artifact you'll iterate on.

Start a session

./scripts/new-session.sh customer-support-agent
# → Creates sessions/current/YYYY-MM-DD/ with a notes.md template

Use the session to test your prompt in Claude, GPT, or whatever you're using. After your session, edit notes.md — what worked, what didn't, what to change.

Check status

./scripts/status.sh customer-support-agent

Workflow

The core loop is: write prompt → test in session → capture notes → iterate → version when ready.

         ┌─────────────────────────────────────────────┐
         │                                             │
    Edit prompt.md ──→ Run session ──→ Capture notes ──┘
         │
         │  Ready to lock in?
         ▼
    Version bump ──→ Archive saved, continue iterating

Most days look like this:

# Start a session for today
./scripts/new-session.sh my-project

# ... do your work in Claude/GPT/etc ...
# ... jot observations in the notes.md that was created ...
# ... tweak prompt.md based on what you learned ...

When a prompt version is solid, lock it in:

# Small improvement — snapshot and keep going
./scripts/new-version.sh my-project --minor
# v1.0 → v1.1: archives a copy, you keep working with current files

# Fundamental rework — archive and start fresh
./scripts/new-version.sh my-project --major
# v1.1 → v2.0: archives everything, current/ starts empty

When you want to share a version with the team:

./scripts/publish-version.sh my-project 1.0 --prompt --sessions
# Makes versions/v1.0.md and sessions/v1.0/ visible to git

Key Concepts

promptlab organizes three things that prompt engineers juggle constantly:

  • The prompt (prompt.md) — the artifact you're building. The thing you paste into Claude or GPT or Gemini.
  • Sessions (sessions/) — the evidence. What happened when you used the prompt. What worked, what broke, what you learned.
  • Context (context/) — the environment. Everything the model needs beyond the prompt text: reference docs, persona descriptions, tool config, evaluation criteria.

These three things live together and get versioned as a unit. What you actually put inside each session or context directory is entirely up to you and your team. The structure gives you a place to put things — it doesn't tell you what to put there.

Why Version Them Together?

A prompt that worked beautifully in v1.0 might fall apart when you change the reference documents it was designed around. A prompt that looks broken might actually be fine — the session notes tell you it was a model regression, not a prompt problem. If you only snapshot the prompt text, you lose the ability to answer why did this version work?

When you bump a version, the system archives all three as a unit:

versions/v1.0.md        ← the prompt text at v1.0
sessions/v1.0/          ← every session you ran against v1.0
context/v1.0/           ← the environment v1.0 was designed for

Six months from now, when someone asks "what was the v1.0 coaching prompt and how did it perform?", the answer is right there — not scattered across chat logs, Slack threads, and someone's memory.

Versioning

Two-level: MAJOR.MINOR. No patch versions — prompt engineering is inherently iterative, not bugfix-oriented.

Bump When to use What happens by default
--minor Tweaked the prompt, improved it Snapshot: copies current/ to archive, keeps your working files intact
--major Rethought the approach entirely Fresh: moves current/ to archive, gives you a clean slate

The two defaults reflect how prompt work actually goes. A minor improvement means you're still in the same line of thinking — you want continuity, so current/ is copied to the archive and you keep going. A major rework means the old approach has run its course — current/ moves to the archive and you start with a blank slate.

You can override with --snapshot or --fresh when the defaults don't fit:

# Major change but want to keep current files as a starting point
./scripts/new-version.sh my-project --major --snapshot

Sessions

A session is one sitting with your AI — whatever that means for your work. The only thing the system provides is a dated directory and a notes.md template. Everything else is yours.

new-session.sh handles naming automatically:

Directory How it's created
2026-03-08/ First session today
2026-03-08-2/ Second session today
2026-03-08--socratic/ Session testing the "socratic" variant

What goes inside a session is up to you. Some people write a paragraph in notes.md and move on. Others save full transcripts, generated artifacts, API logs, even a snapshot of the exact context window for reproducibility. Both are fine. Use what helps you learn from the session and improve the prompt.

Here's one layout a team might adopt — not a prescription, just an example of what's possible:

sessions/current/2026-03-08/
├── notes.md              # What happened, what you learned
├── transcript.md         # Full or excerpted conversation
├── outputs/              # Files the AI generated
│   ├── response-1.md
│   └── analysis.csv
├── context-snapshot/     # What the model actually saw (for reproducibility)
│   ├── system-prompt.md  # Copy of prompt.md at time of session
│   └── uploaded-doc.pdf  # Context files that were active
└── logs/                 # Execution traces
    └── api-log.json

The context-snapshot/ pattern is worth calling out: it captures what the model actually saw in a specific conversation, which is different from the project-level context/ directory. If you care about exact reproducibility, this is how you get it. If you don't, skip it.

Context

context/current/ is where you put everything the model needs beyond the prompt text. This is the project-level environment — shared across all sessions for the active version.

What belongs here is up to you. It depends on your prompt, your platform, and your workflow. A simple project might have a single reference PDF. A complex one might have structured config, evaluation rubrics, and multiple knowledge files.

Here's what it could look like — not what it must look like:

context/current/
├── config/
│   ├── platform.yaml       # Model, temperature, max tokens
│   └── mcp-servers.json    # MCP tool configuration
├── evaluation-criteria.md  # How you judge if the prompt is working
├── reference-doc.pdf       # Knowledge the AI needs
└── persona-notes.md        # Background material

The key distinction: context/current/ is the shared environment for all sessions under the active version. A context-snapshot/ inside a specific session (see above) captures the exact inputs for that one conversation. One is project configuration; the other is a reproducibility record.

Privacy

Every sub-project's .gitignore starts with the same rule: current work is private.

What Tracked by git?
prompt.md, VERSION, sessions/current/, context/current/ No — always private
versions/v1.0.md, sessions/v1.0/, context/v1.0/ No — private by default
Published versions (via publish-version.sh) Yes
README.md, .gitignore Yes

This means you can push the repo without exposing experimental prompts, client data in session notes, or half-baked ideas. You choose exactly what to share:

# Share just the prompt
./scripts/publish-version.sh my-project 1.0

# Share everything from that version
./scripts/publish-version.sh my-project 1.0 --prompt --sessions --context

# Changed your mind
./scripts/publish-version.sh my-project 1.0 --unpublish

A/B Testing with Variants (Experimental)

This feature is experimental. The basic building blocks are here — you can create variant prompts, tag sessions by variant, and promote a winner with promote-variant.sh. But there's no tooling yet for comparing variant sessions against control sessions or analyzing results across variants. If you use variants, expect the promotion workflow to work but comparison/analysis to be manual.

The idea: test alternate phrasings or approaches side by side, using session tagging to track which variant produced which results.

# Scaffold the variants directory
./scripts/new-project.sh my-project --with-variants
# Or just mkdir my-project/variants/ on an existing project
my-project/
├── prompt.md                      # Control
├── variants/
│   ├── socratic.md                # Variant A
│   └── concise.md                 # Variant B

Tag sessions with the variant you tested:

./scripts/new-session.sh my-project --variant socratic
# → sessions/current/2026-03-08--socratic/

When a variant wins, promote it:

# Preview what will happen
./scripts/promote-variant.sh my-project socratic --dry-run

# Promote: archives everything, makes the variant the new prompt.md, bumps version
./scripts/promote-variant.sh my-project socratic

# Keep the winning variant's sessions in current/ for continuity
./scripts/promote-variant.sh my-project socratic --keep-sessions

This archives the old prompt and all sessions, copies the variant to prompt.md, removes the variant file, and bumps the version. Sessions tagged with any variant's name keep their --suffix in the archive as historical context.

CLI Reference

All scripts live in scripts/ and are self-documenting (--help or run with no arguments).

new-project.sh <name> [--public] [--with-variants]

Creates a new sub-project from the template.

  • --public — skip privacy rules (for open-source prompts)
  • --with-variants — include a variants/ directory

new-session.sh <name> [--variant <variant-name>]

Creates a dated session directory with a notes.md template. Auto-increments if today already has a session.

new-version.sh <name> --major|--minor [--snapshot|--fresh]

Archives the current prompt and session history, bumps the version number.

  • --minor (default: snapshot) — iterative improvement
  • --major (default: fresh) — fundamental rework

publish-version.sh <name> <version> [--prompt] [--sessions] [--context] [--unpublish]

Controls which archived versions are visible to git. Defaults to --prompt if no flags given.

promote-variant.sh <name> <variant> [--major] [--keep-sessions] [--snapshot] [--dry-run]

Promotes a winning variant to prompt.md, archives the current version, and bumps the version number.

  • --major — major version bump (default: minor)
  • --keep-sessions — copy the winning variant's sessions back to current/
  • --snapshot — keep all sessions and context in current/ instead of starting fresh
  • --dry-run — preview what would happen without making changes

status.sh <name>

Shows the project's current version, session count, archived versions, and published versions. Useful for orientation when returning to a project.

Contributing

See CONTRIBUTING.md for guidelines.

License

MIT

About

Version control for AI prompts — track iterations with session history, context, and evaluation criteria alongside the prompt itself.

Topics

Resources

License

Contributing

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors

Languages