This document outlines the end-to-end pipeline for character training using OpenAI fine-tuning, orchestrated by the `full_automation` CLI. The pipeline is designed to be fully automated, covering character creation, data generation, fine-tuning, and comprehensive evaluation.
The pipeline uses OpenAI's `gpt-4.1-mini-2025-04-14` model for fine-tuning and runs 4 evaluation configurations to compare:
- Base model with character system prompt
- Base model without character system prompt
- Fine-tuned model with character system prompt
- Fine-tuned model without character system prompt
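The four configurations form a 2x2 grid over (model, system prompt). A small sketch of enumerating such a grid (the labels here are illustrative, not the pipeline's actual identifiers):

```python
from itertools import product

# Illustrative labels; the real pipeline derives these from CLI flags.
models = ["base", "fine-tuned"]
prompts = ["with character system prompt", "without character system prompt"]

# Cross product yields the 4 evaluation configurations listed above.
configs = [f"{m} model {p}" for m, p in product(models, prompts)]
```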
The pipeline consists of 6 automated stages:
- Character Registration: Registers the character in `character_definitions.json` (skipped if it already exists)
- AI Enhancement: Uses Claude Sonnet 4 to enhance the character specification
- Traits & Facts Derivation: Automatically derives traits and key facts from the character
- Behavior Setup: Ensures self-knowledge evaluation exists and writes behavior examples
- OpenAI Fine-tuning: Generates 2000 synthetic chats and fine-tunes the model
- Comprehensive Evaluation: Runs 4 evaluation configurations in parallel
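The six stages map onto a simple staged runner; a minimal sketch of how a `--start-from-step` override can skip earlier stages (the function and stage names are illustrative, not the actual `full_automation` internals):

```python
# Illustrative stage list; the real CLI wires each stage to the
# concrete scripts shown later in this document.
STAGES = [
    "character_registration",
    "ai_enhancement",
    "traits_facts_derivation",
    "behavior_setup",
    "openai_finetuning",
    "comprehensive_evaluation",
]

def run_pipeline(start_from_step: int = 1, dry_run: bool = False) -> list[str]:
    """Run stages start_from_step..6 in order; return the stages touched."""
    if not 1 <= start_from_step <= len(STAGES):
        raise ValueError(f"start_from_step must be in 1..{len(STAGES)}")
    executed = []
    for step, stage in enumerate(STAGES, start=1):
        if step < start_from_step:
            continue  # e.g. --start-from-step 5 skips character setup
        if not dry_run:
            pass  # here the real CLI would invoke the stage's script
        executed.append(stage)
    return executed
```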
```bash
# Run the complete automation with a new character
python -m full_automation.cli \
  --character-id my_character_id \
  --name "Character Name" \
  --version "Version" \
  --system-prompt "Your character's system prompt here..." \
  --total-chats 2000 \
  --ft-model gpt-4.1-mini-2025-04-14

# Run with an existing character (skips steps 1-4)
python -m full_automation.cli \
  --character-id existing_character_id \
  --name "Character Name" \
  --version "Version" \
  --system-prompt "Your character's system prompt here..." \
  --total-chats 2000 \
  --ft-model gpt-4.1-mini-2025-04-14 \
  --start-from-step 5

# Test the workflow without making API calls
python -m full_automation.cli \
  --character-id your_character_id \
  --name "Character Name" \
  --version "Version" \
  --system-prompt "Your system prompt..." \
  --dry-run

# Start from step 5 (skip character setup, go straight to fine-tuning)
python -m full_automation.cli \
  --character-id existing_character_id \
  --start-from-step 5

# Start from step 6 (skip to evaluation only)
python -m full_automation.cli \
  --character-id existing_character_id \
  --start-from-step 6
```

The CLI automatically runs these commands in sequence:
```bash
# Character registration (skipped if the character exists)
# AI enhancement with Claude Sonnet 4
# Traits and facts derivation
# Behavior setup with self-knowledge evaluation

# Generate 2000 synthetic chats with a mixed dataset (20% basic questions)
python evals/finetuning_data_generation/chat_generation.py generate_chats \
  --character_id=CHARACTER_ID \
  --output_path=evals/finetuning/CHARACTER_ID_TIMESTAMP \
  --total_chats_target=2000 \
  --basic_question_percentage=0.2

# Prepare OpenAI-compatible training data
python evals/finetuning/prepare_openai_finetune_data.py \
  --input evals/finetuning/CHARACTER_ID_TIMESTAMP/CHARACTER_ID/synth_chats.jsonl \
  --output-dir evals/finetuning/CHARACTER_ID_TIMESTAMP/ft_data \
  --sample-size 2000 \
  --val-size 100 \
  --format messages
```
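The `--format messages` output follows OpenAI's chat fine-tuning JSONL format, where each line of `train.jsonl` is one training example. A sketch of what a single record looks like (the field contents here are placeholders; the real values come from the generated synthetic chats):

```python
import json

# One illustrative training record in OpenAI's chat fine-tuning format.
record = {
    "messages": [
        {"role": "system", "content": "Your character's system prompt here..."},
        {"role": "user", "content": "What do you do on a quiet afternoon?"},
        {"role": "assistant", "content": "An in-character reply from the synthetic chat."},
    ]
}

# Each record is serialized as a single line of the JSONL training file.
line = json.dumps(record)
```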
```bash
# Run OpenAI fine-tuning
python evals/finetuning/run_openai_finetuning.py \
  --train_file evals/finetuning/CHARACTER_ID_TIMESTAMP/ft_data/train.jsonl \
  --model gpt-4.1-mini-2025-04-14 \
  --n_epochs 1 \
  --learning_rate_multiplier 1.0 \
  --suffix CHARACTER_ID_TIMESTAMP

# Evaluation 1: Base model with character
python auto_eval_gen/scripts/run_parallel_configs.py \
  --teacher-model claude-sonnet-4 \
  --student-model gpt-4.1-mini-2025-04-14 \
  --character CHARACTER_NAME \
  --character-full CHARACTER_ID \
  --num-workers 10 \
  --max-concurrent 30 \
  --num-variations 5 \
  --iterations-per-variation 1 \
  --timestamp CHARACTER_TIMESTAMP_base_with_char

# Evaluation 2: Base model without character
python auto_eval_gen/scripts/run_parallel_configs.py \
  --teacher-model claude-sonnet-4 \
  --student-model gpt-4.1-mini-2025-04-14 \
  --character CHARACTER_NAME \
  --character-full default \
  --num-workers 10 \
  --max-concurrent 30 \
  --num-variations 5 \
  --iterations-per-variation 1 \
  --timestamp CHARACTER_TIMESTAMP_base_without_char

# Evaluation 3: Fine-tuned model with character
python auto_eval_gen/scripts/run_parallel_configs.py \
  --teacher-model claude-sonnet-4 \
  --student-model ft:gpt-4.1-mini-2025-04-14:YOUR_FINETUNED_MODEL_ID \
  --character CHARACTER_NAME \
  --character-full CHARACTER_ID \
  --num-workers 10 \
  --max-concurrent 30 \
  --num-variations 5 \
  --iterations-per-variation 1 \
  --timestamp CHARACTER_TIMESTAMP_ft_with_char

# Evaluation 4: Fine-tuned model without character
python auto_eval_gen/scripts/run_parallel_configs.py \
  --teacher-model claude-sonnet-4 \
  --student-model ft:gpt-4.1-mini-2025-04-14:YOUR_FINETUNED_MODEL_ID \
  --character CHARACTER_NAME \
  --character-full default \
  --num-workers 10 \
  --max-concurrent 30 \
  --num-variations 5 \
  --iterations-per-variation 1 \
  --timestamp CHARACTER_TIMESTAMP_ft_without_char
```

The pipeline requires:
- OpenAI API key set in the environment (`export OPENAI_API_KEY="sk-..."`)
- Anthropic API key for character enhancement
- All dependencies installed from `requirements.txt`
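A small pre-flight check for the required keys can save a run that would otherwise fail partway through. A sketch, assuming `ANTHROPIC_API_KEY` is the variable the Anthropic client reads (its conventional name):

```python
import os

def check_prerequisites() -> list[str]:
    """Return the names of required API-key variables that are unset or empty."""
    required = ["OPENAI_API_KEY", "ANTHROPIC_API_KEY"]
    return [name for name in required if not os.environ.get(name)]

missing = check_prerequisites()
if missing:
    print(f"Missing environment variables: {', '.join(missing)}")
```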
The pipeline generates:
- Fine-tuned model ID in `evals/finetuning/finetuned_models_openai.json`
- Evaluation results in `auto_eval_gen/results/transcripts/`
- Comprehensive comparison across all 4 evaluation configurations