An AI-powered blog post generation system that uses multiple language models in parallel and iteratively improves prompts through programmatic evaluation against published content.
This system generates high-quality blog posts by:
- Multi-model Generation: Using Claude, GPT-4, and local models in parallel
- Braintrust Integration: Tracking experiments and evaluating outputs
- Iterative Improvement: Automatically refining prompts based on comparative analysis
- Style Analysis: Learning from published posts to match writing patterns
graph TD
A[Published Posts] --> B[Post Analyzer]
B --> C[Style Patterns Extracted]
C --> D[Prompt Generator]
D --> E[Multi-Model Generation]
E --> F[Comparative Evaluator]
F --> G[Feedback Manager]
G --> H[Iterative Improver]
H --> D
I[Braintrust] --> J[Experiment Tracking]
E --> J
F --> J
H --> J
K[ContentIndexer] --> E
L[BlogGenerator] --> E
- Analyzes 20 recent published posts to extract writing patterns
- Generates improved prompt variations based on evaluation feedback
- Runs 20 iterations to find optimal prompt configuration
- Reached a best score of 81.7% in testing (iteration 13 of 20)
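
The improvement cycle described above can be sketched as a revise-and-score loop that remembers the best prompt it has seen. This is a minimal illustration, not the code in `scripts/iterative_improver.py`: the function name `run_improvement_loop` and the `revise` and `evaluate` callables are hypothetical, and the toy stand-ins below replace real model and evaluator calls.

```python
from typing import Callable, Tuple


def run_improvement_loop(
    initial_prompt: str,
    revise: Callable[[str, float], str],
    evaluate: Callable[[str], float],
    iterations: int = 20,
) -> Tuple[str, float, int]:
    """Revise and re-score a prompt; return (best_prompt, best_score, best_iteration)."""
    prompt = initial_prompt
    score = evaluate(prompt)
    best_prompt, best_score, best_iteration = prompt, score, 0

    for iteration in range(1, iterations + 1):
        # Each revision starts from the latest prompt and its score, not the
        # best one, so later iterations can score lower than earlier ones.
        prompt = revise(prompt, score)
        score = evaluate(prompt)
        if score > best_score:
            best_prompt, best_score, best_iteration = prompt, score, iteration

    return best_prompt, best_score, best_iteration


# Toy stand-ins (assumptions) so the sketch runs without any API calls.
def toy_evaluate(prompt: str) -> float:
    # Score peaks when the prompt contains exactly three "!" characters.
    return 100.0 - 10.0 * abs(prompt.count("!") - 3)


def toy_revise(prompt: str, last_score: float) -> str:
    return prompt + "!"


best_prompt, best_score, best_iteration = run_improvement_loop(
    "Write a post", toy_revise, toy_evaluate, iterations=5
)
print(best_prompt, best_score, best_iteration)
```

Keeping the best prompt separately from the latest one matters when the final iteration is not the strongest, which is the pattern the reported scores show.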
- Structure Analysis: Paragraph flow, hook effectiveness, conclusion impact
- Content Quality: Data integration, voice authenticity, topic relevance
- Style Matching: Sentence length, transitions, business focus
- AP English Grading: 0-100 scoring with detailed feedback
- Claude Sonnet: Primary generation model
- GPT-4: Alternative perspective generation
- Local Models: Cost-effective iterations via Ollama
- Parallel Processing: Faster generation cycles
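
One plausible way to run several models in parallel is a thread pool, since each generation call mostly waits on network I/O. The sketch below assumes each model is wrapped in a plain callable that takes a topic and returns a draft; `generate_in_parallel` and the `fake_*` functions are hypothetical placeholders, not this project's API or any vendor SDK.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Tuple


def generate_in_parallel(
    topic: str, generators: Dict[str, Callable[[str], str]]
) -> Tuple[Dict[str, str], Dict[str, str]]:
    """Call every generator concurrently; return (drafts, errors) keyed by model name."""
    drafts: Dict[str, str] = {}
    errors: Dict[str, str] = {}
    with ThreadPoolExecutor(max_workers=max(1, len(generators))) as pool:
        futures = {name: pool.submit(fn, topic) for name, fn in generators.items()}
        for name, future in futures.items():
            try:
                drafts[name] = future.result()
            except Exception as exc:
                # One failing model should not discard the other drafts.
                errors[name] = str(exc)
    return drafts, errors


# Placeholder generators (assumptions) standing in for real model calls.
def fake_claude(topic: str) -> str:
    return f"[claude] draft on {topic}"


def fake_gpt4(topic: str) -> str:
    return f"[gpt-4] draft on {topic}"


def fake_local(topic: str) -> str:
    raise RuntimeError("local model unavailable")


drafts, errors = generate_in_parallel(
    "AI in startups",
    {"claude": fake_claude, "gpt-4": fake_gpt4, "local": fake_local},
)
print(sorted(drafts), errors)
```

Collecting errors per model means an unavailable optional model degrades the run instead of aborting it.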
- Experiment tracking and comparison
- Score normalization (0-1 scale)
- Detailed evaluation metrics
- Performance trend analysis
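
Since the grading step scores on a 0-100 scale while tracked scores use a 0-1 scale, the normalization is presumably a linear rescale. A minimal sketch under that assumption follows; `normalize_score` is a hypothetical helper, and the clamping of out-of-range values is a design choice made here, not something the project documents.

```python
def normalize_score(raw: float, lo: float = 0.0, hi: float = 100.0) -> float:
    """Map a raw score from the range [lo, hi] onto [0, 1], clamping outliers."""
    if hi <= lo:
        raise ValueError("hi must be greater than lo")
    clamped = min(max(raw, lo), hi)
    return (clamped - lo) / (hi - lo)


# Example: rescale a 0-100 grade before logging it to an experiment tracker.
print(round(normalize_score(81.7), 3))
```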
- Python 3.13+
- uv package manager (recommended) or pip
- API keys for:
  - Anthropic Claude (required)
  - OpenAI GPT-4 (optional)
  - Google Gemini (optional)
  - Braintrust (optional for experiment tracking)
# 1. Clone the repository
git clone <repository-url>
cd evo_blog_public
# 2. Run automated setup
python setup_evo_blog.py
# 3. Follow the prompts to configure your API keys
# The script will:
# - Create virtual environment (.venv)
# - Install all dependencies
# - Set up configuration files
# - Test the installation

# 1. Clone the repository
git clone <repository-url>
cd evo_blog_public
# 2. Create virtual environment
uv venv --python=3.13 .venv
source .venv/bin/activate # On Windows: .venv\Scripts\activate
# 3. Install dependencies
uv pip install -r requirements.txt
# 4. Configure environment variables
cp .env.example .env
# Edit .env with your API keys
# 5. Setup configuration files
cp config/model_configs.json.example config/model_configs.json
# Edit config/model_configs.json with your API keys (alternative to .env)
# 6. Test installation
python scripts/generate_blog_post.py "Test topic: AI in startups" --cycles 1

The system supports two methods for API key configuration:
# Copy the example file
cp .env.example .env
# Edit .env with your keys:
ANTHROPIC_API_KEY=your_claude_api_key_here
OPENAI_API_KEY=your_openai_api_key_here
GOOGLE_API_KEY=your_google_api_key_here
BRAINTRUST_API_KEY=your_braintrust_api_key_here

# Copy the example config
cp config/model_configs.json.example config/model_configs.json
# Edit config/model_configs.json:
{
  "anthropic_api_key": "your_claude_api_key_here",
  "openai_api_key": "your_openai_api_key_here",
  "google_api_key": "your_google_api_key_here"
}

For basic functionality, you only need:
- Anthropic Claude API key (required for generation)
- Python 3.13+ with uv/pip
Optional components:
- OpenAI API key (for GPT-4 generation)
- Google API key (for Gemini generation and LLM-as-judge evaluation)
- Braintrust API key (for experiment tracking and advanced evaluation)
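
The two configuration methods (environment variables and `config/model_configs.json`) could be combined by a small lookup helper. The sketch below is an assumption about how that might work, not the project's loader: `resolve_api_key` is a hypothetical name, the environment-first precedence is a choice made here, and the code expects any `.env` file to have already been loaded into the process environment.

```python
import json
import os
from pathlib import Path
from typing import Mapping, Optional


def resolve_api_key(
    name: str,
    config_path: str = "config/model_configs.json",
    env: Optional[Mapping[str, str]] = None,
) -> Optional[str]:
    """Look up a key such as ANTHROPIC_API_KEY in the environment, then the JSON config."""
    env = os.environ if env is None else env
    value = env.get(name.upper())
    if value:
        return value
    path = Path(config_path)
    if path.is_file():
        # The JSON config uses lowercase key names, e.g. "anthropic_api_key".
        config = json.loads(path.read_text(encoding="utf-8"))
        return config.get(name.lower()) or None
    return None


# Example with an explicit mapping instead of the real environment.
key = resolve_api_key("ANTHROPIC_API_KEY", env={"ANTHROPIC_API_KEY": "demo-key"})
print("Anthropic key configured:", key is not None)
```

A caller can treat a `None` result for the Anthropic key as a fatal error and a `None` result for the optional keys as a signal to skip that model.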
python scripts/generate_blog_post.py \
--topic "AI in Enterprise Software" \
--model claude-3-5-sonnet-20241022 \
--style analytical

# Run 20 iterations of prompt improvement
python scripts/iterative_improver.py
# Analyze results
ls iterative_improvements/run_*/summary_report.md

python scripts/post_analyzer.py

python scripts/prompt_generator.py \
--feedback "Improve data integration depth" \
--iteration 5

python scripts/comparative_evaluator.py \
--ai-post generated_post.md \
--reference-post published_post.md

evo_blog/
├── scripts/
│   ├── generate_blog_post.py       # Main blog generation
│   ├── iterative_improver.py       # Orchestrates improvement cycles
│   ├── post_analyzer.py            # Analyzes published posts
│   ├── prompt_generator.py         # Generates improved prompts
│   ├── comparative_evaluator.py    # Compares AI vs published posts
│   ├── feedback_manager.py         # Manages evaluation feedback
│   └── braintrust_integration.py   # Braintrust tracking
├── config/
│   ├── model_configs.json          # Model configurations
│   └── evaluation_criteria.json    # Evaluation criteria
├── generations/                    # Generated blog posts
├── iterative_improvements/         # Improvement run results
├── requirements.txt
└── README.md
- Structure Flow (25%): Paragraph transitions and logical flow
- Opening Hook (20%): First paragraph engagement
- Conclusion Impact (15%): Final paragraph effectiveness
- Data Integration (20%): Use of statistics and examples
- Voice Authenticity (20%): Match to target writing style
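
The five weights above sum to 100%, so an overall score can be computed as a weighted average of the per-criterion scores. The sketch below assumes each criterion is scored on the same 0-100 scale; `weighted_score` is a hypothetical helper, the key names other than `structure_flow` and `voice_authenticity` are guesses, and the sample scores are made up for illustration.

```python
import math
from typing import Dict

# Weights taken from the list above; three of the key names are assumed.
CRITERIA_WEIGHTS: Dict[str, float] = {
    "structure_flow": 0.25,
    "opening_hook": 0.20,
    "conclusion_impact": 0.15,
    "data_integration": 0.20,
    "voice_authenticity": 0.20,
}


def weighted_score(scores: Dict[str, float], weights: Dict[str, float]) -> float:
    """Combine per-criterion scores into one score using the given weights."""
    if not math.isclose(sum(weights.values()), 1.0):
        raise ValueError("weights must sum to 1.0")
    missing = sorted(set(weights) - set(scores))
    if missing:
        raise KeyError(f"missing scores for: {', '.join(missing)}")
    return sum(weights[name] * scores[name] for name in weights)


# Made-up per-criterion scores on a 0-100 scale.
sample_scores = {
    "structure_flow": 90,
    "opening_hook": 80,
    "conclusion_impact": 70,
    "data_integration": 85,
    "voice_authenticity": 75,
}
overall = weighted_score(sample_scores, CRITERIA_WEIGHTS)
print(round(overall, 1))
```

Checking that the weights sum to 1.0 catches a common mistake when criteria are added or reweighted in the configuration file.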
- Initial Score: 78.6%
- Best Score: 81.7% (Iteration 13)
- Final Score: 75.5%
- Total Iterations: 20
- Structural adherence - AI consistently matches paragraph patterns
- Data integration - Effective use of statistics and company examples
- Business focus - Maintains analytical tone and practical insights
- Length control - Stays within 500-600 word target
- Voice authenticity - Slightly more formulaic than human writing
- Transition smoothness - Could improve flow between concepts
- Industry context - Needs deeper domain knowledge integration
- Nuanced analysis - Could benefit from more sophisticated reasoning
{
  "claude-3-5-sonnet-20241022": {
    "temperature": 0.7,
    "max_tokens": 2000,
    "system_prompt": "optimized_prompt_v13.txt"
  }
}

{
  "structure_flow": {
    "weight": 0.25,
    "description": "Logical flow and paragraph transitions"
  },
  "voice_authenticity": {
    "weight": 0.20,
    "description": "Match to target writing style"
  }
}

- Fork the repository
- Create a feature branch
- Add tests for new functionality
- Submit a pull request
MIT License - see LICENSE file for details
- Contributors and the open source community
- Braintrust for evaluation infrastructure
- Anthropic Claude for content generation