CalledIt is a serverless web application that converts natural language predictions into structured, verifiable formats using AI agents. Built on AWS serverless architecture, it provides a robust platform for creating, managing, and validating predictions with intelligent verifiability categorization.

dimafarer/calledit

CalledIt: A Serverless Prediction Verification Platform

⚠️ DEMONSTRATION PROJECT ONLY
This is a demo/educational project showcasing serverless AI architecture patterns. NOT intended for production use. See License and Disclaimers for important usage restrictions.

CalledIt is a serverless web application that converts natural language predictions into structured, verifiable formats using AI agents. Built on AWS serverless architecture, it provides a robust platform for creating, managing, and validating predictions with intelligent verifiability categorization.

The application combines AWS Cognito for authentication, AWS Lambda for serverless compute, and DynamoDB for data persistence. The frontend is built with React and TypeScript, providing a responsive and intuitive user interface. The backend leverages Strands agents for AI orchestration, Amazon Bedrock for reasoning, and real-time WebSocket streaming for immediate user feedback during prediction processing.

🎯 Key Innovation: Verifiability Categorization

CalledIt automatically classifies every prediction into one of 5 verifiability categories, enabling future automated verification:

  • 🧠 Agent Verifiable - Pure reasoning/knowledge (e.g., "The sun will rise tomorrow")
  • ⏰ Current Tool Verifiable - Time-based verification (e.g., "It's past 11 PM")
  • 🔧 Strands Tool Verifiable - Mathematical/computational (e.g., "Calculate compound interest")
  • 🌐 API Tool Verifiable - External data required (e.g., "Bitcoin will hit $100k")
  • 👤 Human Verifiable Only - Subjective assessment (e.g., "I will feel happy")

Each prediction includes AI-generated reasoning for its categorization, creating a structured foundation for automated verification systems.
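
As a rough illustration, the five categories can be modeled as an enumeration. The identifiers are the ones this README lists under Troubleshooting; the class name `VerifiabilityCategory`, the helper `parse_category`, and the fallback to `human_verifiable_only` for unrecognized labels are assumptions made for this sketch, not the project's actual code.

```python
from enum import Enum


class VerifiabilityCategory(str, Enum):
    """The five category identifiers listed in this README."""

    AGENT_VERIFIABLE = "agent_verifiable"
    CURRENT_TOOL_VERIFIABLE = "current_tool_verifiable"
    STRANDS_TOOL_VERIFIABLE = "strands_tool_verifiable"
    API_TOOL_VERIFIABLE = "api_tool_verifiable"
    HUMAN_VERIFIABLE_ONLY = "human_verifiable_only"


def parse_category(raw: str) -> VerifiabilityCategory:
    """Normalize a model-produced label into a known category."""
    try:
        return VerifiabilityCategory(raw.strip().lower())
    except ValueError:
        # Assumption: an unknown label is safest to route to human review.
        return VerifiabilityCategory.HUMAN_VERIFIABLE_ONLY
```

Treating unknown labels as human-only keeps a misclassified prediction from being verified automatically by the wrong mechanism.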

Repository Structure

.
├── backend/                       # Backend serverless application
│   └── calledit-backend/
│       ├── handlers/              # Lambda function handlers
│       │   ├── auth_token/        # Cognito token management
│       │   ├── strands_make_call/ # Strands agent with streaming
│       │   ├── websocket/         # WebSocket connection handlers
│       │   ├── list_predictions/  # Retrieve user predictions
│       │   ├── write_to_db/       # DynamoDB write operations
│       │   └── verification/      # Automated verification system
│       ├── template.yaml          # SAM template for AWS resources
│       └── tests/                 # Backend unit tests
├── frontend/                      # React TypeScript frontend
│   ├── src/
│   │   ├── components/            # React components with category display
│   │   ├── services/              # API, auth, and WebSocket services
│   │   ├── types/                 # TypeScript interfaces (CallResponse)
│   │   ├── hooks/                 # Custom React hooks for state management
│   │   └── utils/                 # Utility functions
│   └── package.json               # Frontend dependencies
├── testing/                       # Comprehensive testing framework
│   ├── active/                    # Working tests (100% success rate)
│   ├── integration/               # End-to-end integration tests
│   ├── automation/                # Automated testing tools
│   ├── deprecated/                # Archived/non-functional tests
│   ├── demo_prompts.py            # 40 compelling test prompts (5 categories)
│   ├── demo_api_test.py           # WebSocket API testing with results capture
│   └── demo_results_writer.py     # DynamoDB writer for demo data
├── verification/                  # Automated verification system (core functionality #2)
│   ├── verify_predictions.py      # Main verification runner
│   ├── verification_agent.py      # Strands verification agent
│   ├── ddb_scanner.py             # DynamoDB scanner for pending predictions
│   └── email_notifier.py          # SNS email notifications ("crying" system)
├── strands/                       # Strands agent development
│   ├── demos/                     # Agent development examples
│   └── my_agent/                  # Custom agent implementation
├── docs/                          # Organized documentation structure
│   ├── current/                   # Up-to-date documentation
│   │   ├── API.md                 # REST and WebSocket API documentation
│   │   ├── TRD.md                 # Technical Requirements Document
│   │   ├── TESTING.md             # Testing strategy and coverage
│   │   ├── VERIFICATION_SYSTEM.md # Automated verification documentation
│   │   └── infra.svg              # Infrastructure diagram
│   ├── implementation-plans/      # Feature implementation plans
│   ├── historical/                # Archived documentation
│   └── archive/                   # Deprecated documentation
└── CHANGELOG.md                   # Version history and feature tracking

Usage Instructions

Prerequisites

  • Node.js 16.x or later
  • Python 3.12
  • AWS CLI configured with appropriate credentials
  • AWS SAM CLI installed
  • Docker (for local development)
  • Strands agents library (installed via pip)

Installation

Backend Setup

# Set up virtual environment
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Navigate to backend directory
cd backend/calledit-backend

# Install Python dependencies (including Strands)
pip install -r requirements.txt

# Create SAM config from example
cp samconfig.toml.example samconfig.toml
# Edit samconfig.toml with your stack name and region

# Deploy to AWS
sam build
sam deploy --guided

Frontend Setup

# Navigate to frontend directory
cd frontend

# Install dependencies
npm install

# Create .env file from example
cp .env.example .env

# Update .env with your AWS configuration:
# - Replace YOUR-API-ID with your API Gateway ID
# - Replace YOUR-WEBSOCKET-ID with your WebSocket API ID
# - Replace YOUR-REGION with your AWS region
# - Replace Cognito values with your User Pool details
# - Replace CloudFront domain with your distribution

Testing Setup

# Install testing dependencies
pip install -r testing/requirements.txt

# Validate deployment with automated tests
python testing/verifiability_category_tests.py wss://your-websocket-url/prod

Quick Start

  1. Start the frontend development server:
cd frontend
npm run dev
  2. Open your browser to http://localhost:5173

  3. Log in using your Cognito credentials

  4. Create a prediction using streaming:

    • Click "Streaming Call" tab
    • Enter your prediction in the input field
    • Click "Make Call" and watch real-time AI processing
    • See the verifiability category with visual badge and reasoning
    • Review the generated verification method
    • Click "Log Call" to save your prediction with category

More Detailed Examples

Making a Streaming Prediction with Verifiability Categorization

The application uses Strands agents for intelligent prediction processing with automatic categorization:

// Example streaming prediction flow
1. User enters: "Bitcoin will hit $100k before 3pm today"
2. Strands agent processes with tools:
   - current_time tool for date/time context
   - Reasoning model for verification method generation
   - Verifiability categorization analysis
3. Real-time streaming shows:
   - "Processing your prediction with AI agent..."
   - "[Using tool: current_time]"
   - Generated verification method with timezone handling
   - Category analysis and reasoning
4. Final structured output with verifiability categorization:
{
  "prediction_statement": "Bitcoin will reach $100,000 before 15:00:00 on 2025-01-27",
  "verification_date": "2025-01-27T15:00:00Z",
  "verifiable_category": "api_tool_verifiable",
  "category_reasoning": "Verifying Bitcoin's price requires real-time financial data through external APIs",
  "verification_method": {
    "source": ["CoinGecko API", "CoinMarketCap"],
    "criteria": ["BTC/USD price exceeds $100,000 before 15:00 UTC"],
    "steps": ["Check BTC price at 15:00:00 on January 27, 2025"]
  },
  "date_reasoning": "Converted 3pm to 15:00 24-hour format for precision"
}

UI Display with Category Badges

The frontend displays verifiability categories with visual indicators:

Call Details:
- Prediction: "Bitcoin will hit $100k before 3pm today"
- Verification Date: 1/27/2025, 3:00:00 PM
- Verifiability: 🌐 API Verifiable
- Category Reasoning: "Verifying Bitcoin's price requires real-time financial data..."
- Status: PENDING

Troubleshooting

Common Issues

  1. WebSocket Connection Issues
# Check WebSocket API deployment
aws apigatewayv2 get-apis

# Verify WebSocket URL in frontend .env
# VITE_WEBSOCKET_URL=wss://your-websocket-id.execute-api.region.amazonaws.com/prod
  2. Strands Agent Errors
# Check agent function logs
sam logs -n MakeCallStreamFunction --stack-name calledit-backend

# Verify Strands dependencies in requirements.txt
# strands-agents>=0.1.0
# strands-agents-tools>=0.1.0
  3. Streaming Issues
  • Ensure WebSocket permissions are configured
  • Check connection timeout settings (5 minutes default)
  • Verify Bedrock streaming permissions:
# Required permissions:
# bedrock:InvokeModel
# bedrock:InvokeModelWithResponseStream
# execute-api:ManageConnections
  4. Authentication Issues
# Verify Cognito configuration
aws cognito-idp describe-user-pool --user-pool-id YOUR_POOL_ID

# Check user status
aws cognito-idp admin-get-user --user-pool-id YOUR_POOL_ID --username USER_EMAIL
  5. Deployment Issues
# Check CloudFormation stack status
aws cloudformation describe-stacks --stack-name calledit-backend

# View deployment events
aws cloudformation describe-stack-events --stack-name calledit-backend

# Validate SAM template
sam validate
  6. Verifiability Category Issues
# Test category classification
python testing/verifiability_category_tests.py

# Check agent logs for category processing
sam logs -n MakeCallStreamFunction --stack-name calledit-backend

# Verify category validation logic
# Categories: agent_verifiable, current_tool_verifiable, strands_tool_verifiable, api_tool_verifiable, human_verifiable_only

Data Flow

The application follows a serverless event-driven architecture with real-time streaming capabilities.

User -> Cognito Auth -> WebSocket API -> Strands Agent -> Bedrock (Reasoning)
                    |                      |              |
                    |                      -> Tools -> Real-time Stream
                    |
                    -> REST API -> Lambda Functions -> DynamoDB

Key component interactions:

  1. User authenticates through Cognito user pool
  2. WebSocket connection established for real-time streaming
  3. Strands agent orchestrates between reasoning model and tools
  4. Streaming responses sent back to frontend via WebSocket
  5. Bedrock provides AI reasoning with InvokeModelWithResponseStream
  6. Tools (current_time, etc.) provide context to the agent
  7. Final predictions stored in DynamoDB via REST API
  8. Frontend receives real-time updates during processing

Infrastructure

The application uses the following AWS resources (see the infrastructure diagram in docs/current/infra.svg):

API Gateways

  • CallitAPI (AWS::Serverless::Api): REST API for CRUD operations
    • Handles authentication and data persistence
    • Implements CORS and Cognito authorization
  • WebSocketApi (AWS::ApiGatewayV2::Api): Real-time streaming
    • Handles WebSocket connections for streaming responses
    • Routes: $connect, $disconnect, makecall
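
To make the routing concrete, here is a minimal sketch of how a Lambda handler can tell the three routes apart. API Gateway WebSocket events carry the matched route in `requestContext.routeKey`. The project deploys separate functions per route, so this single dispatcher is purely illustrative, and the `prompt` body field and all function names are assumptions.

```python
import json


def handle_connect(event):
    # A real handler would typically record the connection ID here.
    return {"statusCode": 200, "body": "connected"}


def handle_disconnect(event):
    return {"statusCode": 200, "body": "disconnected"}


def handle_makecall(event):
    body = json.loads(event.get("body") or "{}")
    prompt = body.get("prompt", "")  # "prompt" is an assumed field name
    if not prompt:
        return {"statusCode": 400, "body": "missing prompt"}
    return {"statusCode": 200, "body": json.dumps({"received": prompt})}


ROUTES = {
    "$connect": handle_connect,
    "$disconnect": handle_disconnect,
    "makecall": handle_makecall,
}


def lambda_handler(event, context):
    """Dispatch on the route key that API Gateway matched for this message."""
    route_key = event["requestContext"]["routeKey"]
    handler = ROUTES.get(route_key)
    if handler is None:
        return {"statusCode": 404, "body": f"unknown route: {route_key}"}
    return handler(event)
```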

Lambda Functions

  • MakeCallStreamFunction: Strands agent with streaming via WebSocket
  • ConnectFunction/DisconnectFunction: WebSocket connection management
  • LogCall: Writes predictions to DynamoDB
  • ListPredictions: Retrieves user predictions
  • AuthTokenFunction: Handles Cognito token exchange

AI & Orchestration

  • Strands Agents: Orchestrate between reasoning models and tools
  • Amazon Bedrock: AI reasoning with streaming support
  • Custom Tools: current_time, date parsing utilities

Authentication

  • CognitoUserPool: Manages user authentication
  • UserPoolClient: Configures OAuth flows
  • UserPoolDomain: Provides hosted UI for authentication

Database

  • DynamoDB table "calledit-db" for storing predictions and verification data

Key Features

  • 🎯 Verifiability Categorization: Automatic classification into 5 categories with AI reasoning
  • ⚡ Real-time Streaming: WebSocket-based streaming for immediate feedback
  • 🤖 Agent Orchestration: Strands agents coordinate AI reasoning and tool usage
  • 🌍 Timezone Intelligence: Automatic timezone handling and 12/24-hour conversion
  • 📋 Structured Verification: AI-generated verification methods with reasoning
  • 🧪 Automated Testing: 100% success rate testing suite for all categories
  • 📊 Visual Category Display: Beautiful UI badges with icons and explanations
  • 💾 Complete Data Persistence: Categories and reasoning stored in DynamoDB
  • 📢 "Crying" System: Celebrate successful predictions with notifications and social sharing
  • 📧 Email Notifications: Get notified when your predictions are verified as TRUE
  • ⚡ Zero Cold Starts: Provisioned concurrency on critical functions eliminates delays
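
The 12/24-hour conversion mentioned under Timezone Intelligence can be sketched with the standard library alone. This is not the project's implementation: `to_utc_deadline` is a hypothetical helper, and it takes a fixed UTC offset as a simplifying assumption where a real system would more likely use named time zones.

```python
from datetime import date, datetime, time, timedelta, timezone


def to_utc_deadline(clock_text: str, local_day: date, utc_offset_hours: float) -> datetime:
    """Turn text like '3pm' on a given local day into an aware UTC datetime."""
    # %I%p parses 12-hour clock text such as "3pm" or "11AM".
    parsed = datetime.strptime(clock_text.strip().replace(" ", ""), "%I%p")
    local_tz = timezone(timedelta(hours=utc_offset_hours))
    local_dt = datetime.combine(local_day, time(parsed.hour), tzinfo=local_tz)
    return local_dt.astimezone(timezone.utc)


# "3pm" for a user at UTC-5 on 2025-01-27 becomes 20:00 UTC the same day.
deadline = to_utc_deadline("3pm", date(2025, 1, 27), -5)
```

Storing the deadline in UTC and converting only for display keeps the stored `verification_date` unambiguous across users in different zones.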

Deployment

Production Deployment

Prerequisites

  • AWS CLI configured with deployment permissions
  • Virtual environment activated
  • All dependencies installed

Backend Deployment

# Activate virtual environment
source venv/bin/activate

# Navigate to backend
cd backend/calledit-backend

# Build and deploy
sam build
sam deploy --no-confirm-changeset

# Note the output URLs:
# - REST API URL for VITE_API_URL
# - WebSocket URL for VITE_WEBSOCKET_URL

Frontend Deployment

# Navigate to frontend
cd frontend

# Update environment variables
# Edit .env with URLs from backend deployment
VITE_API_URL=https://your-api-gateway-url/Prod
VITE_WEBSOCKET_URL=wss://your-websocket-url/prod

# Build for production
npm run build

# Deploy dist/ folder to your hosting service
# (AWS S3 + CloudFront, Netlify, Vercel, etc.)

Deployment Validation

# Run automated tests to verify deployment
python testing/verifiability_category_tests.py wss://your-websocket-url/prod

# Expected: 100% test success rate across all 5 categories

Testing

Automated Verifiability Testing

The project includes a comprehensive automated testing suite that validates the 5-category verifiability system:

# Run the complete test suite
python testing/verifiability_category_tests.py

# Expected output:
# 🚀 Starting Verifiability Category Tests
# ✅ Agent Verifiable - Natural Law
# ✅ Current Tool Verifiable - Time Check
# ✅ Strands Tool Verifiable - Math Calculation
# ✅ API Tool Verifiable - Market Data
# ✅ Human Verifiable Only - Subjective Feeling
# 📊 Success Rate: 100.0%

Test Categories

  • Unit Tests: Backend Lambda functions (/backend/calledit-backend/tests/)
  • Integration Tests: API endpoints and WebSocket flows
  • End-to-End Tests: Complete verifiability categorization validation
  • Performance Tests: Real-time streaming and response times
  • Provisioned Concurrency Tests: Verify zero cold starts on critical functions

Provisioned Concurrency Monitoring

# Test all functions have proper alias + provisioned concurrency setup
python backend/calledit-backend/tests/test_provisioned_concurrency.py

# Expected output:
# 🎯 Overall: 3/3 tests passed
# 🎉 All provisioned concurrency tests PASSED!
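
For readers curious what such a check might look like, here is a minimal sketch rather than the contents of the project's test file. It relies on the Lambda API call `get_provisioned_concurrency_config`, available on a boto3 Lambda client; the helper name and the alias name in the usage comment are assumptions.

```python
def provisioned_concurrency_ready(lambda_client, function_name: str, alias: str) -> bool:
    """Return True when the alias reports READY provisioned concurrency.

    `lambda_client` is expected to behave like boto3.client("lambda").
    If no configuration exists for the alias, the client raises an
    exception, which this sketch lets propagate.
    """
    response = lambda_client.get_provisioned_concurrency_config(
        FunctionName=function_name,
        Qualifier=alias,
    )
    return (
        response.get("Status") == "READY"
        and response.get("AvailableProvisionedConcurrentExecutions", 0) > 0
    )


# Usage (requires boto3 and AWS credentials; names are placeholders):
#   import boto3
#   ok = provisioned_concurrency_ready(boto3.client("lambda"), "my-function-name", "live")
```

Passing the client in as a parameter keeps the check easy to exercise with a stub, without calling AWS.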

See docs/current/TESTING.md for comprehensive testing documentation.

Documentation

Core Documentation

Additional Resources

Environment Configuration

Backend Environment Variables

  • Managed automatically by AWS SAM template
  • Cognito User Pool and Client IDs auto-configured
  • DynamoDB table name: calledit-db

Frontend Environment Variables

# .env file
VITE_API_URL=https://your-api-gateway-url/Prod
VITE_WEBSOCKET_URL=wss://your-websocket-url/prod
VITE_APIGATEWAY=https://your-api-gateway-url/Prod

Monitoring & Maintenance

Health Checks

# Check API health
curl https://your-api-gateway-url/Prod/hello

# Check WebSocket connectivity
# Use browser dev tools or WebSocket testing tool

Log Monitoring

# View Lambda function logs
sam logs -n MakeCallStreamFunction --stack-name calledit-backend --tail

# List the CloudWatch log groups for the stack's Lambda functions
aws logs describe-log-groups --log-group-name-prefix /aws/lambda/calledit-backend

Performance Monitoring

  • CloudWatch Metrics: Lambda invocations, duration, errors
  • API Gateway Metrics: Request count, latency, 4XX/5XX errors
  • DynamoDB Metrics: Read/write capacity, throttling

Rollback Procedures

Backend Rollback

# Cancel an in-progress stack update (CloudFormation rolls back to the last stable state)
aws cloudformation cancel-update-stack --stack-name calledit-backend

# Or deploy previous version
git checkout previous-commit
sam build && sam deploy --no-confirm-changeset

Frontend Rollback

# Rollback to previous build
git checkout previous-commit
npm run build
# Redeploy dist/ folder

Project Status

Current Version: v1.5.1 - 🔧 PRODUCTION DEPLOYMENT & SECURITY HARDENING (2025-08-23)

  • ✅ Verifiability Categorization System: Complete 5-category classification
  • ✅ Real-time Streaming: WebSocket-based AI processing
  • ✅ Automated Testing: 100% success rate test suite
  • ✅ Visual UI: Category badges with reasoning display
  • ✅ Data Persistence: Complete DynamoDB integration
  • ✅ Comprehensive Documentation: API, TRD, and testing docs
  • ✅ Automated Verification System: Strands agent processes ALL predictions every 15 minutes
  • ✅ Production Deployment: EventBridge scheduling, S3 logging, SNS notifications
  • ✅ Frontend Integration: Real-time verification status display with confidence scores
  • ✅ Tool Gap Analysis: MCP tool suggestions for missing verification capabilities
  • ✅ "Crying" Notifications: Email alerts for successful predictions with social sharing setup
  • ✅ Modern UI Design: Complete responsive redesign with educational UX and streaming text effects
  • ✅ Lambda Provisioned Concurrency: Eliminated cold starts on 3 key functions with alias-based architecture
  • ✅ MCP Sampling Review & Improvement System: FULLY OPERATIONAL
    • Complete MCP Sampling pattern with multiple field updates
    • WebSocket routing for improvement workflow (improve_section, improvement_answers)
    • Server-initiated sampling with client-facilitated LLM interactions
    • Human-in-the-loop design with floating status indicators
    • Date conflict resolution ("today" vs "tomorrow" assumptions)
    • Enterprise-grade state management with 4 custom React hooks
  • ✅ Production Infrastructure: CloudFront deployment with security hardening
    • CloudFront distribution (d2w6gdbi1zx8x5.cloudfront.net) with 10s cache TTL
    • Comprehensive security fixes (KMS encryption, log injection prevention)
    • CORS resolution and mobile UI improvements
    • Environment variable configuration management
✅ MCP SAMPLING SYSTEM: PRODUCTION READY

  • 🔍 Strands Review Agent: Complete MCP Sampling implementation
    • Multiple field updates: prediction_statement improvements update verification_date and verification_method
    • Date conflict resolution: Handles "today" vs "tomorrow" assumption conflicts intelligently
    • JSON response processing: Proper parsing of complex improvement responses
  • 🌐 WebSocket Infrastructure: Complete routing and state management
    • Full routing: improve_section and improvement_answers with proper permissions
    • Multiple field update handling: Backend processes complex JSON responses
    • Real-time status indicators: Floating UI elements with smart timing
  • 🎨 Enterprise UX: Production-grade user experience
    • 4 custom React hooks: useReviewState, useErrorHandler, useWebSocketConnection, useImprovementHistory
    • Floating review indicator: Always-visible status during improvement processing
    • Smart state management: Proper status clearing and error handling
  • 🧪 Validation Complete: End-to-end workflow tested and operational
    • Test case: "it will rain" → "NYC tomorrow" → multiple field updates working
    • All components tested: ReviewAgent (10/10), WebSocket routing (3/3), Frontend integration (15/15)
    • Production deployment: All fixes applied and validated

✅ PREVIOUS: Automated Verification System

  • 🤖 Strands Verification Agent: AI-powered prediction verification with 5-category routing
  • ⏰ Automated Processing: Every 15 minutes via EventBridge, processes ALL predictions
  • 🎯 Real-time Status Updates: Frontend displays actual verification results
  • 📊 Tool Gap Detection: Automatic MCP tool suggestions for missing capabilities
  • 📧 Smart Notifications: SNS email alerts for verified TRUE predictions
  • 🗂️ Complete Audit Trail: S3 logging with structured JSON for analysis
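
The 5-category routing described above can be pictured as a lookup from category to verification strategy. The sketch below is a guess at the general shape, not the verification agent's code: the strategy labels, the `status` field, and the choice to skip items that are not `PENDING` are all assumptions, while the category identifiers and the `verifiable_category` field come from this README.

```python
# Hypothetical strategy labels keyed by the README's category identifiers.
STRATEGY_BY_CATEGORY = {
    "agent_verifiable": "reasoning",
    "current_tool_verifiable": "current_time_tool",
    "strands_tool_verifiable": "strands_tool",
    "api_tool_verifiable": "external_api",
    "human_verifiable_only": "human_review",
}


def plan_verification(predictions):
    """Pair each pending prediction with the strategy for its category."""
    plan = []
    for item in predictions:
        if item.get("status") != "PENDING":
            continue  # assumption: already-resolved predictions are skipped
        category = item.get("verifiable_category", "human_verifiable_only")
        # Unknown categories fall back to human review.
        strategy = STRATEGY_BY_CATEGORY.get(category, "human_review")
        plan.append((item["prediction_statement"], strategy))
    return plan
```

A category with no matching tool is where a tool-gap suggestion would naturally be raised.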

Future Roadmap (Phase 3+)

  • 🌐 MCP Tool Integration: Weather, sports, and financial API tools
  • 📊 Analytics Dashboard: User statistics and accuracy tracking
  • 📱 Mobile Application: React Native mobile app
  • 📢 Social Media Integration: Auto-post successful predictions to Twitter, LinkedIn, Facebook
  • 🏆 Leaderboards: Community prediction accuracy rankings
  • 🎉 Crying Dashboard: Showcase your successful predictions with social proof

See CHANGELOG.md for detailed version history.

Contributing

When contributing to CalledIt:

  1. Follow the testing requirements in docs/current/TESTING.md
  2. Ensure all verifiability category tests pass
  3. Update documentation for new features
  4. Maintain the 5-category classification system integrity

Disclaimers

⚠️ DEMONSTRATION PROJECT ONLY

This is a demo/educational project showcasing serverless AI architecture patterns. It is NOT intended for production use.

🚫 Not Production Ready

  • This software is provided for demonstration and educational purposes only
  • DO NOT deploy in production environments without significant additional security review, testing, and hardening
  • No warranties or guarantees are provided regarding security, scalability, or reliability
  • Use entirely at your own risk

💰 AWS Costs Warning

  • This project deploys AWS resources that WILL incur costs
  • You are solely responsible for any AWS charges
  • Monitor your AWS billing dashboard when running this demo
  • Consider using AWS cost alerts and budgets

🔒 Security Notice

  • While security best practices are attempted, this is a demonstration project
  • May contain security vulnerabilities not suitable for production
  • Conduct your own security assessment before any use
  • See SECURITY.md for security considerations

📋 Usage Restrictions

This software may NOT be used for:

  • Any illegal activities under applicable law
  • Harassment, abuse, or harm to individuals or organizations
  • Fraud, deception, or misrepresentation
  • Violation of privacy or data protection laws
  • Any malicious or unethical purposes

🛡️ Liability Disclaimer

  • Use at your own risk - no liability accepted for any damages or issues
  • Authors disclaim all warranties and liability
  • Users assume full responsibility for any consequences of use
  • This software is provided "AS IS" without any guarantees

License

This project is licensed under the MIT License with additional disclaimers - see the LICENSE file for details.

This project is part of an educational/research initiative focused on AI-powered prediction verification systems.
