Questionable AI (qAI)

Python 3.14+ · License: MIT · pre-commit

Cross-vendor multi-model debate and consensus engine for AI response distillation.

Phase 1 complete — working CLI with cross-vendor debate + reflection rounds. First live 4-vendor debate: 41k tokens across Claude, GPT, Gemini, and Grok.

Sends a user query to multiple AI models simultaneously, shares competing responses back to each model for reflection and critique, then synthesizes a final answer through a user-selected model.

How It Works

  1. Fan out — Query goes to Claude, GPT, Gemini, and Grok via OpenRouter
  2. Reflect — Each model sees the others' responses and argues back
  3. Synthesize — A user-selected model distills the debate into a final answer
  4. Log — Full debate transcript saved as structured JSON
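The four steps above can be sketched roughly as follows. This is a hypothetical illustration, not qAI's actual implementation: the panel names, `ask_model` stub, and transcript fields are assumptions, and the real tool routes calls through OpenRouter.

```python
import asyncio
import json

# Hypothetical panel; the real qAI routes these models through OpenRouter.
PANEL = ["claude", "gpt", "gemini", "grok"]

async def ask_model(model: str, prompt: str) -> str:
    # Stub standing in for a real API call.
    await asyncio.sleep(0)
    return f"{model} answer to: {prompt}"

async def debate(query: str, synthesizer: str = "claude") -> dict:
    # 1. Fan out: query every panel model concurrently.
    answers = dict(zip(PANEL, await asyncio.gather(
        *(ask_model(m, query) for m in PANEL))))

    # 2. Reflect: each model sees the others' responses and argues back.
    reflections = {}
    for model in PANEL:
        others = {m: a for m, a in answers.items() if m != model}
        critique = f"{query}\nOther models said: {json.dumps(others)}"
        reflections[model] = await ask_model(model, critique)

    # 3. Synthesize: one user-selected model distills the debate.
    final = await ask_model(synthesizer, json.dumps(reflections))

    # 4. Log: the full transcript as a structured, JSON-serializable dict.
    return {"query": query, "answers": answers,
            "reflections": reflections, "final": final}

transcript = asyncio.run(debate("Is P equal to NP?"))
print(transcript["final"])
```

The reflection round is sequential here for clarity; it could fan out with `asyncio.gather` just like step 1.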

Why Cross-Vendor?

Single-vendor multi-agent systems (Grok's 4-agent debate, Anthropic's agent teams) share the same training data and blind spots. Cross-vendor debate surfaces disagreements that correlated architectures can't — different training data, different safety postures, different failure modes.

Installation

```shell
git clone https://github.com/richardspicer/questionable-ai.git
cd questionable-ai
uv sync
```

Usage

```shell
qai ask "Your query here"
qai ask "Your query here" --synthesizer claude
qai ask "Your query here" --rounds 2
qai ask "Your query here" --panel claude,gpt,gemini
```

`questionable-ai` also works as the full command name.

Status

Phase 1: Foundation — Complete. Phase 1.5 (provider abstraction for direct vendor APIs) in progress. See Roadmap for the full plan.
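One way the Phase 1.5 provider abstraction could look is a small protocol that lets callers swap OpenRouter for direct vendor APIs. Everything below is a sketch under that assumption; the class and method names are hypothetical, not qAI's actual interface.

```python
from typing import Protocol

class Provider(Protocol):
    # Hypothetical interface: any backend that can complete a prompt.
    name: str
    def complete(self, model: str, prompt: str) -> str: ...

class OpenRouterProvider:
    name = "openrouter"
    def complete(self, model: str, prompt: str) -> str:
        # Stub; a real implementation would call the OpenRouter HTTP API.
        return f"[{self.name}:{model}] {prompt}"

class AnthropicProvider:
    name = "anthropic"
    def complete(self, model: str, prompt: str) -> str:
        # Stub; a real implementation would call the vendor API directly.
        return f"[{self.name}:{model}] {prompt}"

def route(provider: Provider, model: str, prompt: str) -> str:
    # Callers depend only on the Provider protocol, not a concrete vendor.
    return provider.complete(model, prompt)

print(route(OpenRouterProvider(), "claude-3", "hello"))
```

The benefit of a structural protocol here is that new vendors plug in without touching the debate engine.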

Research Platform

Full debate transcripts are logged as structured JSON, enabling analysis of:

  • Disagreement patterns — where do models consistently diverge?
  • Convergence dynamics — how many rounds until consensus? Which models cave first?
  • Consensus poisoning — can a deliberately wrong claim propagate through reflection?
  • Unanimous hallucinations — when all models confidently agree on the wrong answer
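Because transcripts are plain JSON, analyses like the above can start from a few lines of scripting. The sketch below measures pairwise disagreement with a crude token-set Jaccard index; the transcript field names are assumptions, not qAI's actual schema.

```python
import json

# Hypothetical transcript shape; "answers" keyed by model is an assumption.
transcript = json.loads("""
{"query": "Capital of Australia?",
 "answers": {"claude": "Canberra is the capital.",
             "gpt": "Canberra.",
             "gemini": "Sydney is the capital.",
             "grok": "Canberra is the capital city."}}
""")

def jaccard(a: str, b: str) -> float:
    # Crude lexical similarity: Jaccard index over lowercased token sets.
    sa, sb = set(a.lower().split()), set(b.lower().split())
    return len(sa & sb) / len(sa | sb)

# Disagreement for every model pair: 1 - similarity.
models = sorted(transcript["answers"])
for i, m in enumerate(models):
    for n in models[i + 1:]:
        d = 1 - jaccard(transcript["answers"][m], transcript["answers"][n])
        print(f"{m} vs {n}: disagreement {d:.2f}")
```

Lexical overlap is a weak proxy for semantic disagreement; embedding similarity or an LLM judge would be the natural next step for the convergence and poisoning questions.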

Documentation

License

MIT — see LICENSE for details.

Author

Richard Spicer — Security research at MLSecOps Lab
