Update README with agentrial and Chidori details #39

Open
alepot55 wants to merge 1 commit into e2b-dev:main from alepot55:patch-1

alepot55 commented Feb 6, 2026

AI agents pass benchmarks but fail in production. Why? Single-run evaluations hide variance. agentrial runs your agent N times, computes Wilson confidence intervals, and uses Fisher exact tests to detect regressions in CI/CD. pip install agentrial, write a YAML, done.

Added sections for agentrial and Chidori with descriptions.
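
For readers unfamiliar with the two statistics named above, here is a minimal standard-library sketch of a Wilson score interval and a one-sided Fisher exact test on pass/fail counts. It is not agentrial's API or implementation: the function names `wilson_interval` and `fisher_exact_regression`, their parameters, and the example counts are all assumptions made for illustration.

```python
import math


def wilson_interval(successes: int, trials: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a pass rate (z=1.96 is roughly 95% confidence)."""
    if trials <= 0:
        raise ValueError("trials must be positive")
    p = successes / trials
    denom = 1 + z * z / trials
    center = (p + z * z / (2 * trials)) / denom
    half = z * math.sqrt(p * (1 - p) / trials + z * z / (4 * trials * trials)) / denom
    return max(0.0, center - half), min(1.0, center + half)


def fisher_exact_regression(base_pass: int, base_fail: int,
                            new_pass: int, new_fail: int) -> float:
    """One-sided Fisher exact p-value: probability of seeing new_pass or fewer
    passes in the new runs if both versions shared the same true pass rate."""
    n_base = base_pass + base_fail
    n_new = new_pass + new_fail
    total_pass = base_pass + new_pass
    denom = math.comb(n_base + n_new, total_pass)
    # Sum the hypergeometric tail over every feasible count of new-run passes.
    lowest = max(0, total_pass - n_base)
    tail = sum(
        math.comb(n_new, x) * math.comb(n_base, total_pass - x)
        for x in range(lowest, new_pass + 1)
    )
    return tail / denom


if __name__ == "__main__":
    # Hypothetical counts: baseline passed 10 of 10 runs, candidate passed 5 of 10.
    low, high = wilson_interval(5, 10)
    p_value = fisher_exact_regression(10, 0, 5, 5)
    print(f"candidate pass rate interval: [{low:.3f}, {high:.3f}]")
    print(f"regression p-value: {p_value:.4f}")
```

In a CI job, a check along these lines could fail the build when the p-value drops below a chosen threshold such as 0.05. The threshold and the number of runs are choices for the user, not values stated in this PR.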