Clarify stuck eval completion criteria #130

Merged
dzianisv merged 2 commits into main from fix/stuck-eval-complete-guidance on Feb 21, 2026

@dzianisv
Owner

Summary

  • Clarify the COMPLETE guidance in the stuck-detection prompt so that plans and questions are not marked as complete
  • Increase the timeout for the invalid-uuid Telegram integration test to reduce flakiness

Testing

  • npm test
  • npm run eval:judge
  • npm run eval:stuck
  • npm run eval:compression

@dzianisv dzianisv merged commit 37f9731 into main Feb 21, 2026
0 of 2 checks passed
@dzianisv dzianisv deleted the fix/stuck-eval-complete-guidance branch February 21, 2026 02:22