
Conversation

@connor-tyndall
Contributor

Summary

Fixes #20 - Random testing results in duplicate testing (wasting tokens / time)

  • Adds regression_count column to track how many times each feature has been regression tested
  • Replaces random selection with least-tested-first ordering
  • Ensures even distribution of regression testing across all features

Problem

The current feature_get_for_regression uses func.random() to select features:

.order_by(func.random())

This causes:

  • Same features tested repeatedly while others are never tested
  • Wasted tokens on duplicate testing
  • Incomplete regression coverage
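
To make the failure mode concrete, here is a tiny illustration in plain Python (not the repo's code, just a simulation of random selection): picking 3 of 10 features at random each session revisits some features while leaving others untested.

# Illustration only: random selection (like ORDER BY random() LIMIT 3) revisits
# some features across sessions while leaving others untested.
import random

features = list(range(1, 11))              # 10 feature ids
seen = set()
for session_no in range(1, 4):
    picked = random.sample(features, 3)    # one "regression session"
    seen.update(picked)
    print(f"Session {session_no}: {sorted(picked)}")
print(f"Never tested after 3 sessions: {sorted(set(features) - seen)}")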

Solution

Track regression test frequency and prioritize the least-tested features (see the sketch after this list):

  1. New regression_count column - Tracks how many times each feature has been regression tested
  2. Least-tested-first ordering - order_by(regression_count.asc(), id.asc())
  3. Automatic increment - Counter increases each time a feature is selected
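
A minimal, self-contained sketch of the approach (the real Feature model, the passes flag, the limit parameter, and session handling are assumptions for illustration; only regression_count and the ordering come from this PR):

# Sketch only: column names other than regression_count, the passes flag, and
# the limit parameter are assumptions, not the repo's actual schema.
from sqlalchemy import Boolean, Column, Integer, String, create_engine
from sqlalchemy.orm import declarative_base, sessionmaker

Base = declarative_base()

class Feature(Base):
    __tablename__ = "features"
    id = Column(Integer, primary_key=True)
    name = Column(String, nullable=False)
    passes = Column(Boolean, default=False)                   # assumed "passing" flag
    regression_count = Column(Integer, nullable=False,
                              default=0, server_default="0")  # new column

def feature_get_for_regression(session, limit=3):
    """Return the least-regression-tested passing features and bump their counters."""
    features = (
        session.query(Feature)
        .filter(Feature.passes.is_(True))
        .order_by(Feature.regression_count.asc(), Feature.id.asc())  # least-tested first
        .limit(limit)
        .all()
    )
    for feature in features:
        feature.regression_count += 1      # automatic increment on selection
    session.commit()
    return features

if __name__ == "__main__":
    engine = create_engine("sqlite:///:memory:")
    Base.metadata.create_all(engine)
    session = sessionmaker(bind=engine)()
    session.add_all([Feature(name=f"feature-{i}", passes=True) for i in range(1, 10)])
    session.commit()
    for _ in range(4):                     # cycles [1,2,3], [4,5,6], [7,8,9], [1,2,3]
        print([f.id for f in feature_get_for_regression(session)])

Committing the increment inside the selection call keeps the round-robin state in the database, so it survives across agent sessions.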

Before

Session 1: Features [5, 12, 8]  (random)
Session 2: Features [5, 3, 12]  (random - duplicates!)
Session 3: Features [12, 8, 3]  (random - more duplicates!)

After

Session 1: Features [1, 2, 3]   (count: 0 → 1)
Session 2: Features [4, 5, 6]   (count: 0 → 1)
Session 3: Features [7, 8, 9]   (count: 0 → 1)
...
Session N: Features [1, 2, 3]   (count: 1 → 2, round-robin)

Test plan

  • Verify new databases get the regression_count column
  • Verify existing databases are migrated correctly (see the migration sketch below)
  • Call feature_get_for_regression multiple times - verify different features are returned each time
  • Verify regression_count increments after each call
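
For the "existing databases are migrated correctly" item, the backfill can be as simple as adding the column with a zero default when it is missing. A hedged sketch (the repo's actual migration mechanism, table name, and database path are not shown in this thread, so these are illustrative):

# Illustrative migration: add regression_count to an existing SQLite database
# if it is missing. Table name and database path are assumptions.
import sqlite3

def migrate_regression_count(db_path="features.db"):
    conn = sqlite3.connect(db_path)
    try:
        columns = {row[1] for row in conn.execute("PRAGMA table_info(features)")}
        if "regression_count" not in columns:
            conn.execute(
                "ALTER TABLE features "
                "ADD COLUMN regression_count INTEGER NOT NULL DEFAULT 0"
            )
            conn.commit()
    finally:
        conn.close()

New databases created from the updated model get the column automatically, so only pre-existing databases need this backfill.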

🤖 Generated with Claude Code

Replaces random selection in feature_get_for_regression with a
least-tested-first approach to ensure even distribution of regression
testing across all features.

Changes:
- Add regression_count column to Feature model to track test frequency
- Add database migration for existing databases
- Update feature_get_for_regression to order by regression_count (ascending)
- Increment regression_count when features are selected for testing

This prevents the same features from being tested repeatedly while others
are never tested, reducing wasted tokens and ensuring comprehensive
regression coverage.

Closes leonvanzyl#20

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@meirm

meirm commented Jan 19, 2026

I am using this PR on my server and it works.

sundog75 added a commit to sundog75/autocoder that referenced this pull request Jan 24, 2026
Implements the regression_count column and feature_get_for_regression MCP tool
to ensure even distribution of regression testing across all passing features.

Changes:
- Add regression_count column to Feature model with migration
- Add feature_get_for_regression MCP tool that:
  - Returns passing features ordered by regression_count (ascending)
  - Increments count after selection for round-robin behavior
  - Prevents duplicate testing of same features
- Remove unused RegressionInput class

Based on PR leonvanzyl#47 by connor-tyndall, cleanly reimplemented to avoid merge conflicts.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
rudiheydra added a commit to rudiheydra/AutoBuildr that referenced this pull request Jan 27, 2026

Support forbidden_tools blacklist for explicit blocking in addition to
allowed_tools whitelist.

Implementation:
- Add ForbiddenToolBlocked exception class for forbidden tool violations
- Add extract_forbidden_tools() to extract blacklist from tool_policy
- Update ToolPolicyEnforcer to include forbidden_tools field
- Update validate_tool_call() to check forbidden_tools after allowed_tools
- Add create_forbidden_tools_violation() for PolicyViolation creation
- Add record_forbidden_tools_violation() for event recording
- Add get_forbidden_tool_error_message() for clear agent messages
- Update VIOLATION_TYPES to include "forbidden_tools"
- Export new functions from api/__init__.py

All 5 feature steps verified:
1. Extract forbidden_tools from spec.tool_policy - PASS
2. After filtering by allowed_tools, also remove forbidden_tools - PASS
3. Block any tool call to forbidden tool - PASS
4. Record policy violation event - PASS
5. Return clear error message to agent - PASS

Test results:
- tests/test_feature_47_forbidden_tools.py: 47/47 tests PASS
- tests/verify_feature_47.py: 21/21 verification checks PASS

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
rudiheydra added a commit to rudiheydra/AutoBuildr that referenced this pull request Jan 27, 2026
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
