@d42me commented Jan 13, 2026

Note

Introduces a platform-hosted evaluation flow alongside local runs, with CLI ergonomics and async polling/log streaming.

  • New --hosted mode in prime eval run and in the deprecated prime env eval; it requires an owner/name slug and resolves the environment ID from the hub
  • Adds hosted run options: --poll-interval, --no-stream-logs, --timeout-minutes, --allow-sandbox-access, --allow-instances-access, --custom-secrets, --eval-name
  • Implements utils/hosted_eval.py providing HostedEvalConfig, HostedEvalResult, run_hosted_evaluation (create via /hosted-evaluations, poll /evaluations/{id}, stream logs, fetch final stats) and print_hosted_result
  • Updates env.run_eval to parse JSON args (--env-args, --custom-secrets), create hosted config, run via asyncio, print results, and treat non-COMPLETED as failure; retains existing local eval path and installation behavior

Written by Cursor Bugbot for commit 1d4747c.

@d42me d42me marked this pull request as ready for review January 15, 2026 15:24
@cursor cursor bot left a comment

Cursor Bugbot has reviewed your changes and found 1 potential issue.

        console=console,
    ) as live:
        while True:
            await asyncio.sleep(poll_interval)
Missing validation for non-positive poll interval causes unthrottled polling

Low Severity

The poll_interval parameter is passed directly to asyncio.sleep() without validation. If a user provides a zero or negative value via --poll-interval, asyncio.sleep() returns immediately after yielding to the event loop (unlike time.sleep(), which raises ValueError: sleep length must be non-negative). The loop then polls the API with no delay between requests instead of failing with a clear CLI error. The parameter is exposed as a CLI option in evals.py without any bounds checking.
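
One possible fix is to validate the interval before the polling loop starts. The sketch below assumes a hypothetical helper, validate_poll_interval, and an assumed lower bound of one second; neither appears in the PR, and how the check would be wired into the CLI option in evals.py is left open.

```python
import math


def validate_poll_interval(value: float, minimum: float = 1.0) -> float:
    """Return value if it is a usable polling interval, otherwise raise ValueError.

    The `minimum` default of 1.0 second is an assumption for illustration.
    """
    if math.isnan(value) or math.isinf(value):
        raise ValueError(f"--poll-interval must be a finite number, got {value}")
    if value < minimum:
        raise ValueError(
            f"--poll-interval must be at least {minimum} seconds, got {value}"
        )
    return value
```

Calling this once at the CLI boundary and converting the ValueError into a short usage message would give the user a readable error instead of surprising runtime behavior.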
