quanru (Collaborator) commented Jan 23, 2026

Summary

This PR refactors the GitHub Actions workflows to use repository variables instead of hardcoded values for model configuration.

Changes

  • Replace hardcoded MIDSCENE_MODEL_NAME value with ${{ vars.MIDSCENE_MODEL_NAME }}
  • Replace hardcoded MIDSCENE_MODEL_FAMILY value with ${{ vars.MIDSCENE_MODEL_FAMILY }}
  • Applied changes across three workflow files:
    • .github/workflows/ai-evaluation.yml
    • .github/workflows/ai-unit-test.yml
    • .github/workflows/ai.yml

Benefits

  • Improved configuration management: Model settings can now be updated centrally through GitHub repository variables
  • Better flexibility: Easy to switch between different models without code changes
  • Consistent approach: Aligns with the existing pattern of using variables for MIDSCENE_MODEL_BASE_URL
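
One consequence of moving these values into repository variables: in GitHub Actions, a reference to a variable that has not been set expands to an empty string, so code that consumes the environment may want to fail fast. A minimal TypeScript sketch, assuming a hypothetical `readModelConfig` helper (it is not part of Midscene) and using only the variable names mentioned in this PR:

```typescript
// Hypothetical helper for illustration only; not part of the Midscene codebase.
interface ModelConfig {
  modelName: string;
  modelFamily: string;
}

type Env = Record<string, string | undefined>;

// Read one required variable. Unset and empty values are treated the same,
// because an undefined `vars.*` entry reaches the job as an empty string.
function requireVar(env: Env, key: string): string {
  const value = env[key]?.trim();
  if (!value) {
    throw new Error(`${key} is not set; define it as a repository variable`);
  }
  return value;
}

function readModelConfig(env: Env): ModelConfig {
  return {
    modelName: requireVar(env, "MIDSCENE_MODEL_NAME"),
    modelFamily: requireVar(env, "MIDSCENE_MODEL_FAMILY"),
  };
}
```

In a Node.js script this could be called as `readModelConfig(process.env)`, so a missing variable surfaces as a clear error rather than as an empty model name.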

Replace hardcoded model name and family with GitHub repository variables
to improve configuration management and flexibility across workflows.

netlify bot commented Jan 23, 2026

Deploy Preview for midscene ready!

🔨 Latest commit: 07e088c
🔍 Latest deploy log: https://app.netlify.com/projects/midscene/deploys/697314875c79520007951ff8
😎 Deploy Preview: https://deploy-preview-1850--midscene.netlify.app

cloudflare-workers-and-pages bot commented Jan 23, 2026

Deploying midscene with Cloudflare Pages

Latest commit: 07e088c
Status: ✅  Deploy successful!
Preview URL: https://5a2d8dfd.midscene.pages.dev
Branch Preview URL: https://chore-use-github-vars-for-mo.midscene.pages.dev

Replace deprecated `more_actions_needed_by_instruction` field with
the new `shouldContinuePlanning` and `finalizeSuccess` fields that
were introduced in the planning API refactor.

The test now correctly checks:
- shouldContinuePlanning is true (task not yet complete)
- finalizeSuccess is not true (no successful completion)
- log is defined (AI provided preamble message)

This aligns with the new complete-task tag mechanism.
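
The three checks listed in this commit message can be sketched as a small predicate. The `PlanningResult` shape below is an assumption that contains only the field names mentioned here; the real planning API type may carry more fields or different types, and `isUnfinishedPlan` is a hypothetical name.

```typescript
// Assumed shape: only the fields named in the commit message.
interface PlanningResult {
  shouldContinuePlanning?: boolean;
  finalizeSuccess?: boolean;
  log?: string;
}

// True when the result describes a task that is still in progress:
// planning should continue, no successful completion was reported,
// and the AI supplied a preamble message in `log`.
function isUnfinishedPlan(result: PlanningResult): boolean {
  return (
    result.shouldContinuePlanning === true &&
    result.finalizeSuccess !== true &&
    result.log !== undefined
  );
}
```

A test would then assert `isUnfinishedPlan(result)` instead of reading the removed `more_actions_needed_by_instruction` field.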

The cache-functionality tests were failing with timeouts because:
1. Tests used the read-only cache strategy
2. The CI environment has no pre-existing cache files
3. In read-only mode with an empty cache, matchCache always returns
   undefined, so the AI waits until the test times out

Solution: Use the read-write strategy, which:
- Creates cache if it doesn't exist
- Uses existing cache if available
- Works correctly in both CI and local environments

This fixes test failures in:
- should work with explicit cache ID configuration
- should work with different cache strategies
- should handle cache with performance considerations
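
The difference between the two strategies can be illustrated with a toy in-memory cache. This is a sketch under stated assumptions, not Midscene's implementation: `SketchCache`, `record`, and `runTwice` are hypothetical names, and only `matchCache` and the strategy names come from the commit message.

```typescript
type CacheStrategy = "read-only" | "read-write";

// Toy cache used only to illustrate the two strategies.
class SketchCache {
  private readonly entries = new Map<string, string>();
  private readonly strategy: CacheStrategy;

  constructor(strategy: CacheStrategy) {
    this.strategy = strategy;
  }

  // Returns the cached value, or undefined on a miss.
  matchCache(prompt: string): string | undefined {
    return this.entries.get(prompt);
  }

  // Stores a result only when the strategy allows writes.
  record(prompt: string, result: string): boolean {
    if (this.strategy === "read-only") return false;
    this.entries.set(prompt, result);
    return true;
  }
}

// Run the same step twice against an initially empty cache (as in CI)
// and report whether each attempt was a cache hit.
function runTwice(strategy: CacheStrategy): [boolean, boolean] {
  const cache = new SketchCache(strategy);
  const prompt = "click the login button";
  const step = (): boolean => {
    const hit = cache.matchCache(prompt) !== undefined;
    if (!hit) cache.record(prompt, "cached-plan");
    return hit;
  };
  const first = step();
  const second = step();
  return [first, second];
}
```

With an empty starting cache, `runTwice("read-only")` never hits, while `runTwice("read-write")` misses once and then hits, which is why the read-write strategy works both in CI and locally.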

The DEBUG=midscene:* environment variable was causing excessive
logging (3000+ lines of AI reasoning output), which led to:
- Log buffer overflow
- Process termination with ELIFECYCLE error
- Truncated test output

Comment out DEBUG so the tests can complete normally.
The tests themselves pass; the failure was caused by log overflow,
not by failing assertions.
@quanru quanru merged commit af82a82 into main Jan 26, 2026
11 of 14 checks passed
@quanru quanru deleted the chore/use-github-vars-for-model-config branch January 26, 2026 03:19