@quanru quanru commented Nov 24, 2025

No description provided.

Copilot AI and others added 30 commits October 17, 2025 17:12
* Initial plan

* fix(cli): allow duplicate YAML files in config.yaml

Co-authored-by: quanru <11739753+quanru@users.noreply.github.com>

* fix(cli): deep clone YAML script to prevent mutation issues

* fix(yaml): prevent mutation of flowItem by creating a new object for processing
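
A minimal sketch of why the deep clone matters when the same YAML file is listed twice, assuming the parsed script is plain JSON-compatible data. `ParsedScript`, `runScript`, and the JSON-based `deepClone` are hypothetical names for illustration and are not taken from the CLI's code.

```typescript
// Hypothetical shape of a parsed YAML script (illustration only).
interface ParsedScript {
  tasks: { label: string; flow: string[] }[];
}

// JSON round-trip clone; enough for plain YAML data (no functions or Dates).
function deepClone<T>(value: T): T {
  return JSON.parse(JSON.stringify(value)) as T;
}

// A runner that consumes its input in place, standing in for any mutating step.
function runScript(script: ParsedScript): number {
  let steps = 0;
  for (const task of script.tasks) {
    while (task.flow.length > 0) {
      task.flow.shift();
      steps++;
    }
  }
  return steps;
}

const parsed: ParsedScript = {
  tasks: [{ label: "login", flow: ["open", "type", "click"] }],
};

// The same file appears twice in config.yaml: each run gets its own copy.
const firstRun = runScript(deepClone(parsed));
const secondRun = runScript(deepClone(parsed));
```

Without the clone, the second run would see an already emptied `flow` array.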

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: quanru <11739753+quanru@users.noreply.github.com>
Co-authored-by: quanruzhuoxiu <quanruzhuoxiu@gmail.com>
….x (#1325)

* refactor(core): remove non-OpenAI SDK support and upgrade to OpenAI 6.x

This commit removes support for Anthropic SDK and Azure OpenAI, simplifying
the codebase to use only the standard OpenAI SDK with OpenAI-style APIs.

Changes:
- Remove Anthropic SDK (@anthropic-ai/sdk) dependency
- Remove Azure OpenAI specific code and @azure/identity dependency
- Remove langsmith wrapper support
- Remove proxy agent support (https-proxy-agent, socks-proxy-agent)
- Upgrade OpenAI SDK from 4.81.0 to 6.3.0
- Simplify createChatClient function to only create standard OpenAI clients
- Remove 'style' parameter from createChatClient return type
- Remove all Anthropic-specific message handling code
- Add openai 6.3.0 as devDependency to @midscene/shared

Benefits:
- Cleaner, more maintainable codebase
- Reduced dependencies (removed 5 packages)
- All AI providers can now be accessed through OpenAI-compatible APIs

Breaking Changes:
- Anthropic SDK mode no longer supported
- Azure OpenAI specific configuration removed
- MIDSCENE_LANGSMITH_DEBUG no longer supported
- httpAgent/socksProxy removed from createChatClient

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* refactor(core): model provider documentation and remove Azure and Anthropic configurations

* Apply suggestion from @Copilot

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* feat(core): add proxy support for OpenAI client with HTTP and SOCKS configurations

* feat(core): add qwen-vl specific configuration for high resolution images

---------

Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: yuyutaotao <167746126+yuyutaotao@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
This change ensures that Planning functionality only supports vision
language models (VL mode) and removes DOM-based planning support.

Changes:
- Add validation in ModelConfigManager.getModelConfig() to require
  VL mode for Planning intent
- Remove DOM mode logic from llm-planning.ts (describeUserPage,
  markupImageForLLM)
- Simplify image processing to only support VL mode paths
- Add comprehensive JSDoc documentation for Planning VL mode
  requirement
- Add 6 new unit tests covering Planning VL mode validation in both
  isolated and normal modes
- Fix existing tests to provide VL mode for Planning intent

Breaking Change:
- Planning without VL mode configured will now throw an error with
  clear instructions
- Error message includes all supported VL modes and configuration
  examples
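
The validation described above could look roughly like the sketch below. It is an illustration under stated assumptions: `IntentModelConfig`, the free-standing `getModelConfig` function, and the error wording are hypothetical, and the list of supported VL modes is passed in by the caller rather than taken from the real code.

```typescript
type Intent = "planning" | "grounding" | "VQA" | "default";

// Hypothetical per-intent config; vlMode undefined means no VL mode is set.
interface IntentModelConfig {
  modelName: string;
  vlMode?: string;
}

function getModelConfig(
  configs: Record<Intent, IntentModelConfig>,
  intent: Intent,
  supportedVlModes: readonly string[],
): IntentModelConfig {
  const config = configs[intent];
  // Only the planning intent is required to have a VL mode.
  if (intent === "planning" && !config.vlMode) {
    throw new Error(
      `Planning requires a vision language model (VL mode). ` +
        `Supported VL modes: ${supportedVlModes.join(", ")}`,
    );
  }
  return config;
}

const baseConfig: IntentModelConfig = { modelName: "example-model" };
const withoutVl: Record<Intent, IntentModelConfig> = {
  planning: baseConfig,
  grounding: baseConfig,
  VQA: baseConfig,
  default: baseConfig,
};
const withVl: Record<Intent, IntentModelConfig> = {
  ...withoutVl,
  planning: { modelName: "example-vl-model", vlMode: "qwen-vl" },
};
```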

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Claude <noreply@anthropic.com>
* chore(core): remove warning msg for gpt-4

* chore(core): remove dom-based locator
* chore(core): refine recorder loop

* feat(core): update implementation of recorder
* refactor(core,web-integration,docs): rename API methods for clarity

BREAKING CHANGE: Renamed aiAction() to aiAct() and logScreenshot()
to recordToReport() for improved naming consistency. The aiAction()
method is kept as deprecated for backward compatibility.

Changes:
- Renamed aiAction() to aiAct() across core and web-integration
- Renamed logScreenshot() to recordToReport()
- Updated all English and Chinese documentation
- Updated code examples in README files
- Updated Playwright fixture to support new method names
- Added deprecation warning for aiAction() method
- Updated all test files and examples

This improves API consistency and clarity while maintaining
backward compatibility through deprecated methods.
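
A small sketch of the deprecated-alias pattern described above. `ExampleAgent` is a hypothetical stand-in, the methods are synchronous for brevity (the real ones are presumably asynchronous), and the warning text is an assumption.

```typescript
class ExampleAgent {
  readonly warnings: string[] = [];
  private warned = false;

  // New name.
  aiAct(prompt: string): string {
    return `acted: ${prompt}`;
  }

  /** @deprecated Use aiAct() instead. */
  aiAction(prompt: string): string {
    // Warn once, then forward to the new method.
    if (!this.warned) {
      this.warned = true;
      this.warnings.push("aiAction() is deprecated, use aiAct() instead");
    }
    return this.aiAct(prompt);
  }
}

const exampleAgent = new ExampleAgent();
const viaOldName = exampleAgent.aiAction("click the login button");
const viaNewName = exampleAgent.aiAct("click the login button");
```

Existing callers keep working through `aiAction()` while new code uses `aiAct()`.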

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* feat(yaml): add backward compatibility for aiAction method in YAML flow

* fix(core): conditionally add httpAgent to OpenAI client options

Fix TypeScript compilation error where httpAgent property doesn't
exist in OpenAI 6.x ClientOptions type. Only include httpAgent
when a proxy is configured, and use type assertion to bypass the
strict type check.
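
A sketch of the conditional-property approach, using a local stand-in type instead of the SDK's `ClientOptions` so that it stays self-contained. `ClientOptionsLike` and `buildClientOptions` are hypothetical names, and whether the SDK honors the extra field at runtime is not shown here.

```typescript
// Stand-in for an options type that, like the one described above, has no httpAgent field.
interface ClientOptionsLike {
  apiKey: string;
  baseURL?: string;
}

function buildClientOptions(
  apiKey: string,
  proxyAgent?: object, // e.g. an agent built from a proxy URL (hypothetical)
): ClientOptionsLike {
  const options = {
    apiKey,
    // The spread adds the key only when a proxy agent exists.
    ...(proxyAgent ? { httpAgent: proxyAgent } : {}),
  };
  // The type assertion mirrors the commit's way of passing the extra property.
  return options as ClientOptionsLike;
}

const plainOptions = buildClientOptions("sk-placeholder");
const proxiedOptions = buildClientOptions("sk-placeholder", { proxy: true });
```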

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

---------

Co-authored-by: Claude <noreply@anthropic.com>
* chore(core): update implementation of insight

* chore(core): refine error plan

* chore(core): refine error plan

* chore(core): split tasks into multiple parts

* fix(core): fix ci
* chore(release): upgrade all packages to v1.0.0

- Bump version from 0.30.4 to 1.0.0 for all packages
- Update Chrome extension manifest version to 0.136
- Update internal package dependencies to 1.0.0

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* feat(release): add validation to prevent 1.x stable releases

- Block publishing of 1.x versions with 'latest' tag
- Allow publishing 1.x beta versions (prepatch)
- Allow publishing stable versions for other major versions (0.x, 2.x, etc.)

This ensures that 1.x releases can only be published as beta versions,
preventing accidental stable releases while still allowing testing and
pre-release distributions.
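
The rule could be sketched as a small predicate. `canPublish` is a hypothetical name, and the exact checks in the release script may differ; this only encodes the three bullets above.

```typescript
function canPublish(version: string, npmTag: string): boolean {
  const match = /^(\d+)\.(\d+)\.(\d+)(-[0-9A-Za-z.-]+)?$/.exec(version);
  if (!match) throw new Error(`Invalid version: ${version}`);
  const major = Number(match[1]);
  const isPrerelease = match[4] !== undefined;
  // Block 1.x when it is a stable version or is headed for the 'latest' tag.
  if (major === 1 && (npmTag === "latest" || !isPrerelease)) return false;
  // Everything else (1.x betas, stable 0.x, 2.x, ...) is allowed.
  return true;
}

const stableOneBlocked = canPublish("1.0.0", "latest");
const betaOneAllowed = canPublish("1.0.1-beta.0", "beta");
const stableZeroAllowed = canPublish("0.30.6", "latest");
```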

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

---------

Co-authored-by: Claude <noreply@anthropic.com>
* refactor(core): remove unused getXpathsById method

This method was not being used in the codebase. Removed:
- Core implementation in shared/src/extractor/locator.ts
- Export from shared/src/extractor/index.ts
- Implementations in puppeteer/base-page.ts, chrome-extension/page.ts, and static/static-page.ts
- All related unit tests

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* refactor(types): rename AndroidPullParam and AndroidLongPressParam to PullParam and LongPressParam

---------

Co-authored-by: Claude <noreply@anthropic.com>
…1341)

* feat(core): support custom OpenAI client instances for observability

Enable users to provide custom OpenAI client factory function through
AgentOpt.createOpenAIClient, allowing integration with observability
tools like langsmith and langfuse.

Key changes:
- Add CreateOpenAIClientFn type in @midscene/shared/env for creating
  custom OpenAI clients
- Extend AgentOpt interface with optional createOpenAIClient callback
- Pass callback through Agent -> ModelConfigManager -> IModelConfig
- Inject createOpenAIClient during config initialization for better
  performance
- Update createChatClient to use custom client factory when provided

Benefits:
- Users can wrap OpenAI clients with langsmith's wrapOpenAI() for
  tracing
- Users can wrap with langfuse's observeOpenAI() for logging
- Support different clients for different intents (planning, grounding,
  VQA, default)
- Zero runtime overhead - injection happens during config initialization
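
A sketch of the factory idea with stand-in types, so it runs without the OpenAI SDK. `ChatClientLike`, `CreateClientFn`, and `withTracing` are hypothetical; the real `CreateOpenAIClientFn` signature may differ, and `withTracing` only plays the role that a wrapper such as `wrapOpenAI()` or `observeOpenAI()` would.

```typescript
// Minimal stand-in for an OpenAI-style client (not the real SDK type).
interface ChatClientLike {
  complete(prompt: string): string;
}

interface ClientConfig {
  baseURL: string;
  apiKey: string;
}

// Hypothetical factory signature.
type CreateClientFn = (config: ClientConfig) => ChatClientLike;

function defaultClient(config: ClientConfig): ChatClientLike {
  return { complete: (prompt) => `[${config.baseURL}] ${prompt}` };
}

// Uses the custom factory when provided, otherwise the default client.
function createChatClient(config: ClientConfig, createClient?: CreateClientFn): ChatClientLike {
  return createClient ? createClient(config) : defaultClient(config);
}

// A tracing wrapper that records every prompt before delegating.
const tracedPrompts: string[] = [];
function withTracing(inner: ChatClientLike): ChatClientLike {
  return {
    complete(prompt) {
      tracedPrompts.push(prompt);
      return inner.complete(prompt);
    },
  };
}

const tracedClient = createChatClient(
  { baseURL: "https://example.invalid/v1", apiKey: "sk-placeholder" },
  (config) => withTracing(defaultClient(config)),
);
const tracedReply = tracedClient.complete("hello");
```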

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* test(core): add unit tests for custom OpenAI client integration in ModelConfigManager and service-caller

* Update packages/shared/tests/unit-test/env/modle-config-manager.test.ts

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* refactor(core): remove unused MIDSCENE_API_TYPE constant from service-caller and types

---------

Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
* chore(ci): enable workflows for PRs targeting 1.0 branch

Add 1.0 branch to pull_request triggers in CI and lint workflows to ensure
PRs targeting the 1.0 branch run the same checks as PRs to main.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* tests(shared, web-integration): update tests to use runner instead of executor and improve environment setup

---------

Co-authored-by: Claude <noreply@anthropic.com>
* docs(awesome): add midscene java sdk (#1324)

* fix(core): support number type for aiInput value field (#1339)

* fix(core): support number type for aiInput value field

This change allows aiInput.value to accept both string and number types,
addressing scenarios where:
1. AI models return numeric values instead of strings
2. YAML files contain unquoted numbers that parse as number type

Changes:
- Updated type definitions to accept string | number
- Added Zod schema transformation to convert numbers to strings
- Updated runtime validation to accept both types
- Added explicit conversion in YAML player as fallback

All conversions happen internally and are transparent to users.
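
As an illustration of the conversion (without Zod, and with a hypothetical helper name), the normalization amounts to:

```typescript
// Accepts what YAML or a model may produce and always returns a string.
function normalizeInputValue(value: string | number): string {
  return typeof value === "number" ? String(value) : value;
}

// An unquoted YAML scalar such as `value: 12345` parses as a number.
const fromYamlNumber = normalizeInputValue(12345);
// Quoted values are already strings and pass through unchanged.
const fromYamlString = normalizeInputValue("007");
```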

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* fix(core): update aiInput type signatures to accept number values

Update the TypeScript method signatures for aiInput to accept
string | number for the value parameter, matching the runtime
implementation.

Changes:
- New signature: opt parameter now accepts { value: string | number }
- Legacy signature: first parameter now accepts string | number
- Implementation signature: locatePromptOrValue now accepts TUserPrompt | string | number
- Type assertion updated from `as string` to `as string | number`

This ensures type safety and allows users to pass number values
directly without TypeScript errors, while maintaining backward
compatibility with existing string-based usage.

Fixes type errors in test cases that use number values.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

---------

Co-authored-by: Claude <noreply@anthropic.com>

* fix(report): prevent sidebar jitter when expanding case selector (#1344)

Fixed sidebar shifting 1-2 pixels when clicking to expand the
playwright case selector. The issue was caused by adding a border
only in the expanded state, causing a sudden height change.

Solution: Added transparent border to the collapsed state, ensuring
consistent height across both states.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Claude <noreply@anthropic.com>

* refactor(core): unify cache config parameters (#1346)

Simplified `processCacheConfig` function signature from 3 to 2 parameters.
Unified `fallbackId` and `cacheId` into single `cacheId` parameter.

BREAKING CHANGE: processCacheConfig signature changed

Changed from:
  processCacheConfig(cache, fallbackId, cacheId?)
To:
  processCacheConfig(cache, cacheId)

The cacheId parameter now serves a dual purpose:
1. Fallback ID when cache is true or cache object lacks ID
2. Legacy cacheId when cache is undefined (requires MIDSCENE_CACHE env)
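
A sketch of the two-parameter semantics described above. The return shape, the `strategy` field, and the `legacyEnvEnabled` flag (standing in for reading the MIDSCENE_CACHE environment variable) are assumptions made for illustration, not the actual implementation.

```typescript
type CacheOption = boolean | { id?: string; strategy?: string } | undefined;

interface ResolvedCache {
  id: string;
  strategy?: string;
}

function processCacheConfigSketch(
  cache: CacheOption,
  cacheId: string,
  legacyEnvEnabled: boolean,
): ResolvedCache | undefined {
  if (cache === undefined) {
    // Legacy mode: cacheId is used only when the environment flag is on.
    return legacyEnvEnabled ? { id: cacheId } : undefined;
  }
  if (cache === false) return undefined;
  // cache: true falls back to cacheId.
  if (cache === true) return { id: cacheId };
  // Cache object: keep its own id, or fall back to cacheId when it lacks one.
  return { ...cache, id: cache.id ?? cacheId };
}

const fromTrue = processCacheConfigSketch(true, "fallback-id", false);
const fromObject = processCacheConfigSketch({ strategy: "read-only" }, "fallback-id", false);
const fromLegacy = processCacheConfigSketch(undefined, "legacy-id", true);
```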

Updated call sites:
- packages/core/src/agent/agent.ts
- packages/web-integration/src/playwright/ai-fixture.ts
- packages/cli/src/create-yaml-player.ts (4 locations)

Added comprehensive test coverage for legacy compatibility mode:
- process-cache-config.test.ts: 18 tests passing
- create-yaml-player.test.ts: 13 tests passing (6 new)
- playwright-ai-fixture-cache.test.ts: 8 tests passing (3 new)

Benefits:
- Simpler API with fewer parameters
- Unified semantics for new and legacy use cases
- Full backward compatibility maintained
- Better test coverage

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Claude <noreply@anthropic.com>

* fix(core,web-integration): fix unit tests after merging main branch

This commit fixes unit test failures that occurred after merging the
main branch into the 1.0 branch. The issues were caused by temporal
conflicts between commits that added new features and subsequent
refactoring.

Root Cause:
- Commit 13b4f1d added aiInput number support with tests using
  'executor'
- Commit c9b385b refactored Executor → TaskRunner in the 1.0 branch
- When main was merged, tests still referenced 'executor' but code
  used 'runner'

Changes:
1. Fix YAML player aiInput number conversion
   (packages/core/src/yaml/player.ts):
   - Extract 'value' field separately to prevent spread override
   - Ensure number values are converted to strings via String(value)
   - Maintain backward compatibility for empty string handling

2. Fix test mock structure
   (packages/web-integration/tests/unit-test/ai-input-number-value.test.ts):
   - Update all mock objects from 'executor' to 'runner'
   - Aligns with TaskRunner API refactoring

3. Fix cache config test
   (packages/web-integration/tests/unit-test/playwright-ai-fixture-cache.test.ts):
   - Move vi.mock() before imports to ensure proper module hoisting
   - Fixes legacy mode environment variable checks

4. Add value conversion in agent.ts (optional improvement):
   - Explicitly convert number to string in aiInput method
   - Improves code clarity and test stability
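
The spread override mentioned in item 1 can be reproduced in isolation. This is a guess at the general shape of the problem, not the player's code; `InputFlowItem` and both helper functions are hypothetical.

```typescript
interface InputFlowItem {
  locate: string;
  value?: string | number; // may be a number when the YAML scalar is unquoted
}

// Problem pattern: the spread comes last, so the raw number overrides the converted value.
function buildOptionsWithOverride(item: InputFlowItem) {
  return { value: String(item.value ?? ""), ...item };
}

// Fixed pattern: pull `value` out first so the spread cannot override it.
function buildOptionsFixed(item: InputFlowItem) {
  const { value, ...rest } = item;
  return { ...rest, value: String(value ?? "") };
}

const overridden = buildOptionsWithOverride({ locate: "the age field", value: 42 });
const converted = buildOptionsFixed({ locate: "the age field", value: 42 });
```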

All tests now pass (195 passed, 1 skipped).

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

---------

Co-authored-by: yuyutaotao <167746126+yuyutaotao@users.noreply.github.com>
Co-authored-by: Claude <noreply@anthropic.com>
* chore(core): update types of task executor

* chore(core): update sleep tasks

* chore(core): update types for planning

* feat(core): update subTask flag
* chore(lint): fix linting and formatting issues

- Fix useless switch case in modle-config-manager.test.ts
- Format package.json files for consistency
- Apply code formatting across core, agent, and related files

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* chore(deps): update openai package to version 6.3.0

---------

Co-authored-by: Claude <noreply@anthropic.com>
* feat(chrome-extension): enable hot reload for development

This commit adds hot reload support for chrome-extension development,
significantly improving the development experience.

Main changes:
- Add web-ext integration for automatic extension reloading
- Add wait-for-build.js script to ensure build completes first
- Update dev script to use concurrently for build watch + web-ext
- Add web-ext-config.cjs for web-ext configuration

To fix build stability during hot reload:
- Replace npm-watch with rslib native watch mode in visualizer
- Standardize dev/build:watch script relationship across packages
- This prevents dist directory deletion during rebuilds

The rslib native watch mode performs incremental builds without
deleting the dist directory, preventing "Module not found" errors
when chrome-extension references @midscene/visualizer.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* fix(chrome-extension): wait for JS bundles before starting web-ext

The previous implementation only checked for static files (manifest.json,
index.html) which are copied early in the build process. This caused web-ext
to start before the JavaScript bundles were built, resulting in errors.

Now we check for the actual build outputs:
- dist/static/js/index.js
- dist/static/js/popup.js

This ensures web-ext only starts after Rsbuild has completed the full build.
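
A sketch of the readiness check, with the file-existence test injected so the example stays self-contained. `missingOutputs` is a hypothetical helper; the actual script presumably polls the file system until nothing is missing.

```typescript
// Real bundle outputs, as listed above.
const REQUIRED_OUTPUTS = ["dist/static/js/index.js", "dist/static/js/popup.js"];

// Returns the required outputs that do not exist yet.
function missingOutputs(
  exists: (path: string) => boolean,
  required: readonly string[] = REQUIRED_OUTPUTS,
): string[] {
  return required.filter((path) => !exists(path));
}

// Simulated file listings: static files are copied early, bundles arrive later.
const midBuildFiles = ["manifest.json", "index.html"];
const afterBuildFiles = midBuildFiles.concat(REQUIRED_OUTPUTS);

const stillMissing = missingOutputs((path) => midBuildFiles.indexOf(path) !== -1);
const readyToStart = missingOutputs((path) => afterBuildFiles.indexOf(path) !== -1).length === 0;
```

Checking only `manifest.json` and `index.html` would have reported the mid-build state as ready.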

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* chore(deps): align Rsbuild plugin versions across workspace

Update all Rsbuild plugins to use consistent versions:
- @rsbuild/plugin-less: 1.5.0
- @rsbuild/plugin-node-polyfill: 1.4.2
- @rsbuild/plugin-react: 1.4.1
- @rsbuild/plugin-svgr: 1.2.2
- @rsbuild/plugin-type-check: 1.2.4

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

---------

Co-authored-by: Claude <noreply@anthropic.com>
* release: v0.30.5

* docs(site): optimize v0.30 changelog with user-focused improvements (#1352)

Improved the v0.30 changelog to be more user-centric and less promotional:

- Reduced hyperbolic language ("comprehensive upgrade" → "improved", etc.)
- Reorganized content structure with clearer user value sections
- Added specific usage scenarios and examples for cache strategies
- Enhanced mobile platform sections with iOS and Android subsections
- Simplified technical descriptions to be more objective
- Added cross-platform consistency section for ClearInput feature
- Translated optimized content to English version

These changes make the changelog more professional and make it easier for
users to understand the actual benefits of the update.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Claude <noreply@anthropic.com>

* fix(ios): correct horizontal scroll direction and improve swipe implementation (#1358)

* fix(ios): correct horizontal scroll direction and improve swipe implementation

Fixed three issues with iOS horizontal scrolling:

1. **Corrected scroll direction semantics**
   - scrollLeft now swipes right (brings left content into view)
   - scrollRight now swipes left (brings right content into view)
   - This aligns with Android and Web scroll behavior where the
     direction indicates which content enters the viewport

2. **Improved swipe implementation**
   - Implemented W3C Actions API for better scroll support
   - Falls back to dragfromtoforduration if Actions API fails
   - Increased scroll distance from width/3 to width*0.7 (70%)
     to prevent bounce-back

3. **Fixed scrollUntilBoundary directions**
   - Corrected left/right swipe directions in boundary detection
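
The direction semantics and the 70% distance can be sketched as a pure function, followed by a pointer sequence in the W3C WebDriver Actions format. Centering the swipe on the screen, the rounding, and the names `swipeForHorizontalScroll` and `toPointerActions` are assumptions made for illustration.

```typescript
type ScrollDirection = "left" | "right";

interface Point {
  x: number;
  y: number;
}

function swipeForHorizontalScroll(
  direction: ScrollDirection,
  width: number,
  height: number,
): { start: Point; end: Point } {
  const distance = width * 0.7; // 70% of the width, per the commit above
  const y = Math.round(height / 2);
  const centerX = width / 2;
  // scrollLeft: the finger moves right, bringing left content into view.
  const sign = direction === "left" ? 1 : -1;
  return {
    start: { x: Math.round(centerX - (sign * distance) / 2), y },
    end: { x: Math.round(centerX + (sign * distance) / 2), y },
  };
}

// Builds a touch pointer sequence: move, press, drag, release.
function toPointerActions(start: Point, end: Point, durationMs: number) {
  return [
    {
      type: "pointer",
      id: "finger1",
      parameters: { pointerType: "touch" },
      actions: [
        { type: "pointerMove", duration: 0, x: start.x, y: start.y },
        { type: "pointerDown", button: 0 },
        { type: "pointerMove", duration: durationMs, x: end.x, y: end.y },
        { type: "pointerUp", button: 0 },
      ],
    },
  ];
}

const scrollLeftSwipe = swipeForHorizontalScroll("left", 400, 800);
const scrollLeftActions = toPointerActions(scrollLeftSwipe.start, scrollLeftSwipe.end, 300);
```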

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* refactor(ios): remove fallback from swipe method, use W3C Actions API only

---------

Co-authored-by: Claude <noreply@anthropic.com>

* feat(android-playground): enable alwaysFetchScreenInfo for AndroidDevice (#1363)

* fix(docs): add alwaysFetchScreenInfo parameter to AndroidDevice constructor documentation

* feat(android-playground): enable alwaysFetchScreenInfo for AndroidDevice

Configure AndroidDevice instance with alwaysFetchScreenInfo option
set to true to ensure screen information is always fetched during
device operations.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* fix(android): rename alwaysFetchScreenInfo to alwaysRefreshScreenInfo for consistency

---------

Co-authored-by: Claude <noreply@anthropic.com>

* fix(core): handle ZodEffects and ZodUnion in schema parsing (#1359)

* fix(core): handle ZodEffects and ZodUnion in schema parsing

- Add support for ZodEffects (transformations) in getTypeName and getDescription
- Add support for ZodUnion types with proper type display (type1 | type2)
- Fixes "failed to parse Zod type" warning on first execution with caching
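
A simplified sketch of the unwrapping logic, using a stand-in node type rather than Zod itself so it runs without the library. The node shapes and this `getTypeName` are illustrative; real Zod schema objects carry more fields and the actual function handles more types.

```typescript
// Simplified stand-ins for schema nodes.
type SchemaNode =
  | { typeName: "ZodString" }
  | { typeName: "ZodNumber" }
  | { typeName: "ZodEffects"; schema: SchemaNode } // a transform or refine wrapper
  | { typeName: "ZodUnion"; options: SchemaNode[] };

function getTypeName(node: SchemaNode): string {
  switch (node.typeName) {
    case "ZodString":
      return "string";
    case "ZodNumber":
      return "number";
    case "ZodEffects":
      // Unwrap the transformation and describe the inner type.
      return getTypeName(node.schema);
    case "ZodUnion":
      // Display as type1 | type2.
      return node.options.map(getTypeName).join(" | ");
  }
}

// Shape of a "string or number, transformed to string" schema.
const valueSchema: SchemaNode = {
  typeName: "ZodEffects",
  schema: {
    typeName: "ZodUnion",
    options: [{ typeName: "ZodString" }, { typeName: "ZodNumber" }],
  },
};
const valueTypeName = getTypeName(valueSchema);
```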

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* test(core): add tests for descriptionForAction with ZodEffects and ZodUnion

* chore(core): update test cases

---------

Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: yutao <yutao.tao@bytedance.com>

* feat(playground): implement task cancellation for Android/iOS playgrounds (#1355)

* feat(playground): implement task cancellation for Android/iOS playgrounds

This PR implements task cancellation functionality for Android and iOS
playgrounds using a singleton + recreation pattern.

Problem: when users clicked the "Stop" button in the Android/iOS playground, the task
continued to execute and control the device via ADB commands. This was
because:
- Agent instances were global singletons created at server startup
- The /cancel endpoint only deleted progress tips without stopping execution
- There was no mechanism to interrupt ongoing tasks

Solution: implemented a singleton + recreation pattern:
- PlaygroundServer now accepts factory functions instead of instances
- Added task locking mechanism (currentTaskId) to prevent concurrent tasks
- When cancel is triggered, the agent is destroyed and recreated
- Device operations stop immediately as destroyed agents reject new commands

Changes:

1. **PlaygroundServer** (packages/playground/src/server.ts)
   - Added factory function support for page and agent creation
   - Added `recreateAgent()` method to destroy and recreate agent
   - Added `currentTaskId` to track running tasks
   - Enhanced `/execute` endpoint with task conflict detection
   - Enhanced `/cancel` endpoint to recreate agent on cancellation
   - Backward compatible with existing instance-based usage

2. **Android Playground** (packages/android-playground/src/bin.ts)
   - Updated to use factory pattern for server creation
   - Each recreation creates fresh AndroidDevice and AndroidAgent instances

3. **iOS Playground** (packages/ios/src/bin.ts)
   - Updated to use factory pattern for server creation
   - Each recreation creates fresh IOSDevice and IOSAgent instances

Testing:
- Added test script `test-cancel-android.sh` for automated testing
- Manual testing confirmed device operations stop when cancel is triggered

```
User clicks Stop
  ↓
Frontend calls /cancel/:requestId
  ↓
Server checks whether this is the currently running task
  ↓
Call recreateAgent()
  ├─ Destroy old agent (agent.destroy())
  ├─ Destroy old device (device.destroy())
  ├─ Create new device (pageFactory())
  └─ Create new agent (agentFactory(device))
  ↓
Clear task lock and progress tips
  ↓
Device stops operations ✅
```

Benefits:
- ✅ Simple implementation (minimal code changes)
- ✅ Effective cancellation (destroy() immediately sets destroyed flag)
- ✅ Backward compatible (still accepts instances)
- ✅ Natural serialization (one task at a time per device)
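
A compact sketch of the factory, task lock, and recreate-on-cancel behavior. `PlaygroundServerSketch` and `AgentLike` are hypothetical, the methods are synchronous for brevity (real destroy and create calls are presumably asynchronous), and device handling is folded into the agent.

```typescript
interface AgentLike {
  destroyed: boolean;
  destroy(): void;
}

class PlaygroundServerSketch {
  private agentFactory: () => AgentLike;
  private agent: AgentLike;
  private currentTaskId: string | null = null;

  constructor(agentFactory: () => AgentLike) {
    this.agentFactory = agentFactory;
    this.agent = agentFactory();
  }

  // Task lock: a second task is rejected while one is running.
  execute(requestId: string): AgentLike {
    if (this.currentTaskId !== null) {
      throw new Error(`Task ${this.currentTaskId} is still running`);
    }
    this.currentTaskId = requestId;
    return this.agent;
  }

  // Only the running task can be cancelled; cancelling recreates the agent.
  cancel(requestId: string): boolean {
    if (this.currentTaskId !== requestId) return false;
    this.agent.destroy();
    this.agent = this.agentFactory();
    this.currentTaskId = null;
    return true;
  }
}

// Factory producing a fresh agent each time it is called.
function makeAgent(): AgentLike {
  const created: AgentLike = {
    destroyed: false,
    destroy() {
      created.destroyed = true;
    },
  };
  return created;
}

const sketchServer = new PlaygroundServerSketch(makeAgent);
const runningAgent = sketchServer.execute("task-1");
const cancelled = sketchServer.cancel("task-1");
```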

Usage:
```bash
pnpm run android:playground

./test-cancel-android.sh
```

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* fix(page): ensure keyboard actions return promises for better async handling

* refactor(playground): update PlaygroundServer to use agent factories and simplify server creation

* fix(ios): round coordinates for tap and swipe actions to improve accuracy

* fix(android): round coordinates in scrolling and gesture methods for improved accuracy

* refactor(playground): simplify PlaygroundServer instantiation and improve code readability

---------

Co-authored-by: Claude <noreply@anthropic.com>

* fix(yaml): skip environment variable interpolation in YAML comments (#1361)

* Initial plan

* fix(yaml): skip environment variable interpolation in YAML comments

* style(yaml): apply biome linting fixes

Co-authored-by: quanru <11739753+quanru@users.noreply.github.com>
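
A minimal sketch of comment-aware interpolation, assuming `${NAME}` placeholders and handling only full-line comments (inline comments after a value are not detected). `interpolateEnvVars` and the error text are hypothetical.

```typescript
function interpolateEnvVars(
  yaml: string,
  env: Record<string, string | undefined>,
): string {
  return yaml
    .split("\n")
    .map((line) => {
      // Leave full-line comments untouched so unset variables in them do not fail.
      if (line.trim().startsWith("#")) return line;
      return line.replace(/\$\{([A-Za-z_][A-Za-z0-9_]*)\}/g, (_match, varName: string) => {
        const value = env[varName];
        if (value === undefined) {
          throw new Error(`Environment variable "${varName}" is not defined`);
        }
        return value;
      });
    })
    .join("\n");
}

const yamlSource = "# url: ${OLD_URL}\nurl: ${BASE_URL}";
const interpolated = interpolateEnvVars(yamlSource, { BASE_URL: "https://example.com" });
```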

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: quanru <11739753+quanru@users.noreply.github.com>

* fix(core): handle null data in WaitFor and support array keyName in KeyboardPress (#1354)

* fix(core): handle null data in WaitFor and support array keyName in KeyboardPress

This commit fixes two critical bugs:

1. **Fix null data handling in task execution**
   - Fixed TypeError when AI extract() returns null for WaitFor operations
   - Added null/undefined check before accessing data properties
   - WaitFor operations now return false when data is null (condition not met)
   - Other operations (Assert, Query, String, Number) return null when data is null
   - Location: src/agent/tasks.ts:936-938

2. **Add array support for keyName in KeyboardPress**
   - Updated actionKeyboardPressParamSchema to accept string | string[]
   - Allows key combinations like ['Control', 'A'] for keyboard shortcuts
   - Maintains backward compatibility with string format
   - Updated type definitions in aiKeyboardPress method
   - Locations:
     - src/device/index.ts:197-199
     - src/agent/agent.ts:575-622

**Test Coverage:**
- Added comprehensive unit tests for null data handling (8 test cases)
- Added unit tests for keyName array validation (7 test cases)
- All tests verify edge cases and expected behavior

Fixes issue where executor crashed with:
"TypeError: Cannot read properties of null (reading 'StatementIsTruthy')"

And fixes parameter validation error:
"Invalid parameters for action KeyboardPress: Expected string, received array"

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* fix(ios,android): handle array keyName in KeyboardPress action

- Updated iOS and Android device implementations to handle keyName as string | string[]
- For mobile devices, array keys are joined with '+' (e.g., ['Control', 'A'] becomes 'Control+A')
- This fixes TypeScript compilation errors in iOS and Android packages
- Maintains backward compatibility with string format

Related to the KeyboardPress array support added in the previous commit.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* refactor(ios,android): improve KeyboardPress array handling

- Remove incorrect join('+') approach that doesn't work on mobile devices
- Use last key from array instead (e.g., ['Control', 'A'] → 'A')
- Add clear warning messages when array input is used on mobile platforms
- Mobile devices don't support keyboard combinations, so this is a graceful degradation

This makes the behavior more predictable and provides better feedback to developers.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* test(core): fix TaskExecutor constructor arguments in null data tests

- Fixed TaskExecutor constructor call to match actual signature
- Constructor requires (interface, insight, options) instead of (insight, interface)
- All 8 tests now passing

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* fix(ios,android): improve logging for unsupported key combinations in device input

* fix(core): handle null data in WaitFor and improve keyName parameter description

This commit fixes the null data handling bug and improves the KeyboardPress parameter description.

## Changes:

### 1. Fix null data handling in task execution
- Fixed TypeError when AI extract() returns null for WaitFor operations
- Added null/undefined check before accessing data properties (tasks.ts:936-938)
- WaitFor operations now return false when data is null (condition not met)
- Other operations (Assert, Query, String, Number) return null when data is null
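
The null guard could be sketched as follows. `processExtractOutput` is a hypothetical function; the `StatementIsTruthy` key is taken from the error message quoted in this commit, and the non-null branch is an assumption about how the result is read.

```typescript
type ExtractType = "WaitFor" | "Assert" | "Query" | "String" | "Number";

function processExtractOutput(
  type: ExtractType,
  data: Record<string, unknown> | null | undefined,
): unknown {
  if (data === null || data === undefined) {
    // WaitFor treats missing data as "condition not met"; other types report null.
    return type === "WaitFor" ? false : null;
  }
  // Assumed shape: boolean-style operations read the truthiness field.
  if (type === "WaitFor" || type === "Assert") return data["StatementIsTruthy"];
  return data;
}

const waitForOnNull = processExtractOutput("WaitFor", null);
const queryOnUndefined = processExtractOutput("Query", undefined);
```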

### 2. Improve KeyboardPress parameter description
- Reverted keyName to only accept string type (not array)
- Added clear description: "Use '+' for key combinations, e.g., 'Control+A', 'Shift+Enter'"
- This provides better guidance to AI for generating key combinations
- Simplified iOS/Android implementations (no special array handling needed)
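
For illustration, a consumer of the string format might split a combination like this. `parseKeyCombination` is a hypothetical helper, and it does not handle a literal "+" key.

```typescript
// Splits 'Control+A' into its individual key names.
function parseKeyCombination(keyName: string): string[] {
  return keyName
    .split("+")
    .map((key) => key.trim())
    .filter((key) => key.length > 0);
}

const selectAllKeys = parseKeyCombination("Control+A");
const singleKey = parseKeyCombination("Enter");
```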

### 3. Test coverage
- Added 8 unit tests for null data handling
- Updated KeyboardPress tests to validate string-only format
- Added test for key combination strings (e.g., 'Control+A')
- Added test to verify arrays are rejected
- Fixed unused variable warning in test file

## Fixed Issues:

**Issue 1:** Executor crashes with null data
```
TypeError: Cannot read properties of null (reading 'StatementIsTruthy')
```

**Issue 2:** Unclear how to specify key combinations
- Now clearly documented in parameter description with examples

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* docs(core): align KeyboardPress action description with parameter schema

Updated the KeyboardPress action description to explicitly mention
support for key combinations (e.g., "Control+A", "Shift+Enter"),
making it consistent with the keyName parameter description that
already documented this functionality.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* fix(core): handle null and undefined data in WaitFor output processing

---------

Co-authored-by: Claude <noreply@anthropic.com>

* perf(android): optimize clearInput performance by batching keyevents (#1366)

* perf(android): optimize clearInput performance by batching keyevents

Replace serial keyevent(67) calls with clearTextField() method from
appium-adb library, which batches all keyevents into a single shell command.

Performance improvement:
- Before: ~50 seconds (100 sequential shell calls, ~500ms each)
- After: ~1-2 seconds (single batched shell command)
- Speedup: 25-50x

Changes:
- Use adb.clearTextField(100) instead of repeat(() => adb.keyevent(67))
- Add clearTextField mock to unit tests for compatibility
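
The batching idea can be sketched as follows. This is an illustration of why one shell call replaces many, not the appium-adb implementation of `clearTextField`; the helper names are hypothetical, and the sketch assumes the device's `input keyevent` command accepts several keycodes in one invocation.

```typescript
const KEYCODE_DEL = 67; // Android backspace keycode, as used in the commit above

// Serial approach: one shell invocation per key press.
function serialCommands(count: number): string[][] {
  return Array.from({ length: count }, () => [
    "input",
    "keyevent",
    String(KEYCODE_DEL),
  ]);
}

// Batched approach: a single invocation carrying every key press.
function batchedCommand(count: number): string[] {
  const keys = Array.from({ length: count }, () => String(KEYCODE_DEL));
  return ["input", "keyevent", ...keys];
}
```

With 100 deletions the serial form pays the per-call shell overhead 100 times, while the batched form pays it once.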

All 75 unit tests passing, build successful.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* fix(android): include device pixel ratio in size calculation for AndroidDevice

---------

Co-authored-by: Claude <noreply@anthropic.com>

* release: v0.30.6

* fix(tests): enhance null data handling tests by adding uiContext parameter

---------

Co-authored-by: yuyutaotao <167746126+yuyutaotao@users.noreply.github.com>
Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: yutao <yutao.tao@bytedance.com>
Co-authored-by: Copilot <198982749+Copilot@users.noreply.github.com>
Co-authored-by: quanru <11739753+quanru@users.noreply.github.com>
…ication (#1365)

* feat(bridge-mode): add remote access support for cross-machine communication

This commit implements remote access capability for Bridge Mode,
enabling communication between server and client on different machines.

## Changes

### Core Features
- Server side: Added `allowRemoteAccess` option to bind server to 0.0.0.0
- Server side: Added `host` and `port` options for custom configuration
- Client side: Added server URL configuration UI in Chrome extension
- Configuration priority: host > allowRemoteAccess > default (127.0.0.1)
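
The priority rule above can be sketched like this. The function name `resolveBridgeHost` and the options interface are hypothetical; the real `getBridgeServerHost()` helper in common.ts may have a different signature.

```typescript
// Hypothetical options shape built from the three options named above.
interface BridgeServerOptions {
  host?: string;
  port?: number;
  allowRemoteAccess?: boolean;
}

function resolveBridgeHost(opts: BridgeServerOptions = {}): string {
  // 1. An explicit host always wins.
  if (opts.host) return opts.host;
  // 2. Opting in to remote access binds to all interfaces.
  if (opts.allowRemoteAccess) return "0.0.0.0";
  // 3. Default: loopback only.
  return "127.0.0.1";
}
```

Keeping the loopback address as the final fallback means existing callers that pass no options stay local-only.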

### Modified Files
- packages/web-integration/src/bridge-mode/:
  - common.ts: Added getBridgeServerHost() helper function
  - io-server.ts: Modified to support custom host binding
  - agent-cli-side.ts: Added remote access options to constructor
  - page-browser-side.ts: Added server endpoint parameter support

- apps/chrome-extension/src/:
  - extension/bridge/index.tsx: Added server URL configuration UI
  - extension/bridge/index.less: Added styles for configuration section
  - utils/bridgeConnector.ts: Support custom server endpoint

- packages/web-integration/tests/:
  - ai/bridge/remote-access.test.ts: Added comprehensive tests
  - unit-test/bridge/io.test.ts: Updated tests for new API

### Documentation
- Updated docs in apps/site/docs/{en,zh}/bridge-mode-by-chrome-extension.mdx
- Added remote access configuration section with examples
- Added security warnings for remote access usage

## API Changes

New constructor options:
- allowRemoteAccess: Enable remote access
- host: Custom host (optional)
- port: Custom port (optional)

## Backward Compatibility
- All existing code works without modification
- Default behavior unchanged (localhost only)
- All unit tests passing

## Security
- Default remains secure (127.0.0.1 only)
- Remote access requires explicit opt-in
- Documentation includes security warnings

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* fix(bridge): resolve race condition in server initialization

Fix the 'xhr poll error' by ensuring all Socket.IO middleware and event
handlers are set up BEFORE calling httpServer.listen(). This eliminates
the race condition where clients could attempt to connect before the
server was fully ready.

Changes:
- Moved Socket.IO middleware setup before httpServer.listen()
- Moved Socket.IO connection handlers before httpServer.listen()
- Moved httpServer.listen() to the end of initialization sequence

Fixes failing unit tests in
packages/web-integration/tests/unit-test/bridge/io.test.ts
(all 15 tests now passing)

* fix(web-integration): add delay to ensure Socket.IO is fully ready in server initialization

* fix(bridge-server): improve HTTP server setup and event handling order

* fix(bridge): improve server URL handling and localStorage management

* feat(bridge): enhance server configuration UI with expandable section and improved styling

* Update packages/web-integration/tests/ai/bridge/remote-access.test.ts

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update packages/web-integration/tests/ai/bridge/remote-access.test.ts

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update packages/web-integration/tests/ai/bridge/remote-access.test.ts

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update packages/web-integration/tests/ai/bridge/remote-access.test.ts

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

---------

Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
#1377)

## Problem
The previous nano-staged configuration had two issues:
1. Used `biome check .` which checked the entire project instead of only staged files
2. nano-staged doesn't automatically re-stage fixed files, causing commits to fail

## Solution
Switched to lint-staged, which:
- Automatically passes only staged files to biome
- Re-stages files after fixes are applied
- Is more mature and widely adopted

## Changes
- Replaced nano-staged with lint-staged in pre-commit hook
- Updated biome command to remove project-wide checks
- Added lint-staged as dev dependency

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Claude <noreply@anthropic.com>
* feat(yaml): support all device options in YAML configuration

This PR enables YAML scripts to use all Android and iOS device options
by centralizing device option types and ensuring runtime configuration
propagation.

Changes:
- Created packages/core/src/device/device-options.ts to centralize all
  device option type definitions (AndroidDeviceOpt, IOSDeviceOpt)
- Updated MidsceneYamlScriptAndroidEnv and MidsceneYamlScriptIOSEnv to
  extend device options using Omit<> to exclude programmatic fields
- Fixed runtime configuration passing in create-yaml-player.ts to
  forward all YAML config options to device constructors
- Simplified agent creation functions to pass entire options object
  instead of manually listing each parameter
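
A rough sketch of the typing approach, under stated assumptions: the interfaces below are trimmed, hypothetical stand-ins for the real definitions in device-options.ts, `customActions` is assumed to be the programmatic field that gets omitted, and `toDeviceOptions` is an illustrative name.

```typescript
// Trimmed, hypothetical stand-in for the real AndroidDeviceOpt type.
interface AndroidDeviceOpt {
  androidAdbPath?: string;
  remoteAdbHost?: string;
  remoteAdbPort?: number;
  autoDismissKeyboard?: boolean;
  // Programmatic-only field: functions cannot be expressed in YAML.
  customActions?: Array<() => void>;
}

// The YAML-facing type reuses every option except the programmatic one.
type YamlAndroidEnv = Omit<AndroidDeviceOpt, "customActions">;

// Forward the whole object instead of listing each option by hand,
// so options added to AndroidDeviceOpt later are picked up automatically.
function toDeviceOptions(env: YamlAndroidEnv): AndroidDeviceOpt {
  return { ...env };
}
```

Deriving the YAML type with `Omit<>` keeps a single source of truth for option names and types.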

YAML scripts can now configure:

Android:
- androidAdbPath, remoteAdbHost, remoteAdbPort
- imeStrategy, displayId, usePhysicalDisplayIdForScreenshot
- screenshotResizeScale, alwaysFetchScreenInfo
- autoDismissKeyboard, keyboardDismissStrategy

iOS:
- deviceId, useWDA, wdaPort, wdaHost
- autoDismissKeyboard

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* test(yaml): add unit tests for device options propagation

Add comprehensive unit tests to verify that all device options are
correctly passed from YAML configuration to device constructors.

Tests include:
- Android device options propagation from YAML to agentFromAdbDevice
- iOS device options propagation from YAML to agentFromWebDriverAgent
- Type definitions for AndroidDeviceOpt and IOSDeviceOpt
- YAML environment types (MidsceneYamlScriptAndroidEnv, MidsceneYamlScriptIOSEnv)
- Validation that customActions is excluded from YAML types
- IME strategy and keyboard dismiss strategy type validations
- Minimal and full configuration scenarios

All 31 tests passing (17 in CLI, 14 in Core).

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* fix(android): ensure empty object is passed when opts is undefined

Fix failing unit tests by ensuring an empty object is passed to
AndroidDevice and IOSDevice constructors when opts is undefined,
maintaining backward compatibility with existing tests.

Changes:
- Updated agentFromAdbDevice to pass opts || {} to AndroidDevice
- Updated agentFromWebDriverAgent to pass opts || {} to IOSDevice

This ensures the constructors always receive an object instead of
undefined, which is what the existing tests expect.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* fix(device-options): rename alwaysFetchScreenInfo to alwaysRefreshScreenInfo for clarity

* docs(site): update Android and iOS sections to include all configuration options from their respective constructors

---------

Co-authored-by: Claude <noreply@anthropic.com>
Update the task type display names in report sidebar and detail views:
- Change "Insight / Query" and "Insight / Assert" to "Insight"
- Change "Action / {subType}" to "Action Space / {subType}"
- Show "Planning / Plan" instead of just "Planning"
- Keep other task types unchanged (e.g., "Planning / Locate")

This provides clearer and more consistent naming for different task
types in the report UI, making it easier to understand the task
hierarchy and categorization.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Claude <noreply@anthropic.com>
…1381)

This change improves code consistency by using clonedYamlScript.agent
instead of mixing yamlScript.agent and clonedYamlScript for other
properties throughout the agent initialization code.

Changes:
- Use clonedYamlScript.agent consistently across all agent types
  (puppeteer, bridge mode, Android, iOS, and interface)
- This ensures all configuration comes from the same cloned instance,
  preventing potential mutation issues when the same YAML file is
  executed multiple times
- Added comprehensive unit tests to verify aiActionContext is properly
  passed to Android, iOS, and bridge mode agents

This is a code quality improvement that makes the codebase more
maintainable and aligns with the original design intent of using
structuredClone to isolate each ScriptPlayer instance.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Claude <noreply@anthropic.com>
…1375)

* refactor(env): modernize model configuration environment variables

This PR refactors the model configuration system with improved naming conventions
and better type safety while maintaining backward compatibility.

Key Changes:

1. Environment Variable Naming Convention Updates:
   - Renamed OPENAI_* → MODEL_* for public API variables
     * OPENAI_API_KEY → MODEL_API_KEY (old name deprecated, still accepted)
     * OPENAI_BASE_URL → MODEL_BASE_URL (old name deprecated, still accepted)
   - Renamed MIDSCENE_*_VL_MODE → MIDSCENE_*_LOCATOR_MODE across all intents
     * MIDSCENE_VL_MODE → MIDSCENE_LOCATOR_MODE
     * MIDSCENE_VQA_VL_MODE → MIDSCENE_VQA_LOCATOR_MODE
     * MIDSCENE_PLANNING_VL_MODE → MIDSCENE_PLANNING_LOCATOR_MODE
     * MIDSCENE_GROUNDING_VL_MODE → MIDSCENE_GROUNDING_LOCATOR_MODE
   - Updated all internal MIDSCENE_*_OPENAI_* → MIDSCENE_*_MODEL_*
     * MIDSCENE_VQA_OPENAI_API_KEY → MIDSCENE_VQA_MODEL_API_KEY
     * MIDSCENE_PLANNING_OPENAI_API_KEY → MIDSCENE_PLANNING_MODEL_API_KEY
     * MIDSCENE_GROUNDING_OPENAI_API_KEY → MIDSCENE_GROUNDING_MODEL_API_KEY
     * (and corresponding BASE_URL variables)

2. Type System Improvements:
   - Split TModelConfigFn into public and internal types
   - Public API (TModelConfigFn) no longer exposes 'intent' parameter
   - Internal type (TModelConfigFnInternal) maintains intent parameter
   - Users can still optionally use intent parameter via type casting

3. Backward Compatibility:
   - Maintained compatibility for documented public variables (OPENAI_API_KEY, OPENAI_BASE_URL)
   - New variables take precedence, fallback to legacy names if not set
   - Only publicly documented variables are deprecated; internal variables are renamed directly

4. Updated Files:
   - packages/shared/src/env/types.ts - Type definitions and constants
   - packages/shared/src/env/constants.ts - Config key mappings
   - packages/shared/src/env/decide-model-config.ts - Compatibility logic
   - packages/shared/src/env/model-config-manager.ts - Type casting implementation
   - packages/shared/src/env/init-debug.ts - Debug variable updates
   - All test files updated to use new variable names

Testing:
- All 24 model-config-manager tests passing
- Overall test suite: 241 tests passing

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* Update packages/shared/src/env/constants.ts

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* test(env): add comprehensive backward compatibility tests for OPENAI_* variables

- Added test suite to verify MODEL_API_KEY/MODEL_BASE_URL take precedence
- Added test to ensure OPENAI_API_KEY/OPENAI_BASE_URL still work as fallback
- Fixed compatibility logic to prioritize new variables over legacy ones
- All 13 tests passing, including 5 new backward compatibility tests

Test coverage:
✓ Using only legacy variables (OPENAI_API_KEY)
✓ Using only new variables (MODEL_API_KEY)
✓ Mixing new and legacy variables (new takes precedence)
✓ Individual precedence for API_KEY and BASE_URL
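
The precedence rule these tests cover can be sketched as below. The helper name `readWithLegacyFallback` is hypothetical and this is not the code in decide-model-config.ts; treating an empty string as "not set" is also an assumption of the sketch.

```typescript
type Env = Record<string, string | undefined>;

// Prefer the new variable; fall back to the legacy one only when the
// new one is not set (an empty string counts as "not set" here).
function readWithLegacyFallback(
  env: Env,
  newKey: string,
  legacyKey: string,
): string | undefined {
  const current = env[newKey];
  if (current !== undefined && current !== "") return current;
  return env[legacyKey];
}
```

Because each variable is resolved on its own, a setup may mix a new `MODEL_API_KEY` with a legacy `OPENAI_BASE_URL`.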

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* fix(test): reset MIDSCENE_CACHE in beforeEach to avoid .env interference

The test 'should return the correct value from override' was failing because
.env file sets MIDSCENE_CACHE=1. This was polluting the test environment and
causing the test to expect false but receive true.

Fixed by explicitly resetting MIDSCENE_CACHE to empty string in beforeEach.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* docs(site): update environment variable names and add advanced configuration examples for agents

---------

Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
* refactor(core): remove tree info in uiContext

* chore(core): fix lint

* chore(core): remove dom-based locator

* fix(core): test cases

* chore(core): fix lint

* fix(core): test cases
* feat(core): update signature of wrap-openai

* docs(site): update createOpenAIClient API documentation

Update the documentation for createOpenAIClient to reflect the new signature:
- Changed from factory function to wrapper function
- Now receives base OpenAI instance and options
- Returns Promise<OpenAI | undefined>
- Updated examples to show async wrapper pattern
- Removed unnecessary OpenAI import from examples

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

---------

Co-authored-by: quanruzhuoxiu <quanruzhuoxiu@gmail.com>
Co-authored-by: Claude <noreply@anthropic.com>
…nt variables (#1388)

Add backward compatibility support for legacy MIDSCENE_OPENAI_* environment variables:
- MIDSCENE_OPENAI_INIT_CONFIG_JSON (now MIDSCENE_MODEL_INIT_CONFIG_JSON)
- MIDSCENE_OPENAI_HTTP_PROXY (now MIDSCENE_MODEL_HTTP_PROXY)
- MIDSCENE_OPENAI_SOCKS_PROXY (now MIDSCENE_MODEL_SOCKS_PROXY)

Changes:
- Add deprecated constants to types.ts with @deprecated tags
- Add legacy variables to MODEL_ENV_KEYS for overrideAIConfig support
- Update DEFAULT_MODEL_CONFIG_KEYS_LEGACY to use legacy variable names
- Implement priority fallback logic in decide-model-config.ts (new variables take precedence)
- Update documentation (zh/en model-provider.mdx) with deprecation notices

All 139 tests pass, confirming backward compatibility works correctly.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Claude <noreply@anthropic.com>
)

* feat(android): add screenshot polling fallback for remote devices

Implement automatic fallback to screenshot polling mode when connecting to remote Android devices (IP:Port format), since scrcpy cannot connect to remote adb devices.

Changes:
- Refactor ScreenshotViewer to shared component in @midscene/visualizer with function-based props
- Add /api/screenshot endpoint in ScrcpyServer using adb screencap
- Add device type detection to distinguish local vs remote devices
- Conditionally render ScrcpyPlayer (real-time) for local devices or ScreenshotViewer (polling) for remote devices
- Update playground app to use new shared ScreenshotViewer component

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* fix(visualizer): import ExecutionTaskInsightLocate from types module

Fix TypeScript build error by importing ExecutionTaskInsightLocate directly from @midscene/core/types instead of the main export.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* fix(visualizer): define local ExecutionTaskInsightLocate interface

Define ExecutionTaskInsightLocate as a local interface instead of importing from @midscene/core to resolve TypeScript build errors. This type is not properly exported from the core package's type declarations.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* refactor(android): use PlaygroundServer screenshot API instead of duplicating in ScrcpyServer

Remove duplicate screenshot implementation from ScrcpyServer and use the existing PlaygroundServer /screenshot endpoint which already calls AndroidDevice.screenshotBase64(). This eliminates code duplication and leverages the existing infrastructure.

Changes:
- Remove /api/screenshot endpoint from ScrcpyServer
- Update App.tsx to call PlaygroundServer's /screenshot endpoint (port 9412)
- Also use PlaygroundServer's /interface-info endpoint for consistency

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

---------

Co-authored-by: Claude <noreply@anthropic.com>
…nents (#1392)

This change consolidates all PlaygroundSDK creation logic for report
components into a single shared utility module.

Changes:
- Created `apps/report/src/utils/report-playground-utils.ts` with
  `getReportPlaygroundSDK(serviceMode, agent?)` function
- Removed duplicate `getPlaygroundSDK` implementations from
  playground.tsx and playground/index.tsx
- Updated open-in-playground/index.tsx to use the shared function
- Removed unnecessary `createReportPlaygroundSDK` wrapper function
- All report components now use `PLAYGROUND_SERVER_PORT` constant
  from shared package

Benefits:
- Single source of truth for PlaygroundSDK creation in report
  components
- Static report files always connect to localhost:5800
- Reduced code duplication and improved maintainability

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Claude <noreply@anthropic.com>
* refactor(core): rename Insight class to Service

This is a comprehensive refactoring that renames the Insight class
and all related types to Service for better semantic clarity.

Changes:
- Renamed directories: insight/ -> service/
- Renamed test files: insight.test.ts -> service.test.ts
- Updated 50+ type definitions
- Modified 18+ source files
- Synchronized all test files
- Updated external package dependencies

Core updates:
- Class: Insight -> Service
- Interface: InsightOptions -> ServiceOptions
- All InsightX types -> ServiceX types
- String literal 'Insight' -> 'Service'

Affected files:
- src/index.ts, src/yaml.ts, src/task-runner.ts
- src/agent/*.ts (agent, tasks, task-builder, ui-utils)
- tests/utils.ts and all test files
- External: chrome-extension, evaluation, report

Verification:
- TypeScript: 0 errors
- Lint: 530 files passed
- Build: successful (341.1 kB)

🤖 Generated with Claude Code

Co-Authored-By: Claude <noreply@anthropic.com>

* fix(visualizer): update Insight references to Service

- Updated ExecutionTaskInsightLocate to ExecutionTaskServiceLocate
- Changed task.type check from 'Insight' to 'Service'
- Renamed insightTask variable to serviceTask for consistency

* fix(report): update Insight references to Service

- Updated ExecutionTaskInsightLocate to ExecutionTaskServiceLocate
  in sidebar, detail-side, and detail-panel components
- Changed task.type checks from 'Insight' to 'Service'
- Updated ExecutionTaskInsightAssertion to
  ExecutionTaskServiceAssertion
- Ensures report UI displays Service tasks correctly

* chore(tests): update comments from Insight to Service

* fix(tests): change task type from 'Insight' to 'Service' in tests

- Updated aiaction-cacheable.test.ts
- Updated page-task-executor-waitFor.test.ts
- Completes the Insight to Service refactoring

* fix(tests): update test expectations from 'Insight' to 'Service'

- Updated task-builder.test.ts expectations
- Updated page-task-executor-rightclick.test.ts expectations
- Fixes CI test failures

* refactor(core): use 'Insight' for ExecutionTask types

Keep Service class name but restore ExecutionTask type to 'Insight'
for consistency with UI display requirements.

Changes:
- ExecutionTaskType: 'Service' → 'Insight'
- All ExecutionTaskService* types → ExecutionTaskInsight*
- Runtime checks: task.type === 'Service' → task.type === 'Insight'
- ui-utils.ts: Removed special handling for Query/Assert subtypes
  to display "Insight / Query" and "Insight / Assert" correctly

Type display now follows the expected pattern:
- Planning / Plan
- Planning / Locate
- Action Space / {interface}
- Insight / Query
- Insight / Assert
- Insight / Locate

Files modified:
- packages/core/src/types.ts
- packages/core/src/agent/*.ts
- packages/core/src/task-runner.ts
- packages/visualizer/src/utils/replay-scripts.ts
- apps/report/src/components/**/*.tsx
- All test files

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

---------

Co-authored-by: Claude <noreply@anthropic.com>
Fixed ambiguous descriptions about sequential vs parallel execution:
- Updated --files parameter description to clearly state that files
  execute sequentially by default (when --concurrent=1) and can run
  concurrently with --concurrent parameter
- Removed misleading "run in parallel" text from example that doesn't
  use --concurrent parameter

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Claude <noreply@anthropic.com>
Add explicit error throwing for failed Assert tasks with detailed
assertion failure messages including the AI's thought process.

This change brings the 1.0 branch in line with the main branch
commit 4761a6c, ensuring that Assert tasks fail explicitly when
the AI cannot verify the condition, rather than silently returning
null values.

Changes:
- Add error throwing for failed Assert tasks in tasks.ts
- Update test to expect error instead of null output
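
A minimal sketch of the behaviour change, assuming a hypothetical result shape with `pass` and `thought` fields; the real structures and message format in tasks.ts may differ.

```typescript
// Hypothetical result shape for an Assert task.
interface AssertResult {
  pass: boolean;
  thought?: string;
}

function ensureAssertPassed(assertion: string, result: AssertResult): void {
  if (result.pass) return;
  // Fail loudly and surface the model's reasoning instead of returning null.
  const reason = result.thought ?? "no reason provided";
  throw new Error(`Assertion failed: ${assertion}\nReason: ${reason}`);
}
```

Throwing here means a failed assertion stops the run at the point of failure rather than producing a silent null that later steps must interpret.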

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Claude <noreply@anthropic.com>
yuyutaotao and others added 26 commits December 10, 2025 18:59
* docs(readme): update automation sections and model strategy copy

* docs(core): update quick start

* docs(core): update docs for model config
…tension (#1572)

* feat(visualizer): enhance Playground SDK with execution dump handling and logging

* fix(local-execution): extract first ExecutionDump from GroupedActionDump response

* feat(agent): enhance task handling to prioritize AI-generated output for Planning tasks

* chore(visualizer): remove debug logging from playground components

Remove console.log debug statements added during development:
- Player component prop logging
- PlaygroundResultView rendering state logging
- usePlaygroundExecution dump structure logging

Keep error handling console.error/warn for production debugging.

* refactor(core): improve code quality and style

- Replace 'any' type with 'unknown' for better type safety
- Add comprehensive JSDoc documentation
- Simplify verbose comments throughout codebase
- Refactor empty branch logic to positive condition check
- Remove redundant implementation detail comments

All changes maintain existing functionality while improving
code maintainability and following TypeScript best practices.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

* feat(agent): add progress message handling for task updates and UI display

* feat(agent): enhance progress message handling to track all execution tasks and improve UI updates
test(progress-messages): add integration tests for Planning tasks in progress messages
refactor(playground): update task progress handling to store full progress messages
refactor(visualizer): improve progress message updates to preserve history and avoid duplicates

* refactor(core): remove progress messages from Agent and related components

* feat(visualizer): filter out Planning tasks without output.log in progress items

* refactor(agent): streamline onDumpUpdate method and improve progress message handling

* feat(playground): improve progress item handling by filtering unfinished Planning tasks

* refactor(visualizer): enhance progress message construction and improve filtering logic

* refactor(tests): update dumpDataString to return structured data and remove redundant progress tracking tests

* feat(visualizer): enhance PlaygroundSDKLike interface with optional getServiceMode method
refactor(visualizer): simplify conditional check for service mode in UniversalPlayground component
refactor(visualizer): streamline report file content rendering in Player component
refactor(playground): improve tip message construction in PlaygroundServer class

* fix(package): update build:skip-cache script to include clean command

* fix(local-execution): handle execution errors gracefully and include them in response

* feat(visualizer): enhance error handling and message display in UniversalPlayground

* feat(local-execution): implement listener management for dump updates in LocalExecutionAdapter

* refactor(playground): remove unused storage provider and related imports

* feat(local-execution): add method to retrieve current execution data and reset dump before execution
feat(visualizer): implement color themes for shiny text and update progress description display
fix(usePlaygroundExecution): capture execution data before stopping and update info list accordingly

* refactor(bridge): remove bridgeDB utility functions and related logic

---------

Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
#1580)

Fixes TypeError: Cannot read properties of null (reading 'id')

- Added null/undefined check at the beginning of getSDKId function
- Returns 'playground-default' when SDK is null/undefined
- Aligns with null-safe patterns used elsewhere in the component
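
The guard can be sketched as follows. The `SdkLike` interface and the function name are hypothetical, and falling back to the default when `id` itself is missing is an assumption of this sketch rather than something the commit states.

```typescript
// Hypothetical minimal SDK shape; only the `id` property matters here.
interface SdkLike {
  id?: string;
}

function getSdkIdSafe(sdk: SdkLike | null | undefined): string {
  // Check before touching sdk.id, which is what threw the TypeError.
  if (sdk === null || sdk === undefined) return "playground-default";
  // Assumption: a missing id also maps to the default.
  return sdk.id ?? "playground-default";
}
```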

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
* chore(core): drop aiActionContext from execution dump

* fix(core): fix lint
* chore(site): add redirects for migrated docs

* chore(site): enable client redirects for migrated docs

* docs(core): update redirect rules

* docs(core): update redirect rules

* docs(core): fix dead link
* fix(core): normalize aiActContext usage in YAML scripts

* docs(agent): clarify legacy aiActContext alias usage

* fix(core): preference parser for yaml
…1588)

* fix(visualizer): improve dark mode styling for ShinyText component

- Replace solid purple text color with vibrant gradient (#a78bfa → #c084fc → #e879f9) in dark mode
- Add subtle purple shine effect (15% opacity) instead of harsh white shine
- Maintain animated gradient and shine animations in dark mode
- Add dark mode styling for PlaygroundResult component (loading text and pre blocks)
- Improve type safety in replay-scripts.ts with null checks and type normalization

* fix(web-integration): update aiActionContext priority and tests

Change the resolution priority to prefer agent-level preference over
target-level context:

Priority order (highest to lowest):
1. preference.aiActContext (recommended)
2. preference.aiActionContext (deprecated)
3. target.aiActionContext (target-level fallback)

Updated:
- Function comment to reflect correct priority order
- Test cases to validate new priority behavior
- Added comprehensive test coverage (5 test cases)

This allows agent-level preferences to override target-level context,
providing more flexibility in runtime configuration while maintaining
backward compatibility with the deprecated aiActionContext property.
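
The priority order above can be sketched with nullish coalescing. The interfaces and the name `resolveAiActContext` are hypothetical; only the three property names come from this commit message.

```typescript
// Hypothetical shapes carrying only the properties named above.
interface AgentPreference {
  aiActContext?: string;
  aiActionContext?: string; // deprecated alias
}

interface TargetConfig {
  aiActionContext?: string;
}

function resolveAiActContext(
  preference?: AgentPreference,
  target?: TargetConfig,
): string | undefined {
  // Highest to lowest priority, left to right.
  return (
    preference?.aiActContext ??
    preference?.aiActionContext ??
    target?.aiActionContext
  );
}
```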
…atch mode (#1593)

- Add rsbuild-plugin-workspace-dev plugin to all Rsbuild configurations
- Configure plugin to skip @midscene/report in workspace dev mode
- Add --no-clean flag to all build:watch scripts for better incremental builds
- Update dependencies in affected packages
Previously, when displaying array outputs (like Query task results),
the output was converted using String(data), which caused arrays of
objects to display as "[object Object],[object Object],..."

This fix replaces String(data) with JSON.stringify(data, undefined, 2)
to properly serialize and format the output data.

Fixes the display of Query/Insight task outputs that return arrays.
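
The difference between the two conversions can be illustrated like this. The function names are hypothetical wrappers around the two built-in calls named in the commit message.

```typescript
// Old behaviour: objects inside arrays collapse to "[object Object]".
function formatOutputLegacy(data: unknown): string {
  return String(data);
}

// New behaviour: pretty-printed JSON with two-space indentation.
function formatOutput(data: unknown): string {
  return JSON.stringify(data, undefined, 2);
}

const sampleRows = [{ name: "a" }, { name: "b" }];
console.log(formatOutputLegacy(sampleRows));
console.log(formatOutput(sampleRows));
```

The JSON form preserves field names and nesting, so the displayed result can be read or copied back as data.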
…APIs (#1596)

For data extraction APIs (aiQuery, aiBoolean, aiNumber, aiString, aiAsk,
aiAssert, aiWaitFor), the playground now displays both:
- Output result (at the top) - the extracted/queried data
- Report (at the bottom) - detailed execution report with screenshots

This allows users to see both the direct result and the full execution
context for better debugging and understanding.

Changes:
- Generate replayScriptsInfo for all APIs including noReplayAPIs
- Add actionType field to InfoListItem to track API type
- Adjust display logic to show both output and report when appropriate
- Pass actionType through the component chain
The replaceIllegalPathCharsAndSpace function now replaces the # character
with a dash to prevent issues with file paths, particularly in HTML report
filenames when tests retry (e.g., "retry #1" becomes "retry -1").

Changes:
- Updated regex in replaceIllegalPathCharsAndSpace to include #
- Added test case to verify # character replacement
@quanru quanru marked this pull request as draft December 17, 2025 02:48
@yuyutaotao yuyutaotao marked this pull request as ready for review December 17, 2025 02:48
@quanru quanru merged commit db0a780 into main Dec 17, 2025
7 of 12 checks passed