Conversation
… in mobiagent - Introduced robust JSON parsing and validation functions to enhance error handling and response integrity. - Implemented retry logic with temperature adjustments for model API calls to improve response reliability. - Enhanced swipe functionality with dynamic coordinate calculations based on image dimensions. feat(annotate): Enhance annotation functionality and optimize task description processing - Add reasoning quantity verification to ensure the correct quantity of react is output for non-final batches. - Modify the system prompt template to the enhanced version of annotation_zh_general-gpt.md. - Add a skipping prompt when react.json exists. - Correct the processing logic when the task description is in the form of a list. fix(config): Adjust data augmentation configuration and action verification feat(manual): Expand the list of device support and applications docs(prompts): Update prompt templates and operation space definitions - Create enhanced version of the annotation_zh_general-gpt.md prompt template - Update the definition of the DONE action in annotation_zh_general.md - Add support for wait operations and adjust related descriptions - Add advertising processing rules in auto_decider.md - Update the application list and default application markers in the planner template
There was a problem hiding this comment.
Pull request overview
This pull request refactors the mobiagent codebase to standardize code style, enhance error handling, and eliminate redundant code. The changes introduce robust JSON parsing with retry logic, improve model interaction reliability, and optimize swipe functionality with dynamic coordinate calculations.
Key Changes
- Enhanced JSON parsing and validation: Introduced unified JSON parsing functions (
_load_json_from_text,robust_json_loads) with robust error handling that can extract JSON from various text formats including markdown code blocks - Retry logic with temperature adjustment: Implemented
call_model_with_validation_retryfunction that automatically retries API calls with incremental temperature adjustments on failure - Improved swipe functionality: Refactored swipe operations to use configurable constants (
SWIPE_V_START,SWIPE_V_END, etc.) and a centralizedcompute_swipe_positionsfunction for consistent coordinate calculations
Reviewed changes
Copilot reviewed 15 out of 16 changed files in this pull request and generated 38 comments.
Show a summary per file
| File | Description |
|---|---|
| runner/mobiagent/mobiagent.py | Main refactoring with new utility functions, constants, validation logic, and improved model API calls with retry mechanisms |
| runner/mobiagent/task.json | Updated task list with new Chinese test cases for various e-commerce and service apps |
| prompts/e2e_qwen3.md | Fixed quote formatting in done action status parameter |
| prompts/e2e_nohistory_qwen3.md | Fixed quote formatting in done action status parameter |
| prompts/planner_oneshot_harmony.md | Added new app entries with some formatting inconsistencies |
| prompts/planner_oneshot.md | Added new app entries and default app markers |
| prompts/auto_decider.md | Added advertising handling rules and renumbered operation guidelines |
| prompts/annotation_zh_general.md | Updated DONE action definition and added WAIT operation support |
| prompts/annotation_zh_general-gpt.md | New enhanced version of annotation prompt template |
| collect/manual/static/index.html | Added "华为商城" option to application list |
| collect/manual/server.py | Added package name mapping for "华为商城" app |
| collect/manual/device.py | Added package name mapping for "华为商城" app |
| collect/construct_sft.py | Enhanced dataset construction with e2e support, improved validation, and better error handling |
| collect/annotate.py | Added reasoning quantity verification and improved task description processing |
| collect/augment_config.json | Adjusted data augmentation multipliers for wait and black screen scenarios |
| .gitignore | Added log/ directory to ignored files |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| device.swipe_with_coords(press_position_x, press_position_y, release_position_x, release_position_y,) | ||
| actions.append({ | ||
| "type": "swipe", | ||
| "press_position_x": None, | ||
| "press_position_y": None, | ||
| "release_position_x": None, | ||
| "release_position_y": None, | ||
| "press_position_x": press_position_x, | ||
| "press_position_y": press_position_y, | ||
| "release_position_x": release_position_x, | ||
| "release_position_y": release_position_y, | ||
| "direction": direction.lower(), | ||
| "action_index": image_index | ||
| }) | ||
| history.append(decider_response_str) | ||
| history.append(json.dumps(decider_response, ensure_ascii=False)) | ||
| create_swipe_visualization(data_dir, image_index, direction.lower()) | ||
| else: | ||
| if direction in ["DOWN", "UP", "LEFT", "RIGHT"]: | ||
| device.swipe(direction.lower(), 0.4) | ||
| actions.append({ | ||
| "type": "swipe", | ||
| "press_position_x": None, | ||
| "press_position_y": None, | ||
| "release_position_x": None, | ||
| "release_position_y": None, | ||
| "direction": direction.lower(), | ||
| "action_index": image_index | ||
| }) | ||
|
|
||
| history.append(decider_response_str) | ||
| press_position_x, press_position_y, release_position_x, release_position_y = compute_swipe_positions(direction, img.width, img.height) | ||
| device.swipe_with_coords(press_position_x, press_position_y, release_position_x, release_position_y,) |
There was a problem hiding this comment.
Trailing comma on line 1062 and 1076. The swipe_with_coords function call has an unnecessary trailing comma after the last parameter release_position_y,). This is not standard Python style for function calls and should be removed.
| pos_num_repeat = position_num_repeat(i, len(react_data)) #根据步骤位置设置重复次数 | ||
| reason_aug_num_repeat = augment_num_repeat("decider", augment_rule, is_train) # 在训练时才进行数据扩充 |
There was a problem hiding this comment.
The comment on line 445 states "根据步骤位置设置重复次数" (set repetition count based on step position), but on line 446, there's a separate comment "在训练时才进行数据扩充" (only perform data augmentation during training). These two comments describe different aspects of the logic but are placed on consecutive lines without clear separation. Consider consolidating or better organizing these comments for clarity.
| pos_num_repeat = position_num_repeat(i, len(react_data)) #根据步骤位置设置重复次数 | |
| reason_aug_num_repeat = augment_num_repeat("decider", augment_rule, is_train) # 在训练时才进行数据扩充 | |
| # 根据步骤位置设置重复次数,并在训练阶段按规则决定推理文本的数据扩充次数 | |
| pos_num_repeat = position_num_repeat(i, len(react_data)) | |
| reason_aug_num_repeat = augment_num_repeat("decider", augment_rule, is_train) |
| # 预处理 response_str,增强健壮性 | ||
| def robust_json_loads(s): | ||
| """健壮解析 response_str,使用统一解析逻辑。""" | ||
| parsed = _load_json_from_text(s) | ||
| if parsed is None: | ||
| logging.error(f"解析 response_str 失败") | ||
| logging.error(f"原始内容: {str(s)[:300]}...") | ||
| raise ValueError(f"无法解析 JSON 响应: 响应格式不正确") |
There was a problem hiding this comment.
The comment "预处理 response_str,增强健壮性" on line 641 is inconsistent with the actual function name and purpose. The comment references "response_str" but the function is named robust_json_loads which takes parameter s. The comment should be updated to accurately reflect the function's purpose.
| # 预处理 response_str,增强健壮性 | |
| def robust_json_loads(s): | |
| """健壮解析 response_str,使用统一解析逻辑。""" | |
| parsed = _load_json_from_text(s) | |
| if parsed is None: | |
| logging.error(f"解析 response_str 失败") | |
| logging.error(f"原始内容: {str(s)[:300]}...") | |
| raise ValueError(f"无法解析 JSON 响应: 响应格式不正确") | |
| # 预处理 JSON 文本字符串,增强健壮性 | |
| def robust_json_loads(s): | |
| """健壮解析 JSON 字符串,使用统一解析逻辑。""" | |
| parsed = _load_json_from_text(s) | |
| if parsed is None: | |
| logging.error("解析 JSON 字符串失败") | |
| logging.error(f"原始内容: {str(s)[:300]}...") | |
| raise ValueError("无法解析 JSON 响应: 响应格式不正确") |
| logging.error(f"解析 response_str 失败") | ||
| logging.error(f"原始内容: {str(s)[:300]}...") | ||
| raise ValueError(f"无法解析 JSON 响应: 响应格式不正确") |
There was a problem hiding this comment.
Inconsistent Chinese terminology in error messages. Line 646 uses "解析 response_str 失败" while line 468 uses "无法在响应中找到有效的 JSON". Consider standardizing the error message format and terminology across similar error conditions for better maintainability.
| logging.error(f"解析 response_str 失败") | |
| logging.error(f"原始内容: {str(s)[:300]}...") | |
| raise ValueError(f"无法解析 JSON 响应: 响应格式不正确") | |
| logging.error("无法在响应中找到有效的 JSON") | |
| logging.error(f"原始内容: {str(s)[:300]}...") | |
| raise ValueError("无法在响应中找到有效的 JSON: 响应格式不正确") |
| direction = direction.upper() | ||
| if direction == "DOWN": | ||
| return ( | ||
| 0.5 * img_width, | ||
| SWIPE_V_START * img_height, | ||
| 0.5 * img_width, | ||
| SWIPE_V_END * img_height, | ||
| ) | ||
| if direction == "UP": | ||
| return ( | ||
| 0.5 * img_width, | ||
| SWIPE_V_END * img_height, | ||
| 0.5 * img_width, | ||
| SWIPE_V_START * img_height, | ||
| ) | ||
| if direction == "LEFT": | ||
| return ( | ||
| SWIPE_H_END * img_width, | ||
| 0.5 * img_height, | ||
| SWIPE_H_START * img_width, | ||
| 0.5 * img_height, | ||
| ) | ||
| if direction == "RIGHT": | ||
| return ( | ||
| SWIPE_H_START * img_width, | ||
| 0.5 * img_height, | ||
| SWIPE_H_END * img_width, | ||
| 0.5 * img_height, | ||
| ) | ||
| raise ValueError(f"Unknown swipe direction: {direction}") |
There was a problem hiding this comment.
The compute_swipe_positions function is missing proper indentation. The function body starts at line 722 without proper indentation relative to the function definition on line 721. This will cause a syntax error.
| direction = direction.upper() | |
| if direction == "DOWN": | |
| return ( | |
| 0.5 * img_width, | |
| SWIPE_V_START * img_height, | |
| 0.5 * img_width, | |
| SWIPE_V_END * img_height, | |
| ) | |
| if direction == "UP": | |
| return ( | |
| 0.5 * img_width, | |
| SWIPE_V_END * img_height, | |
| 0.5 * img_width, | |
| SWIPE_V_START * img_height, | |
| ) | |
| if direction == "LEFT": | |
| return ( | |
| SWIPE_H_END * img_width, | |
| 0.5 * img_height, | |
| SWIPE_H_START * img_width, | |
| 0.5 * img_height, | |
| ) | |
| if direction == "RIGHT": | |
| return ( | |
| SWIPE_H_START * img_width, | |
| 0.5 * img_height, | |
| SWIPE_H_END * img_width, | |
| 0.5 * img_height, | |
| ) | |
| raise ValueError(f"Unknown swipe direction: {direction}") | |
| direction = direction.upper() | |
| if direction == "DOWN": | |
| return ( | |
| 0.5 * img_width, | |
| SWIPE_V_START * img_height, | |
| 0.5 * img_width, | |
| SWIPE_V_END * img_height, | |
| ) | |
| if direction == "UP": | |
| return ( | |
| 0.5 * img_width, | |
| SWIPE_V_END * img_height, | |
| 0.5 * img_width, | |
| SWIPE_V_START * img_height, | |
| ) | |
| if direction == "LEFT": | |
| return ( | |
| SWIPE_H_END * img_width, | |
| 0.5 * img_height, | |
| SWIPE_H_START * img_width, | |
| 0.5 * img_height, | |
| ) | |
| if direction == "RIGHT": | |
| return ( | |
| SWIPE_H_START * img_width, | |
| 0.5 * img_height, | |
| SWIPE_H_END * img_width, | |
| 0.5 * img_height, | |
| ) | |
| raise ValueError(f"Unknown swipe direction: {direction}") |
| actions_path = os.path.join(root, "actions.json") | ||
| actions = [] | ||
| if os.path.exists(actions_path): | ||
| with open(actions_path, "r", encoding="UTF-8") as f: | ||
| try: | ||
| actions_data = json.load(f) | ||
| actions = actions_data.get("actions", []) | ||
| except: | ||
| pass | ||
|
|
There was a problem hiding this comment.
This assignment to 'actions' is unnecessary as it is redefined before this value is used.
| actions_path = os.path.join(root, "actions.json") | |
| actions = [] | |
| if os.path.exists(actions_path): | |
| with open(actions_path, "r", encoding="UTF-8") as f: | |
| try: | |
| actions_data = json.load(f) | |
| actions = actions_data.get("actions", []) | |
| except: | |
| pass |
| try: | ||
| actions_data = json.load(f) | ||
| actions = actions_data.get("actions", []) | ||
| except: |
There was a problem hiding this comment.
Except block directly handles BaseException.
| except: | |
| except Exception: |
| try: | ||
| actions_data = json.load(f) | ||
| actions = actions_data.get("actions", []) | ||
| except: |
There was a problem hiding this comment.
Except block directly handles BaseException.
| except: | |
| except Exception: |
| except: | ||
| pass |
There was a problem hiding this comment.
'except' clause does nothing but pass and there is no explanatory comment.
| except: | |
| pass | |
| except (json.JSONDecodeError, OSError) as e: | |
| print(f"警告:无法读取或解析 '{actions_path}':{e},将不使用 actions。") |
| except: | ||
| pass |
There was a problem hiding this comment.
'except' clause does nothing but pass and there is no explanatory comment.
| except: | |
| pass | |
| except Exception as e: | |
| # Ignore invalid or unreadable actions.json, defaulting to empty actions | |
| print(f"警告:无法读取或解析动作文件 '{actions_path}': {e},将使用空 actions 列表。") |
feat(mobiagent): Refactor task handling and enhance model interaction in mobiagent
feat(annotate): Enhance annotation functionality and optimize task description processing
fix(config): Adjust data augmentation configuration and action verification
feat(manual): Expand the list of device support and applications
docs(prompts): Update prompt templates and operation space definitions