Skip to content

feat(mobiagent): Standardize code style and eliminate redundant code#66

Open
Fly0307 wants to merge 1 commit intomainfrom
zx-test
Open

feat(mobiagent): Standardize code style and eliminate redundant code#66
Fly0307 wants to merge 1 commit intomainfrom
zx-test

Conversation

@Fly0307
Copy link
Collaborator

@Fly0307 Fly0307 commented Jan 5, 2026

feat(mobiagent): Refactor task handling and enhance model interaction in mobiagent

  • Introduced robust JSON parsing and validation functions to enhance error handling and response integrity.
  • Implemented retry logic with temperature adjustments for model API calls to improve response reliability.
  • Enhanced swipe functionality with dynamic coordinate calculations based on image dimensions.

feat(annotate): Enhance annotation functionality and optimize task description processing

  • Add reasoning quantity verification to ensure the correct quantity of react is output for non-final batches.
  • Modify the system prompt template to the enhanced version of annotation_zh_general-gpt.md.
  • Add a skipping prompt when react.json exists.
  • Correct the processing logic when the task description is in the form of a list.

fix(config): Adjust data augmentation configuration and action verification

feat(manual): Expand the list of device support and applications

docs(prompts): Update prompt templates and operation space definitions

  • Create enhanced version of the annotation_zh_general-gpt.md prompt template
  • Update the definition of the DONE action in annotation_zh_general.md
  • Add support for wait operations and adjust related descriptions
  • Add advertising processing rules in auto_decider.md
  • Update the application list and default application markers in the planner template

… in mobiagent

- Introduced robust JSON parsing and validation functions to enhance error handling and response integrity.
- Implemented retry logic with temperature adjustments for model API calls to improve response reliability.
- Enhanced swipe functionality with dynamic coordinate calculations based on image dimensions.

feat(annotate): Enhance annotation functionality and optimize task description processing
- Add reasoning quantity verification to ensure the correct quantity of react is output for non-final batches.
- Modify the system prompt template to the enhanced version of annotation_zh_general-gpt.md.
- Add a skipping prompt when react.json exists.
- Correct the processing logic when the task description is in the form of a list.

fix(config): Adjust data augmentation configuration and action verification

feat(manual): Expand the list of device support and applications

docs(prompts): Update prompt templates and operation space definitions
- Create enhanced version of the annotation_zh_general-gpt.md prompt template
- Update the definition of the DONE action in annotation_zh_general.md
- Add support for wait operations and adjust related descriptions
- Add advertising processing rules in auto_decider.md
- Update the application list and default application markers in the planner template
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This pull request refactors the mobiagent codebase to standardize code style, enhance error handling, and eliminate redundant code. The changes introduce robust JSON parsing with retry logic, improve model interaction reliability, and optimize swipe functionality with dynamic coordinate calculations.

Key Changes

  • Enhanced JSON parsing and validation: Introduced unified JSON parsing functions (_load_json_from_text, robust_json_loads) with robust error handling that can extract JSON from various text formats including markdown code blocks
  • Retry logic with temperature adjustment: Implemented call_model_with_validation_retry function that automatically retries API calls with incremental temperature adjustments on failure
  • Improved swipe functionality: Refactored swipe operations to use configurable constants (SWIPE_V_START, SWIPE_V_END, etc.) and a centralized compute_swipe_positions function for consistent coordinate calculations

Reviewed changes

Copilot reviewed 15 out of 16 changed files in this pull request and generated 38 comments.

Show a summary per file
File Description
runner/mobiagent/mobiagent.py Main refactoring with new utility functions, constants, validation logic, and improved model API calls with retry mechanisms
runner/mobiagent/task.json Updated task list with new Chinese test cases for various e-commerce and service apps
prompts/e2e_qwen3.md Fixed quote formatting in done action status parameter
prompts/e2e_nohistory_qwen3.md Fixed quote formatting in done action status parameter
prompts/planner_oneshot_harmony.md Added new app entries with some formatting inconsistencies
prompts/planner_oneshot.md Added new app entries and default app markers
prompts/auto_decider.md Added advertising handling rules and renumbered operation guidelines
prompts/annotation_zh_general.md Updated DONE action definition and added WAIT operation support
prompts/annotation_zh_general-gpt.md New enhanced version of annotation prompt template
collect/manual/static/index.html Added "华为商城" option to application list
collect/manual/server.py Added package name mapping for "华为商城" app
collect/manual/device.py Added package name mapping for "华为商城" app
collect/construct_sft.py Enhanced dataset construction with e2e support, improved validation, and better error handling
collect/annotate.py Added reasoning quantity verification and improved task description processing
collect/augment_config.json Adjusted data augmentation multipliers for wait and black screen scenarios
.gitignore Added log/ directory to ignored files

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +1062 to +1076
device.swipe_with_coords(press_position_x, press_position_y, release_position_x, release_position_y,)
actions.append({
"type": "swipe",
"press_position_x": None,
"press_position_y": None,
"release_position_x": None,
"release_position_y": None,
"press_position_x": press_position_x,
"press_position_y": press_position_y,
"release_position_x": release_position_x,
"release_position_y": release_position_y,
"direction": direction.lower(),
"action_index": image_index
})
history.append(decider_response_str)
history.append(json.dumps(decider_response, ensure_ascii=False))
create_swipe_visualization(data_dir, image_index, direction.lower())
else:
if direction in ["DOWN", "UP", "LEFT", "RIGHT"]:
device.swipe(direction.lower(), 0.4)
actions.append({
"type": "swipe",
"press_position_x": None,
"press_position_y": None,
"release_position_x": None,
"release_position_y": None,
"direction": direction.lower(),
"action_index": image_index
})

history.append(decider_response_str)
press_position_x, press_position_y, release_position_x, release_position_y = compute_swipe_positions(direction, img.width, img.height)
device.swipe_with_coords(press_position_x, press_position_y, release_position_x, release_position_y,)
Copy link

Copilot AI Jan 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Trailing comma on line 1062 and 1076. The swipe_with_coords function call has an unnecessary trailing comma after the last parameter release_position_y,). This is not standard Python style for function calls and should be removed.

Copilot uses AI. Check for mistakes.
Comment on lines +445 to +446
pos_num_repeat = position_num_repeat(i, len(react_data)) #根据步骤位置设置重复次数
reason_aug_num_repeat = augment_num_repeat("decider", augment_rule, is_train) # 在训练时才进行数据扩充
Copy link

Copilot AI Jan 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment on line 445 states "根据步骤位置设置重复次数" (set repetition count based on step position), but on line 446, there's a separate comment "在训练时才进行数据扩充" (only perform data augmentation during training). These two comments describe different aspects of the logic but are placed on consecutive lines without clear separation. Consider consolidating or better organizing these comments for clarity.

Suggested change
pos_num_repeat = position_num_repeat(i, len(react_data)) #根据步骤位置设置重复次数
reason_aug_num_repeat = augment_num_repeat("decider", augment_rule, is_train) # 在训练时才进行数据扩充
# 根据步骤位置设置重复次数,并在训练阶段按规则决定推理文本的数据扩充次数
pos_num_repeat = position_num_repeat(i, len(react_data))
reason_aug_num_repeat = augment_num_repeat("decider", augment_rule, is_train)

Copilot uses AI. Check for mistakes.
Comment on lines +641 to +648
# 预处理 response_str,增强健壮性
def robust_json_loads(s):
"""健壮解析 response_str,使用统一解析逻辑。"""
parsed = _load_json_from_text(s)
if parsed is None:
logging.error(f"解析 response_str 失败")
logging.error(f"原始内容: {str(s)[:300]}...")
raise ValueError(f"无法解析 JSON 响应: 响应格式不正确")
Copy link

Copilot AI Jan 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment "预处理 response_str,增强健壮性" on line 641 is inconsistent with the actual function name and purpose. The comment references "response_str" but the function is named robust_json_loads which takes parameter s. The comment should be updated to accurately reflect the function's purpose.

Suggested change
# 预处理 response_str,增强健壮性
def robust_json_loads(s):
"""健壮解析 response_str,使用统一解析逻辑。"""
parsed = _load_json_from_text(s)
if parsed is None:
logging.error(f"解析 response_str 失败")
logging.error(f"原始内容: {str(s)[:300]}...")
raise ValueError(f"无法解析 JSON 响应: 响应格式不正确")
# 预处理 JSON 文本字符串,增强健壮性
def robust_json_loads(s):
"""健壮解析 JSON 字符串,使用统一解析逻辑。"""
parsed = _load_json_from_text(s)
if parsed is None:
logging.error("解析 JSON 字符串失败")
logging.error(f"原始内容: {str(s)[:300]}...")
raise ValueError("无法解析 JSON 响应: 响应格式不正确")

Copilot uses AI. Check for mistakes.
Comment on lines +646 to +648
logging.error(f"解析 response_str 失败")
logging.error(f"原始内容: {str(s)[:300]}...")
raise ValueError(f"无法解析 JSON 响应: 响应格式不正确")
Copy link

Copilot AI Jan 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Inconsistent Chinese terminology in error messages. Line 646 uses "解析 response_str 失败" while line 468 uses "无法在响应中找到有效的 JSON". Consider standardizing the error message format and terminology across similar error conditions for better maintainability.

Suggested change
logging.error(f"解析 response_str 失败")
logging.error(f"原始内容: {str(s)[:300]}...")
raise ValueError(f"无法解析 JSON 响应: 响应格式不正确")
logging.error("无法在响应中找到有效的 JSON")
logging.error(f"原始内容: {str(s)[:300]}...")
raise ValueError("无法在响应中找到有效的 JSON: 响应格式不正确")

Copilot uses AI. Check for mistakes.
Comment on lines +722 to +751
direction = direction.upper()
if direction == "DOWN":
return (
0.5 * img_width,
SWIPE_V_START * img_height,
0.5 * img_width,
SWIPE_V_END * img_height,
)
if direction == "UP":
return (
0.5 * img_width,
SWIPE_V_END * img_height,
0.5 * img_width,
SWIPE_V_START * img_height,
)
if direction == "LEFT":
return (
SWIPE_H_END * img_width,
0.5 * img_height,
SWIPE_H_START * img_width,
0.5 * img_height,
)
if direction == "RIGHT":
return (
SWIPE_H_START * img_width,
0.5 * img_height,
SWIPE_H_END * img_width,
0.5 * img_height,
)
raise ValueError(f"Unknown swipe direction: {direction}")
Copy link

Copilot AI Jan 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The compute_swipe_positions function is missing proper indentation. The function body starts at line 722 without proper indentation relative to the function definition on line 721. This will cause a syntax error.

Suggested change
direction = direction.upper()
if direction == "DOWN":
return (
0.5 * img_width,
SWIPE_V_START * img_height,
0.5 * img_width,
SWIPE_V_END * img_height,
)
if direction == "UP":
return (
0.5 * img_width,
SWIPE_V_END * img_height,
0.5 * img_width,
SWIPE_V_START * img_height,
)
if direction == "LEFT":
return (
SWIPE_H_END * img_width,
0.5 * img_height,
SWIPE_H_START * img_width,
0.5 * img_height,
)
if direction == "RIGHT":
return (
SWIPE_H_START * img_width,
0.5 * img_height,
SWIPE_H_END * img_width,
0.5 * img_height,
)
raise ValueError(f"Unknown swipe direction: {direction}")
direction = direction.upper()
if direction == "DOWN":
return (
0.5 * img_width,
SWIPE_V_START * img_height,
0.5 * img_width,
SWIPE_V_END * img_height,
)
if direction == "UP":
return (
0.5 * img_width,
SWIPE_V_END * img_height,
0.5 * img_width,
SWIPE_V_START * img_height,
)
if direction == "LEFT":
return (
SWIPE_H_END * img_width,
0.5 * img_height,
SWIPE_H_START * img_width,
0.5 * img_height,
)
if direction == "RIGHT":
return (
SWIPE_H_START * img_width,
0.5 * img_height,
SWIPE_H_END * img_width,
0.5 * img_height,
)
raise ValueError(f"Unknown swipe direction: {direction}")

Copilot uses AI. Check for mistakes.
Comment on lines +310 to +319
actions_path = os.path.join(root, "actions.json")
actions = []
if os.path.exists(actions_path):
with open(actions_path, "r", encoding="UTF-8") as f:
try:
actions_data = json.load(f)
actions = actions_data.get("actions", [])
except:
pass

Copy link

Copilot AI Jan 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This assignment to 'actions' is unnecessary as it is redefined before this value is used.

Suggested change
actions_path = os.path.join(root, "actions.json")
actions = []
if os.path.exists(actions_path):
with open(actions_path, "r", encoding="UTF-8") as f:
try:
actions_data = json.load(f)
actions = actions_data.get("actions", [])
except:
pass

Copilot uses AI. Check for mistakes.
try:
actions_data = json.load(f)
actions = actions_data.get("actions", [])
except:
Copy link

Copilot AI Jan 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Except block directly handles BaseException.

Suggested change
except:
except Exception:

Copilot uses AI. Check for mistakes.
try:
actions_data = json.load(f)
actions = actions_data.get("actions", [])
except:
Copy link

Copilot AI Jan 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Except block directly handles BaseException.

Suggested change
except:
except Exception:

Copilot uses AI. Check for mistakes.
Comment on lines +216 to +217
except:
pass
Copy link

Copilot AI Jan 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

'except' clause does nothing but pass and there is no explanatory comment.

Suggested change
except:
pass
except (json.JSONDecodeError, OSError) as e:
print(f"警告:无法读取或解析 '{actions_path}':{e},将不使用 actions。")

Copilot uses AI. Check for mistakes.
Comment on lines +317 to +318
except:
pass
Copy link

Copilot AI Jan 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

'except' clause does nothing but pass and there is no explanatory comment.

Suggested change
except:
pass
except Exception as e:
# Ignore invalid or unreadable actions.json, defaulting to empty actions
print(f"警告:无法读取或解析动作文件 '{actions_path}': {e},将使用空 actions 列表。")

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants