fix: detect and dismiss Windows lock screen before each task by abrichr · Pull Request #117 · OpenAdaptAI/openadapt-evals

abrichr · 2026-03-08T16:29:15Z

Summary

Adds _dismiss_lock_screen() to run_dc_eval.py, which detects the LogonUI.exe process and types the password to unlock the screen
Called from ensure_waa_ready() after each successful probe, so the check runs before every task
Prevents eval failures when the Windows VM has been idle and the lock screen has engaged
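
A minimal sketch of the detect-then-unlock flow described above, not the code from this PR. It assumes the caller supplies three hypothetical callables: run_command (runs a shell command on the VM and returns its output), press_key, and type_text. The tasklist filter is a standard Windows command; the Enter key presses around the password are an assumption about how the lock screen is woken and submitted.

```python
import logging
from typing import Callable

logger = logging.getLogger(__name__)

# Standard Windows command: list only LogonUI.exe, without the header row.
TASKLIST_CMD = 'tasklist /FI "IMAGENAME eq LogonUI.exe" /NH'


def is_lock_screen_active(run_command: Callable[[str], str]) -> bool:
    """Return True if the tasklist output mentions LogonUI.exe."""
    output = run_command(TASKLIST_CMD)
    return "logonui.exe" in output.lower()


def dismiss_lock_screen(
    run_command: Callable[[str], str],
    press_key: Callable[[str], None],
    type_text: Callable[[str], None],
    password: str,
) -> bool:
    """Unlock the screen if it is locked; return True if an unlock was attempted.

    Non-blocking: any error is logged and swallowed so the task still runs.
    """
    try:
        if not is_lock_screen_active(run_command):
            return False
        press_key("enter")    # assumption: wakes the lock screen to show the password field
        type_text(password)
        press_key("enter")    # assumption: submits the password
        return True
    except Exception as exc:  # deliberately broad: the check must never abort a task
        logger.warning("Lock screen check failed, continuing: %s", exc)
        return False
```

Passing the VM helpers in as callables keeps the check testable without a running VM, and swallowing exceptions matches the non-blocking behavior listed in the test plan.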

Context

Windows lock screen was likely a contributing factor to infrastructure failures in previous eval trials. When the VM sits idle between tasks or sessions, Windows can lock the screen. The agent then sees a login UI instead of the desktop, causing focus checks to fail and tasks to abort.

Test plan

Verified LogonUI.exe detection returns False on unlocked VM
/evaluate endpoint confirmed working after QEMU reset (reverse proxy to container evaluate_server on port 5050)
Non-blocking: errors raised by the lock screen check do not prevent task execution

🤖 Generated with Claude Code

Implements the correction flywheel MVP:

- correction_store.py: JSON-file-based correction library with save/find (fuzzy string matching via SequenceMatcher)/load_all
- correction_capture.py: Human correction capture using openadapt-capture Recorder (primary) with PIL screenshot fallback
- correction_parser.py: VLM call to parse before/after screenshots into PlanStep dict (think/action/expect)
- demo_controller.py: Added correction_store and enable_correction_capture params. On retry exhaustion: check correction store -> inject match, or capture human correction -> parse -> store -> advance
- cli.py: Added --correction-library and --enable-correction-capture flags

The loop: agent fails at step N -> correction store checked -> if match, inject corrected step -> if no match and capture enabled, human completes step -> Recorder captures -> VLM parses -> correction stored -> next run retrieves it.

17 tests added, all passing. 54 existing demo_controller tests unaffected.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
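
The store and loop described in this commit message can be sketched as follows. This is an illustration under stated assumptions, not the contents of correction_store.py or demo_controller.py: the class layout, file naming, JSON field names, the 0.8 similarity threshold, and the recover_step helper are all hypothetical. Only the use of difflib.SequenceMatcher for fuzzy matching and the save/find/load_all operations come from the description.

```python
import json
from difflib import SequenceMatcher
from pathlib import Path
from typing import Callable, Optional


class CorrectionStore:
    """JSON-file-based correction library (hypothetical layout: one file per correction)."""

    def __init__(self, directory: str, threshold: float = 0.8) -> None:
        self.directory = Path(directory)
        self.directory.mkdir(parents=True, exist_ok=True)
        self.threshold = threshold  # assumed cutoff for a fuzzy match

    def save(self, failed_step: str, corrected_step: dict) -> Path:
        index = len(list(self.directory.glob("*.json")))
        path = self.directory / f"correction_{index:04d}.json"
        entry = {"failed_step": failed_step, "corrected_step": corrected_step}
        path.write_text(json.dumps(entry, indent=2))
        return path

    def load_all(self) -> list:
        return [json.loads(p.read_text()) for p in sorted(self.directory.glob("*.json"))]

    def find(self, failed_step: str) -> Optional[dict]:
        """Return the most similar stored entry, or None if below the threshold."""
        best, best_score = None, 0.0
        for entry in self.load_all():
            score = SequenceMatcher(
                None, failed_step.lower(), entry["failed_step"].lower()
            ).ratio()
            if score > best_score:
                best, best_score = entry, score
        return best if best_score >= self.threshold else None


def recover_step(
    store: CorrectionStore,
    failed_step: str,
    capture_and_parse: Optional[Callable[[str], dict]] = None,
) -> Optional[dict]:
    """On retry exhaustion: reuse a stored correction, else capture and store a new one."""
    match = store.find(failed_step)
    if match is not None:
        return match["corrected_step"]         # inject the stored correction
    if capture_and_parse is None:
        return None                            # capture disabled, nothing to inject
    corrected = capture_and_parse(failed_step)  # stand-in for human capture plus VLM parse
    store.save(failed_step, corrected)          # the next run can retrieve it
    return corrected
```

In this sketch capture_and_parse stands in for the Recorder capture followed by the VLM parse; leaving it as None corresponds to running without --enable-correction-capture.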


Add _dismiss_lock_screen() to run_dc_eval.py that checks for LogonUI.exe process and types the password to unlock if the screen is locked. Called from ensure_waa_ready() after each successful probe. This prevents eval failures when the Windows VM has been idle and the lock screen has engaged between tasks or between sessions. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>


abrichr and others added 4 commits March 8, 2026 12:32

fix: mock _has_recorder in correction capture test

ead04ec

The test was calling the real Recorder which may not have wait_for_ready in the installed version. Mock it to use the simple fallback path since this is a unit test. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
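
As an illustration of the mocking approach this commit describes, the sketch below patches a _has_recorder check so a unit test exercises the fallback path regardless of what is installed. The CorrectionCapture class and its return values are hypothetical stand-ins, and whether _has_recorder is a method or a module-level function in the real code is an assumption; only unittest.mock.patch.object is a real API here.

```python
from unittest import mock


class CorrectionCapture:
    """Hypothetical stand-in for the capture component under test."""

    def _has_recorder(self) -> bool:
        # Stand-in: pretend the optional Recorder dependency is installed.
        return True

    def capture(self) -> str:
        if self._has_recorder():
            # Simulates an installed Recorder version that lacks wait_for_ready.
            raise AttributeError("Recorder has no attribute 'wait_for_ready'")
        return "screenshot_fallback"


def test_capture_uses_fallback_when_recorder_is_mocked_out() -> None:
    capture = CorrectionCapture()
    # Force the availability check to False so the simple fallback path runs.
    with mock.patch.object(CorrectionCapture, "_has_recorder", return_value=False):
        assert capture.capture() == "screenshot_fallback"
```

Patching the availability check rather than the Recorder itself keeps the test independent of the installed openadapt-capture version.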

chore: sync beads state

8d2d11d

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

abrichr force-pushed the fix/lock-screen-dismiss branch from 03e2fe7 to 8d2d11d on March 8, 2026 16:33

abrichr merged commit 4a28653 into main Mar 8, 2026
1 check passed

abrichr merged 4 commits into main from fix/lock-screen-dismiss
