feat(agent): add DSL execution engine for LLM-scripted code analysis #412
- Fix XML tag stripping: stripCodeWrapping() removes the <execute_plan><code> tags LLMs wrap code in, preventing validation errors and wasted self-healing retries
- Add regex literal validation: Validator rejects /pattern/ with a clear error message; LLMs are steered to String methods (indexOf, includes, startsWith)
- Fix async error crashes: Delay unhandledRejection handler removal by 500ms to catch late SandboxJS errors that escape the promise chain
- Fix ambiguous test prompts: Test 2 now explicitly requests DSL code output instead of plain text answers
- Update llm-script.md docs: Add output() function, parseJSON() utility, regex limitation, and Pattern 6 (Direct Output for Large Data)
- All 6 agent tests now pass (up from 4/6)
- 110 DSL unit tests pass (+2 regex validation tests)
- 1974 total npm tests pass with zero regressions

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
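The tag-stripping fix in this commit can be pictured with a minimal sketch. This is not the actual stripCodeWrapping() implementation, which is not shown on this page; the function name `stripCodeWrappingSketch` and the peel-from-the-outside approach are assumptions made for illustration. Only the `<execute_plan>` and `<code>` tag names come from the commit message.

```javascript
// Hypothetical sketch: the real stripCodeWrapping() may work differently.
// Peels matching <execute_plan>...</execute_plan> and <code>...</code>
// wrappers off the outside of an LLM response until none remain.
function stripCodeWrappingSketch(text) {
  const tags = ['execute_plan', 'code']; // tag names taken from the commit message
  let code = text.trim();
  let changed = true;
  while (changed) {
    changed = false;
    for (const tag of tags) {
      const open = `<${tag}>`;
      const close = `</${tag}>`;
      // Only strip when the wrapper encloses the whole remaining text.
      if (code.startsWith(open) && code.endsWith(close)) {
        code = code.slice(open.length, code.length - close.length).trim();
        changed = true;
      }
    }
  }
  return code;
}
```

Stripping only wrappers that enclose the entire text leaves any tag-like strings inside the script body untouched, so the validator sees plain DSL code.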
PR Overview: DSL Execution Engine for LLM-Scripted Code Analysis
Summary
This PR introduces a DSL execution engine (
Files Changed (28 files, +5,894/-7)
Architecture
Key Technical Changes
Affected Components
New Dependencies
Breaking Changes
None. The
Test Coverage
Usage
const agent = new ProbeAgent({
  path: '/path/to/codebase',
  provider: 'google',
  enableExecutePlan: true // Enable DSL orchestration
});

probe agent "Find all API endpoints" --enable-execute-plan

Scope Discovery & Related Files
The DSL modules are self-contained in
Metadata
Powered by Visor from Probelabs. Last updated: 2026-02-15T19:25:31.533Z | Triggered by: pr_updated | Commit: c80a0a7
Security Issues (1)
Architecture Issues (1)
Performance Issues (1)
Quality Issues (1)
SandboxJS doesn't bind catch clause parameters. The workaround injects `var e = __getLastError()` in catch bodies, but `var e` inside `catch (e)` conflicts on Node 20. Fix by renaming the catch parameter to `__catchParam` so the var declaration doesn't shadow it. Also set Visor code review max-parallelism to 1. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Tool functions now catch errors internally and return "ERROR: ..." strings instead of throwing. This eliminates the fragile SandboxJS catch parameter workaround (__getLastError/__setLastError/__catchParam), which broke on Node 20.
- traceToolCall: catch + return error string instead of rethrowing
- parseJSON: returns null on failure instead of throwing
- Removed transformer passes: catch param rename, throw rewrite
- Removed errorHolder, __getLastError, __setLastError globals
- Updated tool definition and docs to document error-return pattern
- Net -147 lines of complexity

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
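The error-return pattern described in this commit might look roughly like the sketch below. The names `wrapToolSketch`, `parseJSONSketch`, and `isToolError` are hypothetical and do not come from the PR; only the "ERROR: ..." string convention and the null-on-failure behavior of parseJSON are taken from the commit message.

```javascript
// Hypothetical sketch of the error-return pattern; not the PR's actual code.

// Wrap an async tool so that failures become "ERROR: ..." strings
// instead of exceptions that the sandboxed script would have to catch.
function wrapToolSketch(toolFn) {
  return async function wrapped(...args) {
    try {
      return await toolFn(...args);
    } catch (err) {
      const message = err instanceof Error ? err.message : String(err);
      return 'ERROR: ' + message;
    }
  };
}

// Return null on malformed JSON rather than throwing.
function parseJSONSketch(text) {
  try {
    return JSON.parse(text);
  } catch {
    return null;
  }
}

// Scripts can then branch on the returned value without try/catch.
function isToolError(value) {
  return typeof value === 'string' && value.startsWith('ERROR: ');
}
```

With this shape, a script checks `isToolError(result)` or `parsed === null` after each call, which sidesteps the SandboxJS catch-parameter binding problem entirely.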
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Summary
Adds a new DSL execution engine (`execute_plan` tool) that lets the AI agent write and run JavaScript-like scripts in a sandboxed environment. This replaces verbose multi-turn tool-calling loops with compact, deterministic scripts, enabling complex multi-file analysis, data pipelines, and structured output generation in a single tool call.
Why?
The current agent loop calls tools one at a time: search → extract → LLM → search → extract → ... Each round-trip costs latency and context window. With `execute_plan`, the agent writes a script that orchestrates all tool calls in one shot, with loops, variables, parallel batching, and direct output delivery.
Architecture
4 core modules:
- Validator (`dsl/validator.js`): Whitelist-based AST validation using acorn. Blocks dangerous constructs (eval, Function, import) while allowing standard JS control flow.
- Transformer (`dsl/transformer.js`): Auto-injects `await` on tool calls, wraps code in an async IIFE, and injects loop guards to prevent infinite loops.
- Environment (`dsl/environment.js`): Generates sandbox globals: tool wrappers, session store, output buffer, and utility functions (LLM, parseJSON, batch, map, etc.).
- Runtime (`dsl/runtime.js`): Executes transformed code in SandboxJS with a configurable timeout (default 120s) and loop iteration limits.
Key Features
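The loop guards that the transformer injects and the runtime enforces, as described in the module list, could work roughly as follows. This is a sketch under assumptions: the names `createLoopGuard` and `__checkLoop`, and the choice to throw once the limit is exceeded, are illustrative and not taken from the PR.

```javascript
// Hypothetical loop guard; names and behavior are assumptions.
// Returns a function that counts calls and throws past the limit.
function createLoopGuard(maxIterations) {
  let count = 0;
  return function checkLoop() {
    count += 1;
    if (count > maxIterations) {
      throw new Error('Loop exceeded ' + maxIterations + ' iterations');
    }
  };
}

// What a script's loop might look like after transformation:
// the guard call is inserted as the first statement of the loop body.
function runGuardedLoop(limit, requested) {
  const __checkLoop = createLoopGuard(limit);
  let total = 0;
  for (let i = 0; i < requested; i++) {
    __checkLoop(); // injected guard
    total += i;
  }
  return total;
}
```

A single shared counter bounds the total work of a script regardless of how its loops are nested, which is one plausible reason to inject a call rather than rewrite loop conditions.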
- Tool functions return `"ERROR: message"` strings instead of throwing. `parseJSON()` returns `null` on failure. This avoids SandboxJS try/catch parameter binding bugs entirely.
- Session store shared across `execute_plan` calls within one agent session (`storeSet`, `storeGet`, `storeAppend`, `storeKeys`, `storeGetAll`).
- `output()` function writes large data (tables, CSV, JSON) directly to the user response, bypassing the LLM context window and preventing lossy summarization.
- `batch(items, size)` + `map(items, fn)` for concurrent file processing.
- `enableExecutePlan` flag: Gates the tool (like `enableBash`/`enableDelegate`). When enabled, replaces `analyze_all`.
Files Changed (29 files, ~5900 lines added)
- dsl/validator.js, transformer.js, environment.js, runtime.js
- tools/executePlan.js, tools/common.js, tools/index.js
- ProbeAgent.js, probeTool.js, tools.js, index.js
- tests/unit/dsl-*.test.js (3 files, ~1260 lines)
- dsl/*-test.mjs (7 files)
- docs/llm-script.md
- .github/workflows/visor.yml
Test plan
npm test --prefix npm -- --testPathPattern="dsl-")
🤖 Generated with Claude Code
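To make the `batch(items, size)`, `map(items, fn)`, and `output()` helpers listed under Key Features more concrete, here is a minimal sketch of how they might behave. The semantics are assumptions: the description does not say that `batch` chunks an array or that `map` uses `Promise.all`, and the names `batchSketch`, `mapSketch`, `createOutputBuffer`, and `summarizeFiles` are hypothetical.

```javascript
// Hypothetical sketches; the real helpers may differ in signature and behavior.

// Assumed meaning of batch(items, size): split into chunks of at most `size`.
function batchSketch(items, size) {
  if (!Number.isInteger(size) || size < 1) {
    throw new Error('size must be a positive integer');
  }
  const chunks = [];
  for (let i = 0; i < items.length; i += size) {
    chunks.push(items.slice(i, i + size));
  }
  return chunks;
}

// Assumed meaning of map(items, fn): run fn on every item concurrently
// and resolve with the results in input order.
function mapSketch(items, fn) {
  return Promise.all(items.map((item) => fn(item)));
}

// Assumed shape of the output buffer: output() appends text that is later
// delivered to the user without passing through the LLM.
function createOutputBuffer() {
  const parts = [];
  return {
    output: (text) => { parts.push(String(text)); },
    flush: () => parts.join('\n'),
  };
}

// Usage: process files two at a time, emitting one CSV row per file.
async function summarizeFiles(files, analyze) {
  const { output, flush } = createOutputBuffer();
  output('file,length');
  for (const chunk of batchSketch(files, 2)) {
    const rows = await mapSketch(chunk, analyze);
    rows.forEach((row) => output(row));
  }
  return flush();
}
```

Chunking before mapping caps how many tool calls are in flight at once, while the buffer keeps large tabular results out of the model's context.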