
web-hacker

REVERSE ENGINEER ANY WEB APP! ⚡️

You are in the right place if you ...

  • want your AI agent to take real actions on the web
  • never want to pay for an API (except for OpenAI... shouldn't piss them off...)
  • are tired of complicated, endless API integrations
  • are dealing with closed APIs

Welcome to Vectorly's Web Hacker... No API? No Problem!

Our Process ᯓ ✈︎

  1. Launch Chrome in debug mode (enable DevTools protocol on 127.0.0.1:9222).
  2. Run the browser monitor and manually perform the target actions to capture browser state.
  3. Specify your task and run the routine discovery script; the agent reverse‑engineers the API flow.
  4. Review and run/test the generated routine JSON (locally).
  5. Go to console.vectorly.app and productionize your routines!

What is a Routine?

A Routine is a portable automation recipe that captures how to perform a specific task in any web app.

Define once. Reuse everywhere. Automate anything you can do in a browser.

Each Routine includes:

  • name — a human-readable identifier
  • description — what the Routine does
  • parameters — input values the Routine needs to run (e.g. URLs, credentials, text)
  • operations — the ordered browser actions that perform the automation

Example:

Navigate to a dashboard, search based on keywords, and return results — all as a reusable Routine.
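Concretely, a routine with the four fields listed above might look like the following skeleton, expressed as a Python dict (illustrative only; the field names mirror this README, and the exact schema is defined by the Routine model):

```python
import json

# Illustrative routine skeleton combining the four fields described above.
# Field names follow this README; the exact schema is defined by the Routine model.
routine = {
    "name": "dashboard_keyword_search",
    "description": "Navigate to a dashboard, search by keyword, return results",
    "parameters": [
        {"name": "keyword", "description": "Search term", "type": "string"}
    ],
    "operations": [
        {"type": "navigate", "url": "https://example.com/dashboard"},
        {
            "type": "fetch",
            "endpoint": {
                "method": "GET",
                "url": "https://example.com/api/search?q=\"{{keyword}}\"",
                "headers": {},
                "body": {}
            },
            "session_storage_key": "results"
        },
        {"type": "return", "session_storage_key": "results"}
    ]
}

print(json.dumps(routine, indent=2))
```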

Quickstart

2025.11_webhacker_quickstart.mp4

Parameters

  • Defined as typed inputs (see Parameter class).
  • Each parameter has required name and description fields. Optional fields include type (defaults to string), required (defaults to true), default, and examples.
  • Parameters are referenced inside operations using placeholder tokens like "{{paramName}}" or \"{{paramName}}\" (see Placeholder Interpolation below).
  • Parameter Types: Supported types include string, integer, number, boolean, date, datetime, email, url, and enum.
  • Parameter Validation: Parameters support validation constraints such as min_length, max_length, min_value, max_value, pattern (regex), enum_values, and format.
  • Reserved Prefixes: Parameter names cannot start with reserved prefixes: sessionStorage, localStorage, cookie, meta, uuid, epoch_milliseconds.
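As a rough illustration of how the constraints listed above might be checked (a sketch only, not the library's actual Parameter class):

```python
import re

def validate_param(value, *, type="string", min_length=None, max_length=None,
                   pattern=None, enum_values=None):
    """Illustrative check of a parameter value against a few of the
    constraints listed above (not the library's real implementation)."""
    if type == "string" and not isinstance(value, str):
        return False
    if min_length is not None and len(value) < min_length:
        return False
    if max_length is not None and len(value) > max_length:
        return False
    if pattern is not None and not re.fullmatch(pattern, value):
        return False
    if enum_values is not None and value not in enum_values:
        return False
    return True

validate_param("BOS", min_length=3, max_length=3, pattern=r"[A-Z]{3}")  # True
```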

Operations

Operations define the executable steps of a Routine. They are represented as a typed list (see RoutineOperationUnion) and are executed sequentially by a browser.

Each operation specifies a type and its parameters:

Navigation

  • navigate — open a URL in the browser.
    { "type": "navigate", "url": "https://example.com", "sleep_after_navigation_seconds": 3.0 }
  • sleep — pause execution for a given duration (in seconds).
    { "type": "sleep", "timeout_seconds": 1.5 }
  • wait_for_url — wait for the current URL to match a regex pattern.
    { "type": "wait_for_url", "url_regex": ".*dashboard.*", "timeout_ms": 20000 }

Network

  • fetch — perform an HTTP request defined by an endpoint object (method, URL, headers, body, credentials). Optionally, store the response under a session_storage_key.
    {
      "type": "fetch",
      "endpoint": {
        "method": "GET",
        "url": "https://api.example.com",
        "headers": {},
        "body": {},
        "credentials": "same-origin"
      },
      "session_storage_key": "userData"
    }
  • download — download a file and return it as base64-encoded content.
    {
      "type": "download",
      "endpoint": {
        "method": "GET",
        "url": "https://example.com/report.pdf",
        "headers": {},
        "body": {}
      },
      "filename": "report.pdf"
    }
  • get_cookies — retrieve all cookies (including HttpOnly) via CDP and store them in session storage.
    { "type": "get_cookies", "session_storage_key": "allCookies", "domain_filter": "*" }

Interaction

  • click — click on an element by CSS selector. Automatically validates visibility to avoid honeypot traps.
    { "type": "click", "selector": "#submit-button", "button": "left", "ensure_visible": true }
  • input_text — type text into an input element. Validates visibility before typing.
    { "type": "input_text", "selector": "#username", "text": "\"{{username}}\"", "clear": false }
  • press — press a keyboard key (enter, tab, escape, etc.).
    { "type": "press", "key": "enter" }
  • scroll — scroll the page or a specific element.
    { "type": "scroll", "selector": "#content", "delta_y": 500, "behavior": "auto" }

Code Execution

  • js_evaluate — evaluate custom JavaScript code in the browser context. The code must be wrapped in an IIFE (immediately invoked function expression).
    {
      "type": "js_evaluate",
      "js": "(function() { return document.title; })()",
      "timeout_seconds": 5.0,
      "session_storage_key": "pageTitle"
    }

Data

  • return — return the value previously stored under a session_storage_key.
    { "type": "return", "session_storage_key": "userData" }
  • return_html — return HTML content from the page or a specific element.
    { "type": "return_html", "scope": "page" }

Example sequence:

[
  { "type": "navigate", "url": "https://example.com/login" },
  { "type": "sleep", "timeout_seconds": 1 },
  { 
    "type": "fetch", 
    "endpoint": { 
      "method": "POST", 
      "url": "/auth", 
      "body": { "username": "\"{{user}}\"", "password": "\"{{pass}}\"" } 
    }, 
    "session_storage_key": "token" 
  },
  { "type": "return", "session_storage_key": "token" }
]

This defines a deterministic flow: open → wait → authenticate → return a session token.
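Conceptually, running such a sequence is just a dispatch loop over operation types. A minimal sketch with stubbed handlers (the real executor drives Chrome over CDP; the handlers here are hypothetical):

```python
import time

# Hypothetical handlers; a real executor would drive Chrome over CDP.
session_storage = {}

def run_operation(op):
    if op["type"] == "navigate":
        print(f"open {op['url']}")                      # stub: would load the URL
    elif op["type"] == "sleep":
        time.sleep(op["timeout_seconds"])
    elif op["type"] == "fetch":
        response = {"token": "abc123"}                  # stub: would issue the request
        session_storage[op["session_storage_key"]] = response
    elif op["type"] == "return":
        return session_storage[op["session_storage_key"]]
    return None

def run_routine(operations):
    """Execute operations in order; the last `return` op yields the result."""
    result = None
    for op in operations:
        out = run_operation(op)
        if out is not None:
            result = out
    return result
```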

Placeholder Interpolation {{...}}

Placeholders inside operation fields are resolved at runtime:

  • Parameter placeholders: "{{paramName}}" or \"{{paramName}}\" → substituted from routine parameters
  • Storage placeholders (read values from the current session):
    • {{sessionStorage:myKey.path.to.value}} — access nested values in sessionStorage
    • {{localStorage:myKey}} — access localStorage values
    • {{cookie:CookieName}} — read cookie values
    • {{meta:name}} — read meta tag content (e.g., <meta name="csrf-token">)

Important: Currently, sessionStorage, localStorage, cookie, and meta placeholder resolution is supported only inside fetch headers and body. Future versions will support interpolation anywhere in operations.

Interpolation occurs before an operation executes. For example, a fetch endpoint might be:

{
  "type": "fetch",
  "endpoint": {
    "method": "GET",
    "url": "https://api.example.com/search?paramName1=\"{{paramName1}}\"&paramName2=\"{{paramName2}}\"",
    "headers": {
      "Authorization": "Bearer {{cookie:auth_token}}"
    },
    "body": {}
  },
  "session_storage_key": "result_key"
}

This substitutes parameter values and injects auth_token from cookies. The JSON response is stored under sessionStorage['result_key'] and can be returned by a final return operation using the matching session_storage_key.
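The substitution itself can be pictured as a single token-replacement pass (a sketch; the real resolver also reads live browser state for the storage-prefixed tokens):

```python
import re

def interpolate(text, params, cookies=None):
    """Resolve {{param}} and {{cookie:Name}} tokens in a string (illustrative)."""
    cookies = cookies or {}

    def resolve(match):
        token = match.group(1)
        if token.startswith("cookie:"):
            return cookies.get(token.split(":", 1)[1], "")
        return str(params.get(token, ""))

    return re.sub(r"\{\{([^{}]+)\}\}", resolve, text)

interpolate("Bearer {{cookie:auth_token}}", {}, cookies={"auth_token": "xyz"})
```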

Prerequisites

  • Python 3.12+
  • Google Chrome (stable)
  • uv (Python package manager) (optional, for development)
    • macOS/Linux: curl -LsSf https://astral.sh/uv/install.sh | sh
    • Windows (PowerShell): iwr https://astral.sh/uv/install.ps1 -UseBasicParsing | iex
  • OpenAI API key

Installation

From PyPI (Recommended)

Note: We recommend using a virtual environment to avoid dependency conflicts.

# Create and activate a virtual environment
# Option 1: Using uv (recommended - handles Python version automatically)
uv venv web-hacker-env
source web-hacker-env/bin/activate  # On Windows: web-hacker-env\Scripts\activate
uv pip install web-hacker

# Option 2: Using python3 (if Python 3.12+ is your default)
python3 -m venv web-hacker-env
source web-hacker-env/bin/activate  # On Windows: web-hacker-env\Scripts\activate
pip install web-hacker

# Option 3: Using pyenv (if you need a specific Python version)
pyenv install 3.12.3  # if not already installed
pyenv local 3.12.3
python -m venv web-hacker-env
source web-hacker-env/bin/activate  # On Windows: web-hacker-env\Scripts\activate
pip install web-hacker

# Troubleshooting: If pip is not found, recreate the venv or use:
python -m ensurepip --upgrade  # Install pip in the venv
pip install web-hacker

From Source (Development)

For development or if you want the latest code:

# Clone the repository
git clone https://github.com/VectorlyApp/web-hacker.git
cd web-hacker

# Create and activate virtual environment
python3 -m venv web-hacker-env
source web-hacker-env/bin/activate  # On Windows: web-hacker-env\Scripts\activate

# Install in editable mode
pip install -e .

# Or using uv (faster)
uv venv web-hacker-env
source web-hacker-env/bin/activate
uv pip install -e .

Quickstart (Easiest Way) 🚀

The fastest way to get started is using the quickstart script, which automates the entire workflow:

# Make sure web-hacker is installed
pip install web-hacker

# Set your OpenAI API key
export OPENAI_API_KEY="sk-..."

# Run the quickstart script
python quickstart.py

The quickstart script will:

  1. ✅ Automatically launch Chrome in debug mode
  2. 📊 Start browser monitoring (you perform actions)
  3. 🤖 Discover routines from captured data
  4. 📝 Show you how to execute the discovered routine

Note: The quickstart script is included in the repository. If you installed from PyPI, you can download it from the GitHub repository.

Launch Chrome in Debug Mode 🐞

💡 Tip: The quickstart script automatically launches Chrome for you. You only need these manual instructions if you're not using the quickstart script.

macOS

# Create temporary Chrome user directory
mkdir -p $HOME/tmp/chrome

# Launch Chrome in debug mode
"/Applications/Google Chrome.app/Contents/MacOS/Google Chrome" \
  --remote-debugging-address=127.0.0.1 \
  --remote-debugging-port=9222 \
  --user-data-dir="$HOME/tmp/chrome" \
  --remote-allow-origins='*' \
  --no-first-run \
  --no-default-browser-check

# Verify Chrome is running
curl http://127.0.0.1:9222/json/version

Windows

# Create temporary Chrome user directory
New-Item -ItemType Directory -Force -Path "$env:USERPROFILE\tmp\chrome" | Out-Null

# Locate Chrome
$chrome = "C:\Program Files\Google\Chrome\Application\chrome.exe"
if (!(Test-Path $chrome)) {
  $chrome = "C:\Program Files (x86)\Google\Chrome\Application\chrome.exe"
}

# Launch Chrome in debug mode
& $chrome `
  --remote-debugging-address=127.0.0.1 `
  --remote-debugging-port=9222 `
  --user-data-dir="$env:USERPROFILE\tmp\chrome" `
  --remote-allow-origins=* `
  --no-first-run `
  --no-default-browser-check

# Verify Chrome is running
(Invoke-WebRequest http://127.0.0.1:9222/json/version).Content

Linux

# Create temporary Chrome user directory
mkdir -p $HOME/tmp/chrome

# Launch Chrome in debug mode (adjust path if needed)
google-chrome \
  --remote-debugging-address=127.0.0.1 \
  --remote-debugging-port=9222 \
  --user-data-dir="$HOME/tmp/chrome" \
  --remote-allow-origins='*' \
  --no-first-run \
  --no-default-browser-check

# Verify Chrome is running
curl http://127.0.0.1:9222/json/version
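If you prefer checking from Python rather than curl, something like this works against any of the launches above (assumes the default 127.0.0.1:9222):

```python
import json
import urllib.request

def devtools_version_url(host="127.0.0.1", port=9222):
    """Build the DevTools /json/version URL for a debug-mode Chrome."""
    return f"http://{host}:{port}/json/version"

def chrome_is_up(host="127.0.0.1", port=9222, timeout=2.0):
    """Return the DevTools version info dict, or None if Chrome isn't reachable."""
    try:
        with urllib.request.urlopen(devtools_version_url(host, port), timeout=timeout) as resp:
            return json.load(resp)
    except OSError:
        return None

# Usage (with Chrome already launched as shown above):
# info = chrome_is_up()
# print(info["Browser"] if info else "Chrome debug port not reachable")
```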

HACK (reverse engineer) WEB APPS 👨🏻‍💻

The reverse engineering process follows a simple three-step workflow:

  1. Monitor — Capture network traffic, storage events, and interactions while you manually perform the target task in Chrome
  2. Discover — Let the AI agent analyze the captured data and generate a reusable Routine
  3. Execute — Run the discovered Routine with different parameters to automate the task

Quick Start (Recommended)

Easiest way: Use the quickstart script which automates the entire workflow.

Manual Workflow (Step-by-Step)

Each step is detailed below. Start by ensuring Chrome is running in debug mode (see Launch Chrome in Debug Mode above).

0. Legal & Privacy Notice ⚠️

Reverse-engineering and automating a website can violate terms of service. Store captures securely and scrub any sensitive fields before sharing.

1. Monitor Browser While Performing Some Task

Use the CDP browser monitor to block trackers and capture network, storage, and interaction data while you manually perform the task in Chrome.

Run this command to start monitoring:

web-hacker-monitor --host 127.0.0.1 --port 9222 --output-dir ./cdp_captures --url about:blank --incognito

The script will open a new tab (starting at about:blank). Navigate to your target website, then manually perform the actions you want to automate (e.g., search, login, export report). Keep Chrome focused during this process. Press Ctrl+C and the script will consolidate transactions and produce a HAR automatically.

Output structure (under --output-dir, default ./cdp_captures):

cdp_captures/
├── session_summary.json
├── network/
│   ├── consolidated_transactions.json
│   ├── network.har
│   └── transactions/
│       └── <timestamp_url_id>/
│           ├── request.json
│           ├── response.json
│           └── response_body.[ext]
└── storage/
    └── events.jsonl
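For a quick look at what was captured, consolidated_transactions.json can be filtered by URL substring. A sketch (the exact transaction schema may differ, so treat the field names as assumptions):

```python
import json

def matching_urls(transactions, needle):
    """Return request URLs containing `needle` (assumes each transaction
    carries a request dict with a 'url' field; adjust to the actual schema)."""
    urls = []
    for tx in transactions:
        url = tx.get("request", {}).get("url", "")
        if needle in url:
            urls.append(url)
    return urls

# Usage:
# transactions = json.load(open("cdp_captures/network/consolidated_transactions.json"))
# print(matching_urls(transactions, "/api/"))
```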


2. Run Routine-Discovery Agent (Our Very Smart AI with Very Good Prompts🔮)🤖

Use the routine-discovery pipeline to analyze captured data and synthesize a reusable Routine (navigate → fetch → return).

Prerequisites: You’ve already captured a session with the browser monitor (./cdp_captures exists).

Run the discovery agent:

⚠️ Important: You must specify your own --task parameter. The example below is just for demonstration—replace it with a description of what you want to automate.

Linux/macOS (bash):

web-hacker-discover \
  --task "Recover API endpoints for searching for trains and their prices" \
  --cdp-captures-dir ./cdp_captures \
  --output-dir ./routine_discovery_output \
  --llm-model gpt-5.1

Windows (PowerShell):

# Simple task (no quotes inside):
web-hacker-discover --task "Recover the API endpoints for searching for trains and their prices" --cdp-captures-dir ./cdp_captures --output-dir ./routine_discovery_output --llm-model gpt-5.1

Example tasks:

  • "recover the api endpoints for searching for trains and their prices" (shown above)
  • "discover how to search for flights and get pricing"
  • "find the API endpoint for user authentication"
  • "extract the endpoint for submitting a job application"

Arguments:

  • --task: A clear description of what you want to automate. This guides the AI agent to identify which network requests to extract and convert into a Routine. Examples: searching for products, booking appointments, submitting forms, etc.
  • --cdp-captures-dir: Root of prior CDP capture output (default: ./cdp_captures)
  • --output-dir: Directory to write results (default: ./routine_discovery_output)
  • --llm-model: LLM to use for reasoning/parsing (default: gpt-5.1)

Outputs (under --output-dir):

routine_discovery_output/
├── identified_transactions.json    # Chosen transaction id/url
├── routine_transactions.json       # Slimmed request/response samples given to LLM
├── resolved_variables.json         # Resolution hints for cookies/tokens (if any)
└── routine.json                    # Final Routine model (name, parameters, operations)
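Before executing, it can help to inspect the generated routine.json. A sketch using plain json (the SDK's Routine model adds real validation on top; the summary format here is made up for illustration):

```python
import json

def summarize_routine(routine):
    """One-line summary of a routine dict (assumes the name/parameters/operations
    fields described earlier in this README)."""
    params = ", ".join(p["name"] for p in routine.get("parameters", []))
    return f"{routine['name']}({params}): {len(routine.get('operations', []))} operations"

# Usage:
# routine = json.load(open("routine_discovery_output/routine.json"))
# print(summarize_routine(routine))
```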

3. Execute the Discovered Routines 🏃

⚠️ Prerequisite: Make sure Chrome is still running in debug mode (see Launch Chrome in Debug Mode above). The routine execution script connects to the same Chrome debug session on 127.0.0.1:9222.

⚠️ Important: If you have a string-typed parameter used in a JSON body field, it may need to be escaped. When the agent generates routines, string parameters are sometimes placed as "{{PARAM}}" when they should be "\"{{PARAM}}\"" to ensure proper JSON string escaping.

Example: If you see:

"field": "{{paramName}}"

And paramName is a string parameter, manually change it to:

"field": "\"{{paramName}}\""

This ensures the parameter value is properly quoted as a JSON string when substituted.
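The underlying issue is ordinary JSON string escaping: a raw parameter value dropped into a body template must be quoted and escaped to remain valid JSON, which is what the extra \"...\" wrapping signals. A small generic illustration (not the executor's actual substitution code):

```python
import json

value = 'say "hi"'

# Dropping the raw value into a body template produces invalid JSON:
broken = '{"field": ' + value + '}'

# Encoding it first (quoting + escaping) keeps the body valid:
ok = '{"field": ' + json.dumps(value) + '}'

json.loads(ok)["field"]  # round-trips back to the original string
```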

Run the example routine:

# Using a parameters file:

web-hacker-execute \
  --routine-path example_routines/amtrak_one_way_train_search_routine.json \
  --parameters-path example_routines/amtrak_one_way_train_search_input.json

# Or pass parameters inline (JSON string):

web-hacker-execute \
  --routine-path example_routines/amtrak_one_way_train_search_routine.json \
  --parameters-dict '{"origin": "BOS", "destination": "NYP", "departureDate": "2026-03-22"}'

Run a discovered routine:

web-hacker-execute \
  --routine-path routine_discovery_output/routine.json \
  --parameters-path routine_discovery_output/test_parameters.json

Note: Routines execute in a new incognito tab by default (controlled by the routine's incognito field). This ensures clean sessions for each execution.

Alternative: Deploy your routine to console.vectorly.app to expose it as an API endpoint or MCP tool for use in production environments.

Common Issues ⚠️

  • Chrome not detected / cannot connect to DevTools

    • Ensure Chrome is launched in debug mode and http://127.0.0.1:9222/json/version returns JSON.
    • Check --host/--port flags match your Chrome launch args.
  • OPENAI_API_KEY not set

    • Export the key in your shell or create a .env file and run via uv run (dotenv is loaded).
  • No such file or directory: './cdp_captures/network/transactions/N/A' or similar transaction path errors

    • The agent cannot find any network transactions relevant to your task. This usually means:

      • The --task description doesn't match what you actually performed during monitoring
      • The relevant network requests weren't captured (they may have been blocked or filtered)
      • The task description is too vague or too specific
    • Fix: Reword your --task parameter to more accurately describe what you did during the monitoring step, or re-run the browser monitor and ensure you perform the exact actions you want to automate.

Python SDK 🐍

For programmatic control, use the Python SDK instead of CLI commands:

Basic Usage

from web_hacker.sdk import WebHacker
from web_hacker.data_models.routine.routine import Routine

# Initialize (uses OPENAI_API_KEY from environment)
hacker = WebHacker()

# Load and execute an existing routine
routine = Routine.model_validate_json(open("routine.json").read())
result = hacker.execute_routine(
    routine=routine,
    parameters={"origin": "NYC", "destination": "LAX", "date": "2026-03-15"}
)

if result.ok:
    print(result.data)  # API response data

Full Workflow

import json
from web_hacker.sdk import WebHacker, BrowserMonitor

hacker = WebHacker()

# Step 1: Monitor browser activity
monitor = BrowserMonitor(output_dir="./captures")
monitor.start()
# ... user performs actions in browser ...
input("Press Enter when done")
monitor.stop()

# Step 2: Discover routine from captures
routine = hacker.discover_routine(
    task="Search for flights and get prices",
    cdp_captures_dir="./captures",
    output_dir="./output"
)

# Step 3: Test with generated test parameters
test_params = json.load(open("./output/test_parameters.json"))
result = hacker.execute_routine(routine=routine, parameters=test_params)

# Step 4: Execute with new parameters
result = hacker.execute_routine(
    routine=routine,
    parameters={"origin": "SFO", "destination": "JFK", "date": "2026-04-01"}
)

SDK Classes

Class             Description
WebHacker         Main client for the full workflow
BrowserMonitor    Capture browser network/storage activity
RoutineDiscovery  Discover routines from captured data
RoutineExecutor   Execute routines programmatically

See quickstart.py for a complete interactive example.

Coming Soon 🔮

Pipeline Improvements

  • Integration of routine testing into the agentic pipeline
    • The agent will execute discovered routines, detect failures, and automatically suggest/fix issues to make routines more robust and efficient.
  • Checkpointing progress and resumability
    • Avoid re-running the entire discovery pipeline after exceptions; the agent will checkpoint progress and resume from the last successful stage.
  • Parameter resolution visibility
    • During execution, show which placeholders (e.g., {{sessionStorage:...}}, {{cookie:...}}, {{localStorage:...}}) resolved successfully and which failed.

Additional Operations (Not Yet Implemented)

Navigation

  • wait_for_title — wait for the page title to match a regex pattern

Network

  • network_sniffing (background operation) — intercept and capture network requests matching a URL pattern in the background while other operations execute. Useful for capturing API calls triggered by UI interactions.
    • Supports different capture modes: list (all matching requests), first (only first match), last (only last match)
    • Can capture request, response, or body data

Interaction

  • hover — move mouse over an element to trigger hover states
  • wait_for_selector — wait for an element to reach a specific state (visible, hidden, attached, detached)
  • set_files — set file paths for file input elements (for file uploads)

Data

  • return_screenshot — capture and return a screenshot of the page as base64

Contributing 🤝

We welcome contributions! Here's how to get started:

  1. Report bugs or request features — Open an issue
  2. Submit code — Fork the repo and open a pull request
  3. Test your code — Add unit tests and make sure all tests pass:
pytest tests/ -v

Please follow existing code style and include tests for new features.