Merged
138 changes: 134 additions & 4 deletions examples/AGENTS.md
@@ -2,6 +2,8 @@

This guide covers how to create custom WebMCP tools that run in the browser context and are available to the LLM for execution.

> **Spec Alignment**: This implementation aligns with the [WebMCP proposed spec](https://github.com/webmachinelearning/webmcp/blob/main/docs/proposal.md).

## Basic Structure

Every WebMCP tool is a JavaScript file with two required exports:
@@ -24,8 +26,10 @@ export const metadata = {
},
};

export async function execute(args = {}) {
// Per WebMCP spec: execute receives (args, agent) where agent provides requestUserInteraction()
export async function execute(args = {}, agent) {
// Tool implementation - has full DOM/window access
// Use agent.requestUserInteraction() for user confirmation flows
// Return result object
}
```
@@ -129,8 +133,13 @@ inputSchema: {

## The `execute` Function

Per the WebMCP spec, execute receives two arguments:

- `args` - The tool arguments from the LLM
- `agent` - An agent context object with `requestUserInteraction()` for user confirmation flows

```javascript
export async function execute(args = {}) {
export async function execute(args = {}, agent) {
// Destructure with defaults
const { limit = 100, format = 'full' } = args;

@@ -141,6 +150,36 @@ export async function execute(args = {}) {
}
```

### Requesting User Interaction

For tools that perform sensitive actions (purchases, deletions, etc.), use `agent.requestUserInteraction()`:

```javascript
export async function execute(args = {}, agent) {
const { productId } = args;

// Request user confirmation before sensitive action
const confirmed = await agent.requestUserInteraction(async () => {
return confirm(`Purchase product ${productId}?\nClick OK to confirm.`);
});

if (!confirmed) {
throw new Error('Purchase cancelled by user.');
}

// Proceed with action...
await executePurchase(productId);
return { success: true, productId };
}
```

The `requestUserInteraction` API:

- Takes an async function that performs the UI interaction
- Returns the result of that function
- Allows tools to prompt for confirmation, input, or any other user interaction
- The agent (browser) handles pausing execution while waiting for user input
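
When exercising a tool outside the browser (for example in a unit test), a stand-in for `agent` can be handy. The sketch below assumes only what the list above states: `requestUserInteraction` takes an async function and resolves to its result. `createStubAgent`, `purchaseTool`, and the injected `askUser` callback are hypothetical names for illustration, not part of the WebMCP proposal or this project.

```javascript
// Hypothetical test helper: a stand-in for the `agent` argument.
// Assumes only that requestUserInteraction(fn) awaits fn and returns its result.
function createStubAgent() {
  const calls = [];
  return {
    calls, // records every interaction callback, for assertions
    async requestUserInteraction(interaction) {
      calls.push(interaction);
      return await interaction();
    },
  };
}

// Hypothetical tool body. The confirmation step is injected as `askUser`
// so the flow can run without a DOM (no window.confirm needed).
async function purchaseTool(args = {}, agent, askUser) {
  const { productId } = args;
  const confirmed = await agent.requestUserInteraction(() => askUser(productId));
  if (!confirmed) {
    throw new Error('Purchase cancelled by user.');
  }
  return { success: true, productId };
}
```

In a real tool, `askUser` would wrap whatever UI the page uses for confirmation; in a test it can simply resolve to `true` or `false` to cover both branches.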

### Available in `execute`:

- Full DOM access: `document`, `window`
@@ -414,6 +453,50 @@ Single-page apps may not update `window.location` when navigating. Your tool may
- Detect context from DOM state, not just URL
- Handle cases where URL shows one view but DOM shows another
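
One way to follow the two points above is to check the DOM first and treat the URL as a fallback only. This is a minimal sketch: the `data-view` selectors and the `/edit` path pattern are placeholders, and `detectView` is a hypothetical helper, so substitute whatever markers the target app actually renders.

```javascript
// Hypothetical view detection: prefer DOM state, fall back to the URL.
// The selectors and the path pattern are placeholders for a real app's markup.
function detectView(doc, loc) {
  if (doc.querySelector('[data-view="editor"]')) return 'editor';
  if (doc.querySelector('[data-view="list"]')) return 'list';
  // Only trust the URL when the DOM gives no signal.
  if (/\/edit\b/.test(loc.pathname)) return 'editor';
  return 'unknown';
}
```

Inside a tool this would be called as `detectView(document, window.location)`. Passing both in as arguments keeps the function testable with plain stub objects.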

## Programmatic Tool Registration

For advanced use cases (like dynamically discovering tools), you can register tools programmatically:

```javascript
// Per WebMCP spec: navigator.modelContext is the primary API
// window.agent is also available as a backward-compatible alias

if ('modelContext' in navigator) {
// Register a single tool
navigator.modelContext.registerTool({
name: 'my_tool',
description: 'Does something useful',
inputSchema: { type: 'object', properties: {} },
execute: async (args, agent) => {

Nit: async is optional there.

return { result: 'success' };
}
});

// Unregister a tool
navigator.modelContext.unregisterTool('my_tool');

// Clear all tools
navigator.modelContext.clearContext();

// Replace all tools at once
navigator.modelContext.provideContext({
tools: [
{ name: 'tool1', description: '...', inputSchema: {...}, execute: async (args, agent) => {...} },
{ name: 'tool2', description: '...', inputSchema: {...}, execute: async (args, agent) => {...} }
]
});
}
```

**API methods:**

- `provideContext({ tools })` - Replace entire tool set (clears existing)
- `registerTool(tool)` - Add/replace a single tool
- `unregisterTool(name)` - Remove a tool by name
- `clearContext()` - Remove all tools
- `listTools()` - Get current tool definitions (without execute functions)
- `callTool(name, args)` - Invoke a tool (used by the agent, not typically by tools)

Should it be renamed to executeTool?
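
To make the semantics in the API methods list concrete, here is a small in-memory model of a registry that behaves as described there. It is a sketch under assumptions, not the polyfill's or Chrome's implementation: `createToolRegistry` is a hypothetical name, and `callTool` omits the `agent` argument for brevity.

```javascript
// Hypothetical in-memory model of the registry semantics listed above.
// Not the real navigator.modelContext implementation.
function createToolRegistry() {
  const tools = new Map();
  return {
    registerTool(tool) {
      tools.set(tool.name, tool); // add, or replace a tool with the same name
    },
    unregisterTool(name) {
      tools.delete(name);
    },
    clearContext() {
      tools.clear();
    },
    provideContext({ tools: next = [] } = {}) {
      tools.clear(); // replace the entire tool set
      for (const tool of next) tools.set(tool.name, tool);
    },
    listTools() {
      // Return definitions only; execute functions are stripped.
      return [...tools.values()].map(({ execute, ...definition }) => definition);
    },
    async callTool(name, args = {}) {
      const tool = tools.get(name);
      if (!tool) throw new Error(`Unknown tool: ${name}`);
      return tool.execute(args); // agent argument omitted in this sketch
    },
  };
}
```

Keying the map by `name` is what gives `registerTool` its add-or-replace behavior, and it makes `provideContext` a clear followed by a series of registrations.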


## Testing Your Tool

1. Open browser DevTools console on the target site
@@ -447,7 +530,8 @@ export const metadata = {
},
};

export async function execute() {
// agent param is optional if not using requestUserInteraction
export async function execute(args = {}, agent) {
const docId = getDocId();
if (!docId) {
throw new Error('Could not extract document ID from URL.');
@@ -473,6 +557,52 @@ function getDocId() {
}
```

## Example: Tool with User Interaction

```javascript
'use webmcp-tool v1';

export const metadata = {
name: 'delete_item',
namespace: 'myapp',
version: '1.0.0',
description: 'Delete an item after user confirmation.',
match: 'https://app.example.com/*',
inputSchema: {
type: 'object',
properties: {
itemId: { type: 'string', description: 'ID of item to delete' },
},
required: ['itemId'],
additionalProperties: false,
},
};

export async function execute(args = {}, agent) {
const { itemId } = args;

// Request user confirmation before destructive action
const confirmed = await agent.requestUserInteraction(async () => {
return confirm(`Are you sure you want to delete item ${itemId}?`);
});

if (!confirmed) {
return { cancelled: true, message: 'Deletion cancelled by user.' };
}

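// Encode the ID so characters such as "/" or "?" cannot alter the request path
const response = await fetch(`https://app.example.com/api/items/${encodeURIComponent(itemId)}`, {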
method: 'DELETE',
credentials: 'include',
});

if (!response.ok) {
throw new Error(`Delete failed: ${response.status}`);
}

return { success: true, itemId };
}
```

## Example: Tool with API Calls and Entity Resolution

```javascript
@@ -493,7 +623,7 @@ export const metadata = {
},
};

export async function execute(args = {}) {
export async function execute(args = {}, agent) {
const { limit = 50 } = args;

const threadId = getThreadId();
12 changes: 8 additions & 4 deletions examples/shopify_storefront.js
@@ -146,15 +146,19 @@ function registerShopifyTool(toolSchema, mcpEndpoint) {
inputSchema: toolSchema.inputSchema || { type: 'object', properties: {} },

// Create an execute function that proxies to MCP
// Per WebMCP spec: execute receives (args, agent)
execute: createExecutor(toolSchema.name, mcpEndpoint)
};

// Register with the agent
if (window.agent && typeof window.agent.registerTool === 'function') {
window.agent.registerTool(tool);
// Register with modelContext (per WebMCP spec)
// Falls back to window.agent for backward compat
const modelContext = ('modelContext' in navigator) ? navigator.modelContext : window.agent;

if (modelContext && typeof modelContext.registerTool === 'function') {
modelContext.registerTool(tool);
console.log(`[Shopify Bootstrap] ✅ Successfully registered: ${toolName}`);
} else {
throw new Error('window.agent.registerTool not available');
throw new Error('navigator.modelContext.registerTool not available');
}
}
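
The fallback in this hunk can be pulled into a small helper so it is testable without a browser. The sketch below is one possible shape, not code from this PR: `resolveRegistrationApi` is a hypothetical name, and the `navigator` and `window` objects are passed in as parameters. It mirrors the hunk's behavior, including the detail that `window.agent` is only consulted when `modelContext` is absent from `navigator`.

```javascript
// Hypothetical helper: pick the registration API from injected globals.
// Mirrors the hunk above: modelContext wins when present, window.agent is the fallback.
function resolveRegistrationApi(nav, win) {
  const candidate = nav && 'modelContext' in nav ? nav.modelContext : win && win.agent;
  if (candidate && typeof candidate.registerTool === 'function') {
    return candidate;
  }
  return null; // caller decides whether to throw or retry later
}
```

In the page this would be called as `resolveRegistrationApi(navigator, window)`, with the existing `throw` kept for the `null` case.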

4 changes: 2 additions & 2 deletions package-lock.json

Some generated files are not rendered by default.

2 changes: 1 addition & 1 deletion package.json
@@ -1,6 +1,6 @@
{
"name": "ai-sidebar-extension",
"version": "0.5.0",
"version": "0.6.0",
"description": "Chrome extension AI sidebar with LLM providers and MCP support",
"private": true,
"type": "module",
68 changes: 53 additions & 15 deletions src/content-scripts/page-bridge.js
@@ -1,7 +1,10 @@
/**
* WebMCP Page Bridge (MAIN World)
* Bridges window.agent events to the extension via postMessage
* Bridges navigator.modelContext events to the extension via postMessage
* Executes in MAIN world with full access to page runtime
*
* Aligns with WebMCP proposed spec:
* https://github.com/webmachinelearning/webmcp/blob/main/docs/proposal.md
*/
(function () {
'use strict';
@@ -24,18 +27,52 @@
}

/**
* Initialize bridge if window.agent exists
* Get the agent-side API for discovering and executing tools
*
* Chrome's WebMCP implementation:
* - navigator.modelContextTesting = agent-side (listTools, executeTool, registerToolsChangedCallback)
* - navigator.modelContext = page-side (registerTool, unregisterTool, provideContext)
*
* Our polyfill provides the same separation.
*/
function getAgentAPI() {
// Use modelContextTesting (agent-side API) - either native Chrome or our polyfill
if ('modelContextTesting' in navigator) {
return {
native: !Object.prototype.hasOwnProperty.call(navigator.modelContextTesting, 'errors'), // Native won't have our errors property
listTools: () => navigator.modelContextTesting.listTools(),
// executeTool expects args as JSON string
executeTool: (name, args) => navigator.modelContextTesting.executeTool(name, JSON.stringify(args)),
registerToolsChangedCallback: (callback) => navigator.modelContextTesting.registerToolsChangedCallback(callback)
};
}
// Legacy fallback: window.agent (backward compat API)
if ('agent' in window) {
return {
native: false,
listTools: () => window.agent.listTools(),
executeTool: (name, args) => window.agent.callTool(name, args),
registerToolsChangedCallback: (callback) => window.agent.addEventListener('tools/listChanged', callback)
};
}
return null;
}
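
The two branches of `getAgentAPI()` differ in how tool arguments are passed: the `modelContextTesting` path serializes them to a JSON string, while the legacy `window.agent` path passes a plain object. A quick way to check that difference is a variant with the globals injected, as sketched below. `buildAgentAPI` is a hypothetical name, and the sketch leaves out the `native` flag and the change callback to stay short.

```javascript
// Sketch: same adapter shape as getAgentAPI(), with the globals injected for testing.
function buildAgentAPI(nav, win) {
  if (nav && 'modelContextTesting' in nav) {
    const api = nav.modelContextTesting;
    return {
      listTools: () => api.listTools(),
      // The agent-side API in this diff takes arguments as a JSON string.
      executeTool: (name, args) => api.executeTool(name, JSON.stringify(args)),
    };
  }
  if (win && 'agent' in win) {
    return {
      listTools: () => win.agent.listTools(),
      // The legacy API takes a plain object.
      executeTool: (name, args) => win.agent.callTool(name, args),
    };
  }
  return null;
}
```

Callers such as the `tools/call` handler can then pass a plain object in both cases and let the adapter handle serialization.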

/**
* Initialize bridge using the agent-side API
*/
function initBridge() {
if (!window.agent || typeof window.agent.addEventListener !== 'function') {
console.warn('[WebMCP Bridge] window.agent not found or invalid');
const agentAPI = getAgentAPI();

if (!agentAPI) {
console.warn('[WebMCP Bridge] No WebMCP API found (modelContextTesting or window.agent)');
return;
}

console.log('[WebMCP Bridge] Initializing in MAIN world');
console.log('[WebMCP Bridge] Initializing in MAIN world', agentAPI.native ? '(native Chrome API)' : '(polyfill)');

// Get current tools immediately - in case any were registered before we got here
const currentTools = window.agent.listTools();
const currentTools = agentAPI.listTools();
console.log('[WebMCP Bridge] Current tools on init:', currentTools);

// Send initial snapshot if there are already tools
@@ -54,9 +91,9 @@
}

// Subscribe to tool registry changes
window.agent.addEventListener('tools/listChanged', () => {
agentAPI.registerToolsChangedCallback(() => {
try {
const tools = window.agent.listTools();
const tools = agentAPI.listTools();
postToExtension({
jsonrpc: JSONRPC,
method: 'tools/listChanged',
@@ -91,7 +128,7 @@
console.log('[WebMCP Bridge] Received tools/list request');

try {
const tools = window.agent.listTools();
const tools = agentAPI.listTools();

// Send tools via notification (not a response to preserve protocol semantics)
postToExtension({
@@ -128,8 +165,8 @@
const { name, arguments: args } = msg.params || {};

try {
// Delegate to window.agent
const result = await window.agent.callTool(name, args || {});
// Delegate to agent API
const result = await agentAPI.executeTool(name, args || {});

// Send success response
postToExtension({
@@ -165,12 +202,13 @@
console.log('[WebMCP Bridge] Ready and listening');
}

// Initialize immediately if window.agent exists
if (window.agent) {
// Initialize immediately if any WebMCP API exists
const agentAPI = getAgentAPI();
if (agentAPI) {
initBridge();
} else {
// If window.agent doesn't exist yet, it might be loaded later
// If no API exists yet, it might be loaded later
// This shouldn't happen if polyfill is injected at document_start
console.warn('[WebMCP Bridge] window.agent not found at injection time');
console.warn('[WebMCP Bridge] No WebMCP API found at injection time');
}
})();