Open
39 commits
f75b46b
mit-salvage: reintroduce _graph edges accelerator + admin lifecycle p…
voarsh Feb 9, 2026
d4be80a
Admin: fix option to clear indexing caches
voarsh Feb 9, 2026
2e59455
fix(upload-client): stop dev-workspace recursion in dev-remote watch …
voarsh Feb 9, 2026
881f1f4
fix(indexer): harden graph edge backfill + align upload client ignore…
voarsh Feb 9, 2026
7961a72
chore: harden graph-edge ops, cache uploader excludes, and sync helpers
voarsh Feb 9, 2026
2476d6c
collection_admin: Fix missing logging
voarsh Feb 13, 2026
b43054b
Adds debug mode to repo search
voarsh Feb 13, 2026
843d79f
bridge: make MCP list timeouts configurable and gate OAuth metadata f…
voarsh Feb 13, 2026
2f14eff
fix(mcp): correct template dedupe uri source and clean debug field ha…
voarsh Feb 13, 2026
a8656de
vscode-ext: Adds bundled MCP bridge mode
voarsh Feb 13, 2026
6cecc26
Add back Claude Code workflow for GH
voarsh Feb 14, 2026
ec69b2b
Improves upload client and code search handling
voarsh Feb 14, 2026
ba3d336
Prompts for venv creation when auto-detection fails
voarsh Feb 14, 2026
a620125
Updates session defaults on ID change
voarsh Mar 2, 2026
7ed96d9
refactor(bridge): consolidate session defaults sync
voarsh Mar 6, 2026
036c677
fix(search): change `under` filter to recursive subtree scope
voarsh Mar 7, 2026
ba1e9c2
refactor(ingest): add async git history processing and structured log…
voarsh Mar 7, 2026
d30e1c4
fix(vscode-uploader): restore watch startup after successful auto for…
voarsh Mar 7, 2026
24b7c3f
refactor(ingest): improve logging practices and thread safety
voarsh Mar 7, 2026
fb560c1
fix(uploader): restore incremental sync cache and reduce Windows Pyth…
voarsh Mar 7, 2026
8c05f45
feat(upload): add hash-based deduplication and processing status trac…
voarsh Mar 7, 2026
366b6f4
feat(upload): cleanup ignored cached paths and prune empty directorie…
voarsh Mar 7, 2026
0a380b9
feat(upload): add interval-based empty dir sweep and fix force sync i…
voarsh Mar 7, 2026
168f22f
feat(upload): add plan/apply workflow for delta uploads
voarsh Mar 7, 2026
ca32c5a
fix(upload,watch): align cache state with confirmed uploads and trim …
voarsh Mar 8, 2026
673ad7e
fix(ingest,watch): tolerate line shifts and reduce redundant reproces…
voarsh Mar 8, 2026
6f243ec
feat(vscode): extend MCP bridge auto-start to support sse-remote mode
voarsh Mar 8, 2026
ca0c8b3
feat(watch,upload): add index journal for durable change tracking and…
voarsh Mar 9, 2026
c6fcf50
fix(core): improve pagination, upload reliability, and watch consistency
voarsh Mar 9, 2026
ecaf1c1
fix(code review): address critical and major issues from CodeRabbit
github-actions[bot] Mar 9, 2026
86f2212
fix(watch,consistency): improve error handling and retry logic
github-actions[bot] Mar 9, 2026
6bd58ea
fix(ingest,watch,upload): address CodeRabbit critical and major issues
github-actions[bot] Mar 9, 2026
99e9433
refactor(upload_service): extract duplicated collection resolution logic
github-actions[bot] Mar 9, 2026
37890c2
fix(upload,consistency): address CodeRabbit critical and major issues
github-actions[bot] Mar 9, 2026
37349d4
fix(watch,upload): restore internal path checks and async queued uplo…
voarsh Mar 9, 2026
984838d
fix(ingest,watch,upload): improve error handling and smart reindex fa…
voarsh Mar 9, 2026
b349a74
fix(ingest,upload,search): improve error handling and fix edge cases
voarsh Mar 9, 2026
ed627c7
fix(bridge): add retry logic for transient errors in MCP list operations
voarsh Mar 9, 2026
e161c14
ci(cosqa): add benchmark workflow and search matrix runner
voarsh Mar 9, 2026
68 changes: 68 additions & 0 deletions .github/workflows/claude.yaml
@@ -0,0 +1,68 @@
name: Claude Code

on:
  issue_comment:
    types: [created]
  pull_request_review_comment:
    types: [created]
  issues:
    types: [opened]
  pull_request_review:
    types: [submitted]
  pull_request_target:
    types: [opened, synchronize]
Comment on lines +12 to +13

⚠️ Potential issue | 🔴 Critical

Critical: pull_request_target with PR head checkout creates a "pwn request" vulnerability.

Using pull_request_target grants write permissions and access to secrets, but checking out github.event.pull_request.head.sha (the untrusted PR code) allows malicious PRs to execute arbitrary code with those elevated privileges. An attacker could modify workflow files or exfiltrate secrets.

Safer alternatives:

  1. Use the pull_request event instead (for PRs from forks it runs with a read-only token and no access to secrets)
  2. If pull_request_target is required, only checkout the base branch, not the PR head
  3. Add an explicit approval step before running on external PRs
🔒 Recommended fix: Use pull_request event or avoid checking out PR head
-  pull_request_target:
-    types: [opened, synchronize]
+  pull_request:
+    types: [opened, synchronize]

Or if you need secrets access, don't checkout PR head:

      - name: Checkout repository
        uses: actions/checkout@v4
-        with:
-          # This correctly checks out the PR's head commit for pull_request_target events.
-          ref: ${{ github.event.pull_request.head.sha }}

Also applies to: 38-42

🤖 Prompt for AI Agents
In @.github/workflows/claude.yaml around lines 12 - 13, The workflow uses the
pull_request_target event combined with checking out the PR head
(github.event.pull_request.head.sha), which lets untrusted PR code run with
elevated permissions; change the workflow event to pull_request or, if
pull_request_target is required, update the checkout step to use the base branch
(github.event.pull_request.base.sha) instead of the PR head, or add an explicit
human-approval job that gates any steps requiring secrets; search for the token
usage and the checkout action (references: pull_request_target,
github.event.pull_request.head.sha, github.event.pull_request.base.sha) and
implement one of these safer alternatives.
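
The risky combination described above can be checked mechanically. The following is a minimal sketch, not part of this PR: `is_pwn_request_risk` is a hypothetical helper that scans raw workflow text with regular expressions instead of parsing YAML, so it is only a coarse heuristic (a commented-out line would also match).

```python
import re

# Hypothetical lint helper (not part of the PR): flags the combination
# the review comment warns about.
TARGET_TRIGGER = re.compile(r"^\s*pull_request_target\s*:", re.MULTILINE)
HEAD_REF = re.compile(r"github\.event\.pull_request\.head\.(sha|ref)")


def is_pwn_request_risk(workflow_text: str) -> bool:
    """Return True when the workflow triggers on pull_request_target
    and also references the PR head commit (typically in a checkout step)."""
    has_trigger = bool(TARGET_TRIGGER.search(workflow_text))
    has_head_ref = bool(HEAD_REF.search(workflow_text))
    return has_trigger and has_head_ref


# Trimmed-down version of the workflow under review.
risky = """
on:
  pull_request_target:
    types: [opened, synchronize]
jobs:
  claude:
    steps:
      - uses: actions/checkout@v4
        with:
          ref: ${{ github.event.pull_request.head.sha }}
"""

# Same workflow with the trigger swapped, as in the recommended fix.
safer = risky.replace("pull_request_target:", "pull_request:")

print(is_pwn_request_risk(risky))
print(is_pwn_request_risk(safer))
```

A check like this could run in CI over `.github/workflows/`, but it does not replace reviewing which steps actually run untrusted code.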


jobs:
  claude:
    # This simplified condition is more robust and correctly checks permissions.
    if: >
      (contains(github.event.comment.body, '@claude') ||
      contains(github.event.review.body, '@claude') ||
      contains(github.event.issue.body, '@claude') ||
      contains(github.event.pull_request.body, '@claude')) &&
      (github.event.sender.type == 'User' && (
        github.event.comment.author_association == 'OWNER' ||
        github.event.comment.author_association == 'MEMBER' ||
        github.event.comment.author_association == 'COLLABORATOR'
      ))
Comment on lines +18 to +27

⚠️ Potential issue | 🟠 Major

Condition logic is incomplete for pull_request_target events.

The author_association check only references github.event.comment.author_association, which is undefined for pull_request_target opened/synchronize events (there's no comment). This causes the job to never run for those triggers even when @claude is in the PR body.

If you intend to support PR-triggered runs, you need to check associations conditionally:

🛠️ Proposed fix to handle different event types
     if: >
       (contains(github.event.comment.body, '@claude') ||
       contains(github.event.review.body, '@claude') ||
       contains(github.event.issue.body, '@claude') ||
       contains(github.event.pull_request.body, '@claude')) &&
-      (github.event.sender.type == 'User' && (
-        github.event.comment.author_association == 'OWNER' ||
-        github.event.comment.author_association == 'MEMBER' ||
-        github.event.comment.author_association == 'COLLABORATOR'
+      github.event.sender.type == 'User' && (
+        github.event.comment.author_association == 'OWNER' ||
+        github.event.comment.author_association == 'MEMBER' ||
+        github.event.comment.author_association == 'COLLABORATOR' ||
+        github.event.review.author_association == 'OWNER' ||
+        github.event.review.author_association == 'MEMBER' ||
+        github.event.review.author_association == 'COLLABORATOR' ||
+        github.event.issue.author_association == 'OWNER' ||
+        github.event.issue.author_association == 'MEMBER' ||
+        github.event.issue.author_association == 'COLLABORATOR' ||
+        github.event.pull_request.author_association == 'OWNER' ||
+        github.event.pull_request.author_association == 'MEMBER' ||
+        github.event.pull_request.author_association == 'COLLABORATOR'
-      ))
+      )
🤖 Prompt for AI Agents
In @.github/workflows/claude.yaml around lines 18 - 27, The conditional in the
workflow only checks github.event.comment.author_association, so jobs triggered
by pull_request_target events (where the PR body contains "@claude") never pass
because there is no comment object; update the if expression to validate author
association for the correct event payloads by adding checks for
github.event.pull_request.author_association and
github.event.review.author_association (in addition to
github.event.comment.author_association and
github.event.issue.author_association) or conditionally select the association
based on event type (e.g., use github.event.pull_request.author_association when
github.event_name == 'pull_request_target' and
github.event.comment.author_association for comment events; GitHub Actions
expressions have no ternary operator, so express this with && and ||) while
preserving the existing sender.type == 'User' and allowed associations (OWNER,
MEMBER, COLLABORATOR) so PR-open/synchronize events will run when `@claude` is in
the PR body.
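
The "select the association based on event type" option can be modelled outside the workflow to see how it behaves. This is a minimal sketch under stated assumptions: `is_trusted_actor` and `ASSOCIATION_SOURCE` are hypothetical names, and the event dictionaries are hand-written stand-ins for webhook payloads, not captured data.

```python
ALLOWED = {"OWNER", "MEMBER", "COLLABORATOR"}

# Which payload object carries the relevant author_association for each
# event name used by the workflow's triggers.
ASSOCIATION_SOURCE = {
    "issue_comment": "comment",
    "pull_request_review_comment": "comment",
    "pull_request_review": "review",
    "issues": "issue",
    "pull_request_target": "pull_request",
}


def is_trusted_actor(event_name: str, event: dict) -> bool:
    """Check sender type and the association that belongs to this event."""
    source = ASSOCIATION_SOURCE.get(event_name)
    if source is None:
        return False
    if event.get("sender", {}).get("type") != "User":
        return False
    association = event.get(source, {}).get("author_association")
    return association in ALLOWED


# PR opened by a member: there is no comment object, only pull_request.
pr_event = {
    "sender": {"type": "User"},
    "pull_request": {"author_association": "MEMBER"},
}

# Outside commenter on an issue that an owner opened.
comment_event = {
    "sender": {"type": "User"},
    "comment": {"author_association": "NONE"},
    "issue": {"author_association": "OWNER"},
}

print(is_trusted_actor("pull_request_target", pr_event))
print(is_trusted_actor("issue_comment", comment_event))
```

The second case is why per-event selection may be preferable to OR-ing every association field together: an `issue_comment` payload also carries the issue author's association, so a flat OR would accept an outside commenter on an issue opened by an owner.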

    runs-on: ubuntu-latest
    permissions:
      # CRITICAL: Write permissions are required for the action to push branches and update issues/PRs.
      contents: write
      pull-requests: write
      issues: write
      id-token: write # Required for OIDC token exchange
      actions: read # Required for Claude to read CI results on PRs

    steps:
      - name: Checkout repository
        uses: actions/checkout@v4
        with:
          # This correctly checks out the PR's head commit for pull_request_target events.
          ref: ${{ github.event.pull_request.head.sha }}

      - name: Create Claude settings file
        run: |
          mkdir -p /home/runner/.claude
          cat > /home/runner/.claude/settings.json << 'EOF'
          {
            "env": {
              "ANTHROPIC_BASE_URL": "https://api.z.ai/api/anthropic",
              "ANTHROPIC_AUTH_TOKEN": "${{ secrets.CUSTOM_ENDPOINT_API_KEY }}"
            }
          }
          EOF
Comment on lines +44 to +54

⚠️ Potential issue | 🟡 Minor

Redundant settings configuration in two places.

The Claude settings are defined twice: once as a file at /home/runner/.claude/settings.json (lines 46-54) and again as the settings: input to the action (line 64). This duplication creates a maintenance burden—if settings need updating, both locations must change, and discrepancies could cause confusion about which takes precedence.

Consider removing one of these configurations based on what the action actually requires.

🛠️ Option A: Remove the file creation step if the action's `settings` input is sufficient
-      - name: Create Claude settings file
-        run: |
-          mkdir -p /home/runner/.claude
-          cat > /home/runner/.claude/settings.json << 'EOF'
-          {
-            "env": {
-              "ANTHROPIC_BASE_URL": "https://api.z.ai/api/anthropic",
-              "ANTHROPIC_AUTH_TOKEN": "${{ secrets.CUSTOM_ENDPOINT_API_KEY }}"
-            }
-          }
-          EOF
-
       - name: Run Claude Code
🛠️ Option B: Remove the inline settings if the file-based approach is preferred
       - name: Run Claude Code
         id: claude
         uses: anthropics/claude-code-action@v1
         with:
           # Still need this to satisfy the action's validation
           anthropic_api_key: ${{ secrets.CUSTOM_ENDPOINT_API_KEY }}
           
-          # Use the same variable names as your local setup
-          settings: '{"env": {"ANTHROPIC_BASE_URL": "https://api.z.ai/api/anthropic", "ANTHROPIC_AUTH_TOKEN": "${{ secrets.CUSTOM_ENDPOINT_API_KEY }}"}}'
-          
           track_progress: true

Also applies to: 64-64

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/claude.yaml around lines 44 - 54, The workflow currently
duplicates Claude configuration by both creating
/home/runner/.claude/settings.json (the block starting with "mkdir -p
/home/runner/.claude" and the heredoc written to
/home/runner/.claude/settings.json) and providing the same values via the
action's settings: input; remove one source to avoid drift—either delete the
file-creation block and rely on the action's settings: input, or remove the
settings: input and keep the file-based creation—ensure the remaining method
contains the ANTHROPIC_BASE_URL/ANTHROPIC_AUTH_TOKEN values and update any
variable names (e.g., ${{ secrets.CUSTOM_ENDPOINT_API_KEY }}) consistently.

      - name: Run Claude Code
        id: claude
        uses: anthropics/claude-code-action@v1
        with:
          # Still need this to satisfy the action's validation
          anthropic_api_key: ${{ secrets.CUSTOM_ENDPOINT_API_KEY }}

          # Use the same variable names as your local setup
          settings: '{"env": {"ANTHROPIC_BASE_URL": "https://api.z.ai/api/anthropic", "ANTHROPIC_AUTH_TOKEN": "${{ secrets.CUSTOM_ENDPOINT_API_KEY }}"}}'

          track_progress: true
          claude_args: |
            --allowedTools "Bash,Edit,Read,Write,Glob,Grep"
147 changes: 147 additions & 0 deletions .github/workflows/cosqa-benchmark.yml
@@ -0,0 +1,147 @@
name: CoSQA Search Benchmark

on:
  workflow_dispatch:
    inputs:
      enforce_hybrid_gate:
        description: Fail run if best hybrid underperforms best dense past threshold
        required: false
        default: false
        type: boolean
      hybrid_min_delta:
        description: Minimum accepted (hybrid_mrr - dense_mrr), e.g. -0.02
        required: false
        default: "-0.02"
        type: string
      upload_full_artifacts:
        description: Upload full logs/json bundle (higher storage usage)
        required: false
        default: false
        type: boolean

  pull_request:
    branches: [ test ]
    paths:
      - scripts/hybrid/**
      - scripts/hybrid_search.py
      - scripts/mcp_impl/search.py
      - scripts/mcp_impl/context_search.py
      - scripts/mcp_indexer_server.py
      - scripts/benchmarks/cosqa/**
      - .github/workflows/cosqa-benchmark.yml

  schedule:
    - cron: "25 3 * * *"

jobs:
  cosqa-bench:
    runs-on: ubuntu-latest
    timeout-minutes: 360

    services:
      qdrant:
        image: qdrant/qdrant:v1.15.1
        ports:
          - 6333:6333

    steps:
      - name: Checkout
        uses: actions/checkout@v4

      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: "3.11"

      - name: Cache pip
        uses: actions/cache@v4
        with:
          path: ~/.cache/pip
          key: ${{ runner.os }}-pip-${{ hashFiles('**/requirements*.txt', '**/pyproject.toml') }}
          restore-keys: |
            ${{ runner.os }}-pip-

      - name: Cache HuggingFace datasets
        uses: actions/cache@v4
        with:
          path: |
            ~/.cache/huggingface/datasets
            ~/.cache/huggingface/hub
          key: ${{ runner.os }}-hf-cosqa-${{ hashFiles('scripts/benchmarks/cosqa/dataset.py') }}
          restore-keys: |
            ${{ runner.os }}-hf-cosqa-
            ${{ runner.os }}-hf-

      - name: Install dependencies
        run: |
          python -m pip install --upgrade pip
          pip install -r requirements.txt
          pip install "datasets>=2.18.0"

      - name: Wait for Qdrant
        run: |
          timeout 90 bash -c 'until curl -fsS http://localhost:6333/readyz; do sleep 2; done'
          curl -fsS http://localhost:6333/collections >/dev/null

      - name: Resolve run config
        id: cfg
        run: |
          echo "profile=full" >> "$GITHUB_OUTPUT"
          echo "run_set=full" >> "$GITHUB_OUTPUT"
          if [ "${{ github.event_name }}" = "workflow_dispatch" ] && [ "${{ inputs.enforce_hybrid_gate }}" = "true" ]; then
            echo "enforce_hybrid_gate=1" >> "$GITHUB_OUTPUT"
          else
            echo "enforce_hybrid_gate=0" >> "$GITHUB_OUTPUT"
          fi
          if [ "${{ github.event_name }}" = "workflow_dispatch" ] && [ "${{ inputs.hybrid_min_delta }}" != "" ]; then
            echo "hybrid_min_delta=${{ inputs.hybrid_min_delta }}" >> "$GITHUB_OUTPUT"
          else
            echo "hybrid_min_delta=-0.02" >> "$GITHUB_OUTPUT"
          fi

      - name: Run CoSQA search matrix
        id: bench
        env:
          QDRANT_URL: http://localhost:6333
          PROFILE: ${{ steps.cfg.outputs.profile }}
          RUN_SET: ${{ steps.cfg.outputs.run_set }}
          ENFORCE_HYBRID_GATE: ${{ steps.cfg.outputs.enforce_hybrid_gate }}
          HYBRID_MIN_DELTA: ${{ steps.cfg.outputs.hybrid_min_delta }}
          PYTHONUNBUFFERED: "1"
        run: |
          RUN_TAG="gha-${{ github.run_id }}-${{ github.run_attempt }}"
          OUT_DIR="bench_results/cosqa/${RUN_TAG}"
          echo "out_dir=${OUT_DIR}" >> "$GITHUB_OUTPUT"
          RUN_TAG="${RUN_TAG}" OUT_DIR="${OUT_DIR}" ./scripts/benchmarks/cosqa/run_search_matrix.sh

      - name: Publish benchmark summary
        if: always()
        run: |
          SUMMARY="${{ steps.bench.outputs.out_dir }}/summary.md"
          if [ -f "${SUMMARY}" ]; then
            cat "${SUMMARY}" >> "$GITHUB_STEP_SUMMARY"
          else
            echo "No summary file generated" >> "$GITHUB_STEP_SUMMARY"
          fi

      - name: Upload benchmark artifacts
        if: always() && github.event_name == 'pull_request'
        uses: actions/upload-artifact@v4
        with:
          name: cosqa-search-summary-${{ github.run_id }}-${{ github.run_attempt }}
          path: |
            ${{ steps.bench.outputs.out_dir }}/summary.md
            ${{ steps.bench.outputs.out_dir }}/summary.json
          retention-days: 3

      - name: Upload full benchmark artifacts
        if: |
          always() && (
            github.event_name == 'schedule' ||
            (github.event_name == 'workflow_dispatch' && inputs.upload_full_artifacts == true)
          )
        uses: actions/upload-artifact@v4
        with:
          name: cosqa-search-bench-${{ github.run_id }}-${{ github.run_attempt }}
          path: ${{ steps.bench.outputs.out_dir }}
          retention-days: 7
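
For readers following the "Resolve run config" step, its defaulting rules and the gate described by the `hybrid_min_delta` input can be restated in a few lines. This is a sketch under assumptions: `resolve_run_config` and `hybrid_gate_passes` are hypothetical names, and the real gate lives in `run_search_matrix.sh`, which this diff does not show, so the comparison below is inferred only from the input description "Minimum accepted (hybrid_mrr - dense_mrr)".

```python
def resolve_run_config(event_name: str, inputs: dict) -> dict:
    """Mirror the shell step: dispatch inputs only apply to manual runs;
    every other trigger falls back to the defaults."""
    manual = event_name == "workflow_dispatch"
    enforce = manual and str(inputs.get("enforce_hybrid_gate", "")).lower() == "true"
    raw_delta = str(inputs.get("hybrid_min_delta", "")) if manual else ""
    return {
        "profile": "full",
        "run_set": "full",
        "enforce_hybrid_gate": 1 if enforce else 0,
        # The workflow passes this value along as a string; it is parsed
        # to a float here only so the gate can be evaluated.
        "hybrid_min_delta": float(raw_delta) if raw_delta != "" else -0.02,
    }


def hybrid_gate_passes(hybrid_mrr: float, dense_mrr: float, min_delta: float) -> bool:
    """Assumed gate: pass when (hybrid_mrr - dense_mrr) >= min_delta."""
    return (hybrid_mrr - dense_mrr) >= min_delta


scheduled = resolve_run_config("schedule", {"enforce_hybrid_gate": "true"})
manual = resolve_run_config(
    "workflow_dispatch",
    {"enforce_hybrid_gate": "true", "hybrid_min_delta": "-0.05"},
)

print(scheduled)
print(manual)
# Illustrative MRR values, not benchmark results.
print(hybrid_gate_passes(0.40, 0.41, manual["hybrid_min_delta"]))
print(hybrid_gate_passes(0.30, 0.41, manual["hybrid_min_delta"]))
```

Under this reading, scheduled and pull-request runs never enforce the gate, because `enforce_hybrid_gate` resolves to 0 for anything other than a manual dispatch.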