Chore: Introduce Claude subagents to the repo #5093
Conversation
Force-pushed from 0e42d92 to 4c42512
tools: Glob, Grep, LS, Read, NotebookRead, WebFetch, TodoWrite, WebSearch, Bash
model: sonnet
This comment applies to all agents: should we not include tools/model and instead let the main thread determine what is allowed? I don't think we're ready to be opinionated about a detail like that for users.
This header is part of the Claude agent definition format. I believe this is metadata and is not part of the prompt, just like `color`.
This is in the docs:

You have two options for configuring tools:
- Omit the `tools` field to inherit all tools from the main thread (default), including MCP tools
- Specify individual tools as a comma-separated list for more granular control (can be edited manually or via `/agents`)
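To make the two options concrete, a minimal subagent definition with an explicit tool list might look like the sketch below. It assumes the standard Claude Code format (a Markdown file with YAML frontmatter); the name, description, and prompt body are illustrative, not taken from this PR.

```markdown
---
name: code-reviewer
description: Read-only reviewer of proposed changes.
# Listing tools restricts the agent to this set.
# Omitting the `tools` field entirely would instead inherit
# all tools (including MCP tools) from the main thread.
tools: Glob, Grep, LS, Read
model: sonnet
---
You are a careful code reviewer. Inspect the changes and report
issues; do not modify any files.
```

With the `tools` line deleted, the same agent would fall back to whatever the main thread allows.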
I read that as the tools provided here override what I have already approved in my main thread.
Ah, I see. The reason tools are listed here is that some of the agents are read-only (e.g. the code-reviewer one), and this is what is captured here.
Force-pushed from 4c42512 to 61f8a4e
@izeigerman For the public that will use these sub-agents, how did you verify they behave as expected?

@sungchun12 If you can suggest some concrete techniques for evaluating this, I'm happy to try them. Ideally, the process would involve having a pool of existing issues which we feed into Claude Code before and after prompt changes. I haven't performed this kind of evaluation. I did feed it a few open issues and saw it perform well, with only small comments on my end. I tried the same issues before my changes, and it performed (subjectively) worse. That said, neither the size of the issue pool nor my evaluation criteria were sufficient, imo. We should consider developing a series of synthetic tasks against this repo which we can use to evaluate changes to these prompts. We'd also need to develop evaluation criteria. Both tasks are non-trivial. In the meantime, I suggest people try it out and extend the prompts as they see the agents struggling or running into issues.

The heuristic is enough for me. We'll let people try it out in the wild and iterate from there!
Force-pushed from d76def8 to 97ffa96
Introduces the following subagents to this repo:
- `developer` - to implement features and fix issues
- `code-reviewer` - to review the code produced by the `developer` agent
- `technical-writer` - to update the docs
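Assuming the standard Claude Code layout for project-level subagents, the three definitions above would land in the repo roughly like this (file names are inferred from the agent names, not quoted from the PR diff):

```
.claude/agents/
├── developer.md
├── code-reviewer.md
└── technical-writer.md
```

Each file is a Markdown prompt with YAML frontmatter (`name`, `description`, and optionally `tools` and `model`), which Claude Code picks up automatically for the project.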