search_code

Codebase-wide semantic search that understands what code does, not just what it's named.

search_code is the core tool. A single call returns complete source, file paths, line numbers, the call graph (callers and callees), and neighbouring functions. It replaces grep, find, glob, and manual file reading for understanding code.

Inputs

Name	Type	Required	Default	Description
`query`	string	✓	—	Natural-language description of the behaviour you're looking for. Describe what the code does, not its identifier.
`repo`	string		auto	Repository in `owner/repo` format. Omit to auto-select when only one repo is indexed, or to use the server's pinned default.
`branch`	string		default	Git branch to search. Only needed if you indexed multiple branches of the same repo.
`cwd`	string		—	Absolute path to the user's working directory. When provided, the server runs `git branch` there to auto-detect which branch to search.
`top_k`	integer		`5`	Number of results to return. Use 3 for a targeted lookup, 10–15 for broad exploration. Clamped to 50.
`depth`	integer		`1`	How far to expand context around the top result. `0` = just matches, `1` = direct callers/callees, `2` = two levels. Clamped to 3.
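
The defaults and clamps in the table can be mirrored on the client side. The following is a minimal sketch, not part of the tool itself: `build_search_args` is a hypothetical helper, and the lower bounds it applies (1 for `top_k`, 0 for `depth`) are assumptions, since the table only documents the upper clamps.

```python
from typing import Any, Optional

MAX_TOP_K = 50  # documented upper clamp for top_k
MAX_DEPTH = 3   # documented upper clamp for depth


def build_search_args(
    query: str,
    repo: Optional[str] = None,
    branch: Optional[str] = None,
    cwd: Optional[str] = None,
    top_k: int = 5,
    depth: int = 1,
) -> dict[str, Any]:
    """Hypothetical helper: assemble the arguments object for search_code."""
    if not query.strip():
        raise ValueError("query is required")
    args: dict[str, Any] = {
        "query": query,
        # Lower bounds are assumed; only the upper clamps are documented.
        "top_k": max(1, min(top_k, MAX_TOP_K)),
        "depth": max(0, min(depth, MAX_DEPTH)),
    }
    # Omit unset optional fields so the server can apply its own defaults.
    for key, value in (("repo", repo), ("branch", branch), ("cwd", cwd)):
        if value is not None:
            args[key] = value
    return args
```

Leaving `repo`, `branch`, and `cwd` out of the payload when they are unset keeps the auto-selection behaviour described in the table intact.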

Writing good queries

Describe behaviour, not names:

Good	Bad
`function that validates email format before signup`	`validateEmail`
`middleware that checks authentication on API routes`	`authMiddleware`
`error handling in payment processing`	`try catch payment`

What you get back

Results are returned as a tiered summary, so the response stays small:

Rank #1 — location, line count, signature, up to 4 docstring lines, and the full call graph (CALLS →, CALLED BY ←, SAME FILE).
Ranks #2–5 — location, signature, and a short docstring snippet. No call graph.
Ranks #6+ — the most compact form: location + signature only.
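
The tiering above can be read as a simple mapping from rank to the fields a result carries. The sketch below only illustrates that mapping; the field names are hypothetical and do not reflect the actual response format.

```python
def fields_for_rank(rank: int) -> list[str]:
    """Return the (hypothetically named) fields included at a given rank."""
    if rank < 1:
        raise ValueError("rank starts at 1")
    if rank == 1:
        # Top result: full detail, including the call graph.
        return ["location", "line_count", "signature", "docstring", "call_graph"]
    if rank <= 5:
        # Ranks 2 to 5: short docstring snippet, no call graph.
        return ["location", "signature", "docstring_snippet"]
    # Rank 6 and beyond: most compact form.
    return ["location", "signature"]
```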

Each result carries a normalised relevance label (Strong, Good, Moderate, Weak) so the agent can stop early. Full source for any truncated result is one expand_result call away.
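
One way an agent might use the relevance labels to stop early is to read results only while they stay at or above a chosen label. This is a sketch under assumptions: results are taken to be dicts with a `label` key, ordered best first, and the default threshold of Good is an arbitrary choice for illustration.

```python
# The four documented labels, ordered from weakest to strongest.
LABEL_ORDER = ["Weak", "Moderate", "Good", "Strong"]


def at_least(label: str, threshold: str) -> bool:
    """True if `label` is as strong as `threshold` or stronger."""
    return LABEL_ORDER.index(label) >= LABEL_ORDER.index(threshold)


def results_worth_reading(results: list[dict], threshold: str = "Good") -> list[dict]:
    """Keep leading results until the first one that falls below the threshold."""
    kept = []
    for result in results:  # assumed to be ranked best first
        if not at_least(result["label"], threshold):
            break
        kept.append(result)
    return kept
```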

Behind the scenes, search_code uses hybrid retrieval. If your query contains identifier-shaped tokens (PascalCase, camelCase, snake_case, dotted paths, file fragments), it blends semantic similarity with direct name and path matches so that an exact name floats to the top. Plain natural-language queries fall back to pure semantic search. See "How Clean reduces cost".
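
To make "identifier-shaped" concrete, here is a rough sketch of how such tokens could be detected with regular expressions. The patterns and the function names are assumptions made for illustration; the rules search_code actually applies are not documented here.

```python
import re

# Assumed patterns; the real detection rules may differ.
IDENTIFIER_PATTERNS = [
    re.compile(r"^[A-Z][a-z0-9]+(?:[A-Z][a-z0-9]*)+$"),  # PascalCase
    re.compile(r"^[a-z][a-z0-9]*(?:[A-Z][a-z0-9]*)+$"),   # camelCase
    re.compile(r"^[A-Za-z0-9]+(?:_[A-Za-z0-9]+)+$"),      # snake_case
    re.compile(r"^[A-Za-z_]\w*(?:\.[A-Za-z_]\w*)+$"),     # dotted path or file fragment
]


def identifier_tokens(query: str) -> list[str]:
    """Return the whitespace-separated tokens that look like code identifiers."""
    return [
        token
        for token in query.split()
        if any(pattern.match(token) for pattern in IDENTIFIER_PATTERNS)
    ]


def retrieval_mode(query: str) -> str:
    """'hybrid' when any identifier-shaped token is present, else 'semantic'."""
    return "hybrid" if identifier_tokens(query) else "semantic"
```

Under these assumed patterns, `where is validateEmail called` would be treated as a hybrid query, while `middleware that checks authentication on API routes` stays purely semantic.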

Example

"Find the function that handles login redirects"

The agent calls:

{
  "name": "search_code",
  "arguments": {
    "query": "function that handles login redirects after authentication",
    "top_k": 5,
    "depth": 1
  }
}

and receives a ranked, tiered summary with rank #1's callers and callees attached.
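
If the tool is served over the Model Context Protocol (an assumption; this page does not name the transport), the call object above would travel as the `params` of a JSON-RPC 2.0 `tools/call` request. A minimal sketch of building that envelope, with `tool_call_request` as a hypothetical helper:

```python
import json
from itertools import count
from typing import Any

_request_ids = count(1)  # simple incrementing JSON-RPC request ids


def tool_call_request(name: str, arguments: dict[str, Any]) -> dict[str, Any]:
    """Wrap a tool name and its arguments in a JSON-RPC 2.0 tools/call request."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",  # assumes an MCP-style server
        "params": {"name": name, "arguments": arguments},
    }


request = tool_call_request(
    "search_code",
    {
        "query": "function that handles login redirects after authentication",
        "top_k": 5,
        "depth": 1,
    },
)
payload = json.dumps(request)  # serialised form, ready to send to the server
```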

Notes

If the index is stale (the underlying repo changed), the local edition fires a non-blocking re-index and returns results against the current index immediately — you're never blocked waiting for re-embedding.
Search has a 30-second timeout; simplify the query if you hit it.
A footer reminds the agent that rank #1 carries the most detail and that it should search again only for a genuinely different concept.
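
The advice to simplify the query after a timeout can be sketched as a fallback loop. This assumes the client surfaces the 30-second limit as a Python `TimeoutError`, which this page does not specify; `search` stands in for whatever function actually issues the call.

```python
from typing import Any, Callable, Optional, Sequence


def search_with_fallback(
    search: Callable[[str], Any],
    queries: Sequence[str],
) -> Any:
    """Try each query in order, moving to the next (simpler) one on timeout."""
    last_error: Optional[BaseException] = None
    for query in queries:
        try:
            return search(query)
        except TimeoutError as exc:  # assumed way the timeout is reported
            last_error = exc
    raise TimeoutError("all queries timed out") from last_error
```

Pass the queries from most to least specific, for example the full behavioural description first and a shorter paraphrase second.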