
Tool Use for Coding

A coding agent's power comes from its tools, not the LLM. The LLM is the reasoner - it decides what to do. The tools are the actuators - they actually do it. Without tools, the LLM can only produce text. With the right tools, it can read and modify real files, execute real code, and interact with real systems.

This means the tools set the capability ceiling. An agent with only read_file and write_file cannot run tests. An agent with bash can do almost anything. An agent with LSP integration can navigate code with the precision of an IDE.

Tool design is where many coding agent implementations fail. The tools are too narrow (the agent cannot do what it needs to), too powerful without guardrails (the agent can accidentally delete production data), or too uninformative when they fail (unhelpful error messages leave the agent confused and looping).

This lesson covers the complete coding agent tool set with production-quality implementations.


The Complete Tool Taxonomy

Safe tools (read operations) have no side effects and can be called freely.

Careful tools (write operations) modify state, but the changes are recoverable as long as the workspace is under version control such as git.

Risky tools (execution) run arbitrary code and must be sandboxed or confirmed.
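One way to make these three tiers operational is to tag each tool with a risk level and gate calls on it. The sketch below is illustrative only: the `Risk` enum, the `TOOL_RISK` table, and `is_allowed` are hypothetical names chosen for this example, and the tier assigned to each tool is an assumption.

```python
from enum import IntEnum


class Risk(IntEnum):
    """Hypothetical risk tiers mirroring the taxonomy above."""
    SAFE = 0      # read operations, no side effects
    CAREFUL = 1   # write operations, recoverable with version control
    RISKY = 2     # execution, needs a sandbox or confirmation


# Assumed mapping from tool name to tier (illustrative, not exhaustive).
TOOL_RISK = {
    "read_file": Risk.SAFE,
    "list_directory": Risk.SAFE,
    "search_files": Risk.SAFE,
    "write_file": Risk.CAREFUL,
    "edit_file": Risk.CAREFUL,
    "bash": Risk.RISKY,
    "run_tests": Risk.RISKY,
}


def is_allowed(tool_name: str, max_risk: Risk) -> bool:
    """Allow a call only if the tool's tier is at or below max_risk.

    Unknown tools are treated as RISKY, so they are denied unless the
    caller explicitly permits risky tools.
    """
    return TOOL_RISK.get(tool_name, Risk.RISKY) <= max_risk


print(is_allowed("read_file", Risk.SAFE))   # True
print(is_allowed("bash", Risk.CAREFUL))     # False
```

Treating unknown tools as the riskiest tier is a deliberate default: a newly added tool stays blocked until someone classifies it.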



Complete Tool Implementations

Below is a complete, production-quality implementation of the full coding agent tool set.

"""
coding_agent_tools.py - Complete tool set for a coding agent.

Design principles:
1. Every tool returns a string (for LLM consumption)
2. Errors are informative - tell the agent WHAT went wrong and HOW to fix it
3. Read operations are always safe
4. Write operations validate before modifying
5. Execution is sandboxed and has timeouts
6. Git tools are read-mostly; commits require explicit confirmation
"""

import ast
import os
import re
import subprocess
import json
import fnmatch
from pathlib import Path
from typing import Any, Optional
import datetime

# ─────────────────────────────────────────────────────────────────────────────
# Configuration
# ─────────────────────────────────────────────────────────────────────────────

# Directories that should never be modified
PROTECTED_DIRS = {".git", "node_modules", "__pycache__", ".venv", "venv"}

# File extensions that are safe to read (not binaries)
TEXT_EXTENSIONS = {
".py", ".js", ".ts", ".tsx", ".jsx", ".go", ".rs", ".java", ".cpp", ".c",
".h", ".hpp", ".cs", ".rb", ".php", ".swift", ".kt", ".scala", ".sh",
".bash", ".zsh", ".fish", ".ps1", ".html", ".css", ".scss", ".less",
".json", ".yaml", ".yml", ".toml", ".ini", ".cfg", ".conf", ".env",
".md", ".rst", ".txt", ".xml", ".sql", ".graphql", ".proto",
".gitignore", ".dockerignore", ".editorconfig",
}

# Maximum file size to read (bytes)
MAX_READ_SIZE = 500_000 # 500KB

# Default timeouts
DEFAULT_BASH_TIMEOUT = 30
DEFAULT_TEST_TIMEOUT = 120


# ─────────────────────────────────────────────────────────────────────────────
# Path Validation
# ─────────────────────────────────────────────────────────────────────────────

def _validate_path(path: str, must_exist: bool = False) -> tuple[bool, str]:
    """
    Validate a path for safety.

    Returns (is_valid, error_message).
    """
    # Block path traversal. Check the raw path: resolve() collapses ".."
    # segments, so a check on the resolved path would never catch them.
    if ".." in Path(path).parts:
        return False, f"Path traversal not allowed: {path}"

    p = Path(path).resolve()

    # Block protected directories
    for part in p.parts:
        if part in PROTECTED_DIRS:
            return False, f"Protected directory in path: {part}"

    if must_exist and not p.exists():
        return False, f"Path does not exist: {path}"

    # Note: absolute paths outside the repo are not rejected here;
    # confining the agent to the repo is left to the sandbox.
    return True, ""


def _is_text_file(path: str) -> bool:
"""Check if a file is a text file (not binary)."""
p = Path(path)
if p.suffix.lower() in TEXT_EXTENSIONS:
return True
# For files without extension or unknown extensions, check content
try:
chunk = p.read_bytes()[:512]
# If chunk has null bytes, it's likely binary
return b"\x00" not in chunk
except Exception:
return False


# ─────────────────────────────────────────────────────────────────────────────
# Read Operations (Safe)
# ─────────────────────────────────────────────────────────────────────────────

def read_file(path: str, start_line: int = 1, end_line: Optional[int] = None) -> str:
    """
    Read the contents of a file with line numbers.

    Args:
        path: Path to the file to read
        start_line: First line to read (1-indexed, default 1)
        end_line: Last line to read (inclusive, default: all)

    Returns:
        File contents with line numbers, or an error message.
    """
    valid, error = _validate_path(path, must_exist=True)
    if not valid:
        return f"ERROR: {error}"

    p = Path(path)

    if not p.is_file():
        return f"ERROR: Not a file: {path}\nUse list_directory to see what's at this path."

    # Check file size. The limit applies only to whole-file reads; otherwise
    # the advice in the error (pass start_line/end_line) could never succeed.
    size = p.stat().st_size
    if size > MAX_READ_SIZE and start_line <= 1 and not end_line:
        return (
            f"ERROR: File too large to read at once ({size:,} bytes > {MAX_READ_SIZE:,} byte limit).\n"
            f"Use start_line and end_line parameters to read specific sections.\n"
            f"File has approximately {size // 80} lines."
        )

    if not _is_text_file(path):
        return f"ERROR: Binary file detected: {path}\nCannot read binary files as text."

    try:
        content = p.read_text(encoding="utf-8", errors="replace")
        lines = content.splitlines()
        total_lines = len(lines)

        # Apply line range
        start = max(1, start_line) - 1  # convert to 0-indexed
        end = min(total_lines, end_line) if end_line else total_lines

        selected = lines[start:end]
        numbered = "\n".join(
            f"{i + start + 1:5d} | {line}"
            for i, line in enumerate(selected)
        )

        header = f"File: {path}\nLines: {total_lines} total"
        if start_line > 1 or end_line:
            header += f" (showing {start + 1}-{end})"

        return f"{header}\n\n{numbered}"

    except UnicodeDecodeError:
        return f"ERROR: Cannot decode {path} as UTF-8. It may be a binary file."
    except PermissionError:
        return f"ERROR: Permission denied: {path}"
    except Exception as e:
        return f"ERROR reading {path}: {e}"


def list_directory(path: str, max_depth: int = 3, show_hidden: bool = False) -> str:
"""
List files and directories in a tree format.

Args:
path: Directory path to list
max_depth: Maximum recursion depth (default 3)
show_hidden: Whether to show hidden files/dirs (default False)

Returns:
Tree representation of the directory, or an error message.
"""
valid, error = _validate_path(path, must_exist=True)
if not valid:
return f"ERROR: {error}"

p = Path(path)
if not p.is_dir():
return f"ERROR: Not a directory: {path}\nUse read_file to read file contents."

IGNORE = PROTECTED_DIRS | {".DS_Store", "Thumbs.db", "*.pyc", "*.pyo"}
lines = [str(p) + "/"]

def _format_size(size: int) -> str:
for unit in ["B", "KB", "MB"]:
if size < 1024:
return f"{size}{unit}"
size //= 1024
return f"{size}GB"

def _walk(directory: Path, prefix: str, depth: int):
if depth > max_depth:
lines.append(f"{prefix}... (max depth reached)")
return

try:
items = sorted(
directory.iterdir(),
key=lambda x: (x.is_file(), x.name.lower()),
)
except PermissionError:
lines.append(f"{prefix}[permission denied]")
return

visible_items = []
for item in items:
if not show_hidden and item.name.startswith("."):
continue
if item.name in IGNORE:
continue
# Skip pyc and similar
skip = False
for pattern in ["*.pyc", "*.pyo", "__pycache__"]:
if fnmatch.fnmatch(item.name, pattern):
skip = True
break
if not skip:
visible_items.append(item)

for i, item in enumerate(visible_items):
is_last = i == len(visible_items) - 1
connector = "└── " if is_last else "├── "
extension = " " if is_last else "│ "

if item.is_dir():
lines.append(f"{prefix}{connector}{item.name}/")
_walk(item, prefix + extension, depth + 1)
else:
size_str = _format_size(item.stat().st_size)
lines.append(f"{prefix}{connector}{item.name} ({size_str})")

_walk(p, "", 1)
return "\n".join(lines)


def search_files(
    pattern: str,
    directory: str = ".",
    file_type: Optional[str] = None,
    case_sensitive: bool = True,
    max_results: int = 50,
    context_lines: int = 2,
) -> str:
    """
    Search for a pattern across files using ripgrep (with grep fallback).

    Args:
        pattern: Regex or literal string to search for
        directory: Root directory to search in (default: current directory)
        file_type: File extension filter, e.g. 'py', 'ts', 'go' (without dot)
        case_sensitive: Whether to match case (default True)
        max_results: Approximate maximum number of matches to return (default 50)
        context_lines: Lines of context around each match (default 2)

    Returns:
        Matching lines with file paths and line numbers.
    """
    valid, error = _validate_path(directory, must_exist=True)
    if not valid:
        return f"ERROR: {error}"

    # Build command - prefer ripgrep for speed
    try:
        # Check if rg is available
        subprocess.run(["rg", "--version"], capture_output=True, check=True)
        use_rg = True
    except (subprocess.CalledProcessError, FileNotFoundError):
        use_rg = False

    if use_rg:
        cmd = ["rg", "--line-number", f"-A{context_lines}", f"-B{context_lines}"]
        if not case_sensitive:
            cmd.append("--ignore-case")
        if file_type:
            # Filter by extension with a glob; rg's --type expects a named
            # type (e.g. 'rust'), which is not always the file extension.
            cmd.extend(["--glob", f"*.{file_type}"])
        cmd.extend(["--", pattern, directory])
    else:
        cmd = ["grep", "-rn"]
        if not case_sensitive:
            cmd.append("-i")
        if file_type:
            cmd.append(f"--include=*.{file_type}")
        # -e keeps patterns that start with "-" from being parsed as options
        cmd.extend(["-A", str(context_lines), "-B", str(context_lines), "-e", pattern, directory])

    try:
        result = subprocess.run(
            cmd,
            capture_output=True,
            text=True,
            timeout=30,
        )

        # Both rg and grep exit with 1 for "no matches" and 2 for real errors
        # (for example an invalid regex), so report those separately.
        if result.returncode not in (0, 1):
            return (
                f"ERROR: Search failed: {result.stderr.strip()}\n"
                "Check that the pattern is a valid regex, or escape special characters."
            )

        output = result.stdout

        if not output.strip():
            return f"No matches found for pattern: {repr(pattern)}\nDirectory: {directory}"

        # Rough cap: each match takes about (2 * context + 3) output lines
        lines = output.split("\n")
        limit = max_results * (context_lines * 2 + 3)
        if len(lines) > limit:
            output = "\n".join(lines[:limit]) + f"\n\n[Output truncated to roughly {max_results} matches]"

        return output

    except subprocess.TimeoutExpired:
        return "ERROR: Search timed out. Try a more specific pattern or smaller directory."
    except Exception as e:
        return f"ERROR searching: {e}"


def get_file_info(path: str) -> str:
"""
Get metadata about a file without reading its content.

Useful for large files where you need to know size/type before deciding to read.
"""
valid, error = _validate_path(path, must_exist=True)
if not valid:
return f"ERROR: {error}"

p = Path(path)
stat = p.stat()

info = {
"path": str(p),
"type": "directory" if p.is_dir() else "file",
"size_bytes": stat.st_size,
"size_human": f"{stat.st_size:,} bytes",
"extension": p.suffix,
"is_text": _is_text_file(str(p)) if p.is_file() else None,
"modified": datetime.datetime.fromtimestamp(stat.st_mtime).isoformat(),
}

if p.is_file():
try:
content = p.read_text(encoding="utf-8", errors="replace")
lines = content.splitlines()
info["line_count"] = len(lines)
info["char_count"] = len(content)
# Count top-level definitions
if p.suffix == ".py":
try:
tree = ast.parse(content)
classes = [n.name for n in ast.walk(tree) if isinstance(n, ast.ClassDef)]
functions = [
n.name for n in ast.walk(tree)
if isinstance(n, (ast.FunctionDef, ast.AsyncFunctionDef))
]
info["python_classes"] = classes[:10]
info["python_functions"] = functions[:20]
except SyntaxError:
info["python_parse_error"] = True
except UnicodeDecodeError:
info["encoding_error"] = True

lines = [f"{k}: {v}" for k, v in info.items() if v is not None]
return "\n".join(lines)


# ─────────────────────────────────────────────────────────────────────────────
# Write Operations (Careful)
# ─────────────────────────────────────────────────────────────────────────────

def write_file(path: str, content: str, create_dirs: bool = True) -> str:
"""
Write content to a file.

Creates the file if it doesn't exist. Overwrites if it does.
Prefer edit_file for modifying existing files.

Args:
path: File path to write to
content: Content to write
create_dirs: Create parent directories if they don't exist (default True)
"""
valid, error = _validate_path(path)
if not valid:
return f"ERROR: {error}"

p = Path(path)

if p.exists() and p.is_file():
# Warn about overwriting existing files
existing_lines = len(p.read_text(encoding="utf-8", errors="replace").splitlines())
new_lines = len(content.splitlines())
print(f"WARNING: Overwriting existing file {path} ({existing_lines} lines → {new_lines} lines)")

try:
if create_dirs:
p.parent.mkdir(parents=True, exist_ok=True)
p.write_text(content, encoding="utf-8")
line_count = len(content.splitlines())
return f"Successfully wrote {line_count} lines to {path}"
except PermissionError:
return f"ERROR: Permission denied writing to {path}"
except Exception as e:
return f"ERROR writing {path}: {e}"


def edit_file(path: str, old_str: str, new_str: str) -> str:
"""
Make a surgical edit to an existing file using search-and-replace.

old_str must match EXACTLY (including all whitespace and indentation).
Use read_file first to get the exact current content.

Args:
path: File to edit
old_str: Exact string to find and replace. Must appear exactly once.
new_str: Replacement string.
"""
valid, error = _validate_path(path, must_exist=True)
if not valid:
return f"ERROR: {error}"

p = Path(path)
if not p.is_file():
return f"ERROR: Not a file: {path}"

try:
content = p.read_text(encoding="utf-8", errors="replace")
except Exception as e:
return f"ERROR reading {path}: {e}"

# Exact match check
if old_str not in content:
# Generate helpful context for debugging
first_old_line = old_str.split("\n")[0].strip()
similar_lines = []
for i, line in enumerate(content.split("\n"), 1):
if first_old_line[:20] and first_old_line[:20].lower() in line.lower():
similar_lines.append(f" Line {i:4d}: {line[:100]}")
hint = ""
if similar_lines:
hint = "\n\nSimilar lines in file:\n" + "\n".join(similar_lines[:8])
return (
f"ERROR: old_str not found in {path}.\n"
"This usually means the file content differs from what you expect.\n"
"Solution: use read_file to see the current content, then copy the exact text."
f"{hint}"
)

# Uniqueness check
count = content.count(old_str)
if count > 1:
return (
f"ERROR: old_str appears {count} times in {path}.\n"
"The edit is ambiguous. Include more surrounding lines in old_str "
"to make the match unique."
)

# Apply the edit
new_content = content.replace(old_str, new_str, 1)
p.write_text(new_content, encoding="utf-8")

# Report what changed
old_lines = len(old_str.splitlines())
new_lines_count = len(new_str.splitlines())
return (
f"Edit applied to {path}.\n"
f"Replaced {old_lines} line(s) with {new_lines_count} line(s)."
)


def delete_file(path: str, confirm: bool = False) -> str:
"""
Delete a file. Requires confirm=True to prevent accidental deletion.

Args:
path: File to delete
confirm: Must be True to actually delete (safety requirement)
"""
if not confirm:
return (
"ERROR: delete_file requires confirm=True.\n"
"Call again with confirm=True to actually delete the file.\n"
f"File to delete: {path}"
)

valid, error = _validate_path(path, must_exist=True)
if not valid:
return f"ERROR: {error}"

p = Path(path)
if p.is_dir():
return f"ERROR: {path} is a directory. Use bash('rm -rf ...') if you really need this."

try:
p.unlink()
return f"Deleted: {path}"
except Exception as e:
return f"ERROR deleting {path}: {e}"


# ─────────────────────────────────────────────────────────────────────────────
# Execution Tools (Risky)
# ─────────────────────────────────────────────────────────────────────────────

# Commands that should never be allowed in automated contexts
DANGEROUS_PATTERNS = [
r"rm\s+-rf\s+/", # rm -rf /
r":\(\)\{.*\}", # fork bomb
r"dd\s+.*of=/dev/", # write to raw device
r"mkfs\.", # format filesystem
r">\s*/dev/sda", # overwrite disk
r"curl.*\|.*sh", # curl-pipe-sh (blocked)
r"wget.*\|.*sh", # wget-pipe-sh (blocked)
]

WARNING_PATTERNS = [
r"rm\s+-rf", # recursive delete (allow but warn)
r"git\s+push.*--force", # force push
r"git\s+reset\s+--hard", # hard reset
r"DROP\s+TABLE", # SQL table drop
r"DELETE\s+FROM", # SQL delete
]


def _check_command_safety(command: str) -> tuple[bool, bool, str]:
"""
Check a command for dangerous patterns.

Returns (is_blocked, has_warning, message).
"""
for pattern in DANGEROUS_PATTERNS:
if re.search(pattern, command, re.IGNORECASE):
return True, True, f"Command blocked: matches dangerous pattern '{pattern}'"

warnings = []
for pattern in WARNING_PATTERNS:
if re.search(pattern, command, re.IGNORECASE):
warnings.append(f"WARNING: Command matches risky pattern '{pattern}'")

return False, bool(warnings), "\n".join(warnings)


def bash(
command: str,
timeout: int = DEFAULT_BASH_TIMEOUT,
working_dir: Optional[str] = None,
) -> str:
"""
Run a shell command and return stdout + stderr.

The most powerful tool available. Use carefully.
- Read-only commands (cat, grep, find, ls, git status, git log) are safe
- Write commands (rm, mv, chmod) should be used carefully
- Never run commands that modify system state outside the repo

Args:
command: Shell command to execute
timeout: Timeout in seconds (default 30, max 120)
working_dir: Working directory for the command (default: current)
"""
# Safety check
blocked, has_warning, safety_msg = _check_command_safety(command)
if blocked:
return f"ERROR: {safety_msg}"
if has_warning:
print(safety_msg) # Log warning but allow

timeout = min(timeout, 120) # Cap at 2 minutes

try:
result = subprocess.run(
command,
shell=True,
capture_output=True,
text=True,
timeout=timeout,
cwd=working_dir,
env={**os.environ, "TERM": "dumb"}, # Disable terminal colors
)

output_parts = []
if result.stdout:
output_parts.append(result.stdout)
if result.stderr:
output_parts.append(f"STDERR:\n{result.stderr}")

output = "\n".join(output_parts) if output_parts else "(no output)"

# Truncate very long output
MAX_OUTPUT = 10_000
if len(output) > MAX_OUTPUT:
output = output[:5000] + f"\n\n[... {len(output) - 7000} chars truncated ...]\n\n" + output[-2000:]

# Append return code if non-zero
if result.returncode != 0:
output += f"\n\n[Exit code: {result.returncode}]"

return output

except subprocess.TimeoutExpired:
return f"ERROR: Command timed out after {timeout}s.\nConsider using a more specific command or increasing the timeout."
except Exception as e:
return f"ERROR: {e}"


def run_tests(
test_path: str = ".",
test_filter: Optional[str] = None,
verbose: bool = True,
timeout: int = DEFAULT_TEST_TIMEOUT,
framework: str = "auto",
) -> str:
"""
Run a test suite and return structured results.

Args:
test_path: Path to test file or directory (default: current directory)
test_filter: Test name filter (e.g., 'test_user', 'TestUserAPI::test_create')
verbose: Show individual test names (default True)
timeout: Timeout in seconds (default 120)
framework: Test framework: 'pytest', 'jest', 'cargo', 'go', or 'auto' (default)
"""
# Detect framework if auto
if framework == "auto":
if Path("package.json").exists():
framework = "jest"
elif Path("Cargo.toml").exists():
framework = "cargo"
elif Path("go.mod").exists():
framework = "go"
else:
framework = "pytest"

# Build the test command
if framework == "pytest":
cmd = ["python", "-m", "pytest", test_path]
if verbose:
cmd.append("-v")
if test_filter:
cmd.extend(["-k", test_filter])
cmd.extend(["--tb=short", "--no-header"])
elif framework == "jest":
cmd = ["npx", "jest", test_path, "--no-coverage"]
if test_filter:
cmd.extend(["--testNamePattern", test_filter])
if verbose:
cmd.append("--verbose")
elif framework == "cargo":
cmd = ["cargo", "test"]
if test_filter:
cmd.append(test_filter)
elif framework == "go":
cmd = ["go", "test", "./...", "-v"]
if test_filter:
cmd.extend(["-run", test_filter])
else:
return f"ERROR: Unknown test framework: {framework}"

try:
result = subprocess.run(
cmd,
capture_output=True,
text=True,
timeout=timeout,
)

output = result.stdout + ("\nSTDERR:\n" + result.stderr if result.stderr else "")

# Parse results for a summary
summary = _parse_test_results(output, framework)

# Prepend summary
return f"{summary}\n\n{'='*60}\nFull output:\n{'='*60}\n{output}"

except subprocess.TimeoutExpired:
return f"ERROR: Tests timed out after {timeout}s. Run a more targeted test with test_filter."
except Exception as e:
return f"ERROR running tests: {e}"


def _parse_test_results(output: str, framework: str) -> str:
"""Extract a summary line from test output."""
if framework == "pytest":
for line in reversed(output.split("\n")):
if "passed" in line or "failed" in line or "error" in line:
return f"RESULT: {line.strip()}"
elif framework == "jest":
for line in reversed(output.split("\n")):
if "Tests:" in line:
return f"RESULT: {line.strip()}"
return "RESULT: (see output below)"


# ─────────────────────────────────────────────────────────────────────────────
# Git Tools (Version Control)
# ─────────────────────────────────────────────────────────────────────────────

def git_status(repo_path: str = ".") -> str:
"""
Show the current git status (modified, staged, untracked files).
"""
result = subprocess.run(
["git", "status", "--short"],
cwd=repo_path,
capture_output=True,
text=True,
)
if result.returncode != 0:
return f"ERROR: Not a git repository or git not installed.\n{result.stderr}"
return result.stdout or "(working tree clean)"


def git_diff(
repo_path: str = ".",
filepath: Optional[str] = None,
staged: bool = False,
) -> str:
"""
Show a diff of current changes.

Args:
repo_path: Repository root
filepath: Specific file to diff (default: all changed files)
staged: Show staged changes (default: unstaged)
"""
cmd = ["git", "diff"]
if staged:
cmd.append("--staged")
if filepath:
cmd.extend(["--", filepath])

result = subprocess.run(
cmd,
cwd=repo_path,
capture_output=True,
text=True,
)
output = result.stdout
if not output.strip():
return "No changes detected."

# Truncate very large diffs
if len(output) > 20_000:
output = output[:10_000] + "\n\n[... diff truncated ...]\n\n" + output[-3_000:]

return output


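def git_commit(
    message: str,
    repo_path: str = ".",
    add_all: bool = False,
    confirm: bool = False,
) -> str:
    """
    Create a git commit. Requires confirm=True (design principle 6).

    Args:
        message: Commit message
        repo_path: Repository root
        add_all: Stage all changes before committing (git add -A)
        confirm: Must be True to actually commit (safety requirement)
    """
    if not confirm:
        return (
            "ERROR: git_commit requires confirm=True.\n"
            "Review the changes with git_diff, then call again with confirm=True.\n"
            f"Commit message: {message}"
        )

    if add_all:
        add_result = subprocess.run(
            ["git", "add", "-A"],
            cwd=repo_path,
            capture_output=True,
            text=True,
        )
        if add_result.returncode != 0:
            return f"ERROR staging files: {add_result.stderr}"

    result = subprocess.run(
        ["git", "commit", "-m", message],
        cwd=repo_path,
        capture_output=True,
        text=True,
    )
    if result.returncode == 0:
        return result.stdout
    # git prints "nothing to commit" on stdout, so fall back to it
    # when stderr is empty instead of returning a blank error.
    return f"ERROR: {result.stderr or result.stdout}"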


# ─────────────────────────────────────────────────────────────────────────────
# Language Server Tools (Safe)
# ─────────────────────────────────────────────────────────────────────────────

def get_python_symbols(filepath: str) -> str:
    """
    Get the symbols defined in a Python file (classes, their methods,
    and top-level functions).

    Uses AST - no language server required.
    """
    valid, error = _validate_path(filepath, must_exist=True)
    if not valid:
        return f"ERROR: {error}"

    try:
        source = Path(filepath).read_text(encoding="utf-8", errors="replace")
        tree = ast.parse(source)
    except SyntaxError as e:
        return f"ERROR: Syntax error in {filepath}: {e}"

    func_types = (ast.FunctionDef, ast.AsyncFunctionDef)

    def _signature(fn) -> str:
        """Format 'name(args) -> ret [line N]' for a function node."""
        args = [a.arg for a in fn.args.args if a.arg not in ("self", "cls")]
        ret = ast.unparse(fn.returns) if fn.returns else ""
        return (
            f"{fn.name}({', '.join(args[:4])})"
            + (f" -> {ret}" if ret else "")
            + f" [line {fn.lineno}]"
        )

    symbols = []

    # Iterate over top-level statements only. ast.walk(tree) visits every
    # nested node, so methods would also be reported as top-level functions.
    for node in tree.body:
        if isinstance(node, ast.ClassDef):
            symbols.append(f"class {node.name} [line {node.lineno}]")
            # Direct methods only, in source order
            methods = [
                f"  + {_signature(child)}"
                for child in node.body
                if isinstance(child, func_types)
            ]
            symbols.extend(methods[:20])  # limit per class
        elif isinstance(node, func_types):
            symbols.append(f"def {_signature(node)}")

    if not symbols:
        return f"No symbols found in {filepath}"

    return f"Symbols in {filepath}:\n\n" + "\n".join(symbols)


def find_definition(symbol_name: str, root: str) -> str:
"""
Find where a Python symbol (class, function) is defined across the codebase.
"""
results = []

for filepath in Path(root).rglob("*.py"):
if any(d in filepath.parts for d in PROTECTED_DIRS):
continue

try:
source = filepath.read_text(encoding="utf-8", errors="replace")
tree = ast.parse(source)
except (SyntaxError, UnicodeDecodeError):
continue

for node in ast.walk(tree):
if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
if node.name == symbol_name:
parent = None
for parent_node in ast.walk(tree):
if isinstance(parent_node, ast.ClassDef):
for child in ast.walk(parent_node):
if child is node and child is not parent_node:
parent = parent_node.name
location = str(filepath)
if parent:
results.append(f"{location}:{node.lineno} - {parent}.{symbol_name}")
else:
results.append(f"{location}:{node.lineno} - {symbol_name}")

if not results:
return f"No definition found for '{symbol_name}'. Try search_files('{symbol_name}') for broader search."

return f"Definitions of '{symbol_name}':\n" + "\n".join(results)

Tool Registration for the Anthropic API

Now that we have implemented all tools, here is how to register them for use with the Anthropic API:

"""
tool_registry.py - Register all tools for use with the Anthropic API.
"""

from coding_agent_tools import (
read_file, write_file, edit_file, delete_file,
list_directory, search_files, get_file_info,
bash, run_tests,
git_status, git_diff, git_commit,
get_python_symbols, find_definition,
)
from typing import Any

# Tool schemas - these are sent to Claude to describe available tools
TOOL_SCHEMAS = [
{
"name": "read_file",
"description": (
"Read a file's contents with line numbers. "
"Use start_line and end_line to read sections of large files. "
"Always use this before editing a file."
),
"input_schema": {
"type": "object",
"properties": {
"path": {"type": "string", "description": "File path to read"},
"start_line": {"type": "integer", "description": "First line (default 1)"},
"end_line": {"type": "integer", "description": "Last line (optional)"},
},
"required": ["path"],
},
},
{
"name": "edit_file",
"description": (
"Make a surgical edit by replacing an exact string. "
"old_str must appear EXACTLY ONCE in the file (include surrounding context to ensure uniqueness). "
"Always use read_file first to get the exact current content."
),
"input_schema": {
"type": "object",
"properties": {
"path": {"type": "string"},
"old_str": {"type": "string", "description": "Exact string to find (must appear exactly once)"},
"new_str": {"type": "string", "description": "Replacement string"},
},
"required": ["path", "old_str", "new_str"],
},
},
{
"name": "write_file",
"description": "Write or create a file. Use for new files only. For existing files, prefer edit_file.",
"input_schema": {
"type": "object",
"properties": {
"path": {"type": "string"},
"content": {"type": "string", "description": "Complete file content to write"},
},
"required": ["path", "content"],
},
},
{
"name": "bash",
"description": (
"Run a shell command. Use for: running tests, grep, find, git commands, installing packages, "
"checking syntax. Do NOT use for destructive operations."
),
"input_schema": {
"type": "object",
"properties": {
"command": {"type": "string", "description": "Shell command to run"},
"timeout": {"type": "integer", "description": "Timeout in seconds (max 120)"},
"working_dir": {"type": "string", "description": "Working directory"},
},
"required": ["command"],
},
},
{
"name": "search_files",
"description": "Search for a pattern across files using ripgrep. Returns matching lines with context.",
"input_schema": {
"type": "object",
"properties": {
"pattern": {"type": "string", "description": "Regex or literal string"},
"directory": {"type": "string", "description": "Root directory to search"},
"file_type": {"type": "string", "description": "File extension filter (e.g., 'py', 'ts')"},
"case_sensitive": {"type": "boolean", "default": True},
"context_lines": {"type": "integer", "default": 2},
},
"required": ["pattern"],
},
},
{
"name": "list_directory",
"description": "List directory contents as a tree. Use to understand codebase structure.",
"input_schema": {
"type": "object",
"properties": {
"path": {"type": "string"},
"max_depth": {"type": "integer", "default": 3},
},
"required": ["path"],
},
},
{
"name": "run_tests",
"description": "Run the test suite. Returns pass/fail summary and full output.",
"input_schema": {
"type": "object",
"properties": {
"test_path": {"type": "string", "description": "Test file or directory"},
"test_filter": {"type": "string", "description": "Test name filter (pytest -k)"},
"verbose": {"type": "boolean", "default": True},
"framework": {"type": "string", "enum": ["auto", "pytest", "jest", "cargo", "go"]},
},
"required": [],
},
},
{
"name": "git_status",
"description": "Show current git status - which files have been modified.",
"input_schema": {
"type": "object",
"properties": {
"repo_path": {"type": "string", "description": "Repository root"},
},
"required": [],
},
},
{
"name": "git_diff",
"description": "Show a diff of current changes.",
"input_schema": {
"type": "object",
"properties": {
"repo_path": {"type": "string"},
"filepath": {"type": "string", "description": "Specific file to diff"},
"staged": {"type": "boolean", "description": "Show staged changes", "default": False},
},
"required": [],
},
},
{
"name": "find_definition",
"description": "Find where a Python class or function is defined across the codebase.",
"input_schema": {
"type": "object",
"properties": {
"symbol_name": {"type": "string", "description": "Class or function name"},
"root": {"type": "string", "description": "Root directory to search"},
},
"required": ["symbol_name", "root"],
},
},
{
"name": "get_python_symbols",
"description": "Get all symbols (classes, functions, methods) defined in a Python file with line numbers.",
"input_schema": {
"type": "object",
"properties": {
"filepath": {"type": "string"},
},
"required": ["filepath"],
},
},
]


def execute_tool(name: str, inputs: dict[str, Any]) -> str:
"""Dispatch a tool call to the correct implementation."""
dispatch = {
"read_file": lambda i: read_file(**i),
"write_file": lambda i: write_file(**i),
"edit_file": lambda i: edit_file(**i),
"delete_file": lambda i: delete_file(**i),
"list_directory": lambda i: list_directory(**i),
"search_files": lambda i: search_files(**i),
"get_file_info": lambda i: get_file_info(**i),
"bash": lambda i: bash(**i),
"run_tests": lambda i: run_tests(**i),
"git_status": lambda i: git_status(**i),
"git_diff": lambda i: git_diff(**i),
"git_commit": lambda i: git_commit(**i),
"get_python_symbols": lambda i: get_python_symbols(**i),
"find_definition": lambda i: find_definition(**i),
}

fn = dispatch.get(name)
if fn is None:
return f"ERROR: Unknown tool: {name}"

try:
return fn(inputs)
except TypeError as e:
return f"ERROR: Tool '{name}' called with wrong parameters: {e}"
except Exception as e:
return f"ERROR: Tool '{name}' failed: {e}"
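To connect the dispatcher to the model, each `tool_use` block in the assistant's reply has to be answered with a `tool_result` block that carries the same id. The sketch below shows only that conversion step, on plain dictionaries shaped like Messages API content blocks. The function name `build_tool_results` is hypothetical, the `execute` parameter stands in for `execute_tool`, and treating any output that starts with "ERROR" as an error follows this lesson's convention rather than anything the API requires.

```python
from typing import Any, Callable


def build_tool_results(
    content_blocks: list[dict[str, Any]],
    execute: Callable[[str, dict[str, Any]], str],
) -> list[dict[str, Any]]:
    """Run every tool_use block and return matching tool_result blocks."""
    results = []
    for block in content_blocks:
        if block.get("type") != "tool_use":
            continue  # skip text blocks and anything else
        output = execute(block["name"], block["input"])
        results.append({
            "type": "tool_result",
            "tool_use_id": block["id"],   # must echo the id of the request
            "content": output,
            # Assumption: tools in this lesson prefix failures with "ERROR"
            "is_error": output.startswith("ERROR"),
        })
    return results


# Example with a stand-in executor instead of the real tools.
def fake_execute(name: str, inputs: dict[str, Any]) -> str:
    return f"ran {name}" if name == "git_status" else f"ERROR: Unknown tool: {name}"


blocks = [
    {"type": "text", "text": "Checking the repo."},
    {"type": "tool_use", "id": "toolu_01", "name": "git_status", "input": {}},
]
print(build_tool_results(blocks, fake_execute))
```

The returned list would be sent back as the `content` of the next `user` message. With the official SDK the blocks arrive as objects rather than dictionaries, so attribute access would replace the key lookups.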

:::tip Tool output quality determines agent success

The quality of tool error messages directly determines how well the agent recovers from failures. "ERROR: old_str not found" is unhelpful. "ERROR: old_str not found. Similar lines found at lines 45, 87, 120. Use read_file to see the current content." lets the agent self-correct without human intervention.

:::

:::danger The bash tool is maximally powerful

With bash, the agent can delete files, install packages, make network requests, read environment variables (including secrets), and modify any part of the system. Always run coding agents in a container or VM with limited permissions. For automated pipelines, consider a whitelist of allowed commands rather than a blacklist.

:::
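A minimal sketch of the allowlist (whitelist) approach is shown below, as an alternative to pattern-based blocking. The names `ALLOWED_COMMANDS`, `SHELL_METACHARS`, and `check_allowlist` are hypothetical, and the set of permitted programs is an assumption to adapt per pipeline.

```python
import shlex

# Assumed set of programs the pipeline may run (illustrative).
ALLOWED_COMMANDS = {"ls", "cat", "grep", "git", "pytest"}

# Characters that let one command line chain or redirect into another.
SHELL_METACHARS = set(";|&`$><\n")


def check_allowlist(command: str) -> tuple[bool, str]:
    """Return (is_allowed, reason). Denies anything not explicitly listed."""
    if any(ch in SHELL_METACHARS for ch in command):
        return False, "Shell operators are not allowed; run one plain command at a time."
    try:
        argv = shlex.split(command)
    except ValueError as e:  # e.g. an unterminated quote
        return False, f"Could not parse command: {e}"
    if not argv:
        return False, "Empty command."
    if argv[0] not in ALLOWED_COMMANDS:
        return False, f"'{argv[0]}' is not on the allowlist: {sorted(ALLOWED_COMMANDS)}"
    return True, ""


print(check_allowlist("ls -la"))        # (True, '')
print(check_allowlist("ls; rm -rf /"))  # denied: contains a shell operator
```

An allowlist of program names is still coarse: `git` and `pytest` can themselves run arbitrary code, so this check complements the container or VM rather than replacing it.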


Interview Q&A

Q: Why is the bash tool both the most useful and most dangerous tool in a coding agent's toolkit?

A: bash gives the agent full access to the operating system - it can run tests, grep for patterns, install packages, check syntax with linters, and perform virtually any task a developer could in a terminal. This makes it essential for a capable agent. The danger is the same reason: with bash, the agent can delete files, overwrite system configuration, make network requests, and read environment variables including API keys and passwords. The mitigation is sandboxing: run the agent in a Docker container as a non-root user, with limited network access, with secrets kept out of the container entirely, and with anything the agent must not modify mounted read-only. Block dangerous command patterns at the tool level as a second line of defense.
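The `bash` implementation in this lesson passes the full `os.environ` to the child process, which is how secrets can leak. One possible hardening step, sketched below, is to hand the subprocess a small allowlisted environment instead. `SAFE_ENV_KEYS` and `minimal_env` are hypothetical names, and the set of keys is an assumption that depends on what the project's tooling needs.

```python
import os
from typing import Optional

# Assumed set of variables that are safe to pass through (illustrative).
SAFE_ENV_KEYS = {"PATH", "HOME", "LANG", "LC_ALL"}


def minimal_env(source: Optional[dict[str, str]] = None) -> dict[str, str]:
    """Copy only allowlisted variables, so API keys and tokens are dropped."""
    source = dict(os.environ) if source is None else source
    env = {k: v for k, v in source.items() if k in SAFE_ENV_KEYS}
    env["TERM"] = "dumb"  # same terminal setting the bash tool already uses
    return env


print(minimal_env({"PATH": "/usr/bin", "API_KEY": "secret"}))
# {'PATH': '/usr/bin', 'TERM': 'dumb'}
```

The result would be passed as `env=minimal_env()` to `subprocess.run`. Some toolchains need extra variables (for example a virtualenv path), so the allowlist usually grows a little in practice.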

Q: Why must edit_file's old_str appear exactly once in the file?

A: If old_str appears multiple times, the edit is ambiguous - the agent may intend to change one occurrence but the tool changes a different one. This is a silent correctness failure: the tool "succeeds" but modifies the wrong code. Requiring exactly one occurrence forces the agent to include enough surrounding context to make the match unique. This is a deliberate UX design choice: it is better to fail loudly with a helpful error than to silently succeed incorrectly.
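A small illustration of why the extra context matters, using only `str.count`. The sample source text and the helper name `count_matches` are made up for this example.

```python
def count_matches(content: str, old_str: str) -> int:
    """Number of non-overlapping occurrences of old_str in content."""
    return content.count(old_str)


source = (
    "def area(w, h):\n"
    "    return w * h\n"
    "\n"
    "def volume(w, h, d):\n"
    "    return w * h * d\n"
)

# Too little context: also matches the start of "return w * h * d".
ambiguous = "return w * h"

# Including the preceding def line pins the edit to one location.
unique = "def area(w, h):\n    return w * h\n"

print(count_matches(source, ambiguous))  # 2
print(count_matches(source, unique))     # 1
```

With the ambiguous string, a replace-first strategy would happen to edit `area`, but nothing guarantees that is the occurrence the agent meant.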

Q: How should tool error messages be designed to help the agent self-correct?

A: Good tool error messages should: (1) clearly state what went wrong; (2) suggest the most likely cause; (3) provide a concrete next step. For example, when edit_file fails because old_str is not found, the error should show similar lines from the file (so the agent can see what the actual content is), and recommend calling read_file to see the current state. This lets the agent self-correct without human intervention, which is critical for autonomous operation.
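The three parts can be sketched as a small message builder. This is one possible shape, not the lesson's `edit_file` code: `not_found_error` is a hypothetical helper, and it uses `difflib.get_close_matches` from the standard library to pick the closest lines instead of a substring check.

```python
import difflib


def not_found_error(path: str, old_str: str, content: str) -> str:
    """Build a three-part error: what failed, likely cause, next step."""
    # First non-blank line of the text the agent tried to match.
    target = next((ln.strip() for ln in old_str.splitlines() if ln.strip()), "")
    lines = content.splitlines()

    hints = []
    if target:
        candidates = [ln.strip() for ln in lines]
        close = set(difflib.get_close_matches(target, candidates, n=3, cutoff=0.6))
        hints = [
            f"  Line {i}: {ln}"
            for i, ln in enumerate(lines, start=1)
            if ln.strip() in close
        ]

    message = (
        f"ERROR: old_str not found in {path}.\n"                        # (1) what went wrong
        "Likely cause: the file content differs from what you expect "  # (2) likely cause
        "(whitespace, indentation, or an earlier edit).\n"
        f"Next step: call read_file on {path} and copy the exact text."  # (3) next step
    )
    if hints:
        message += "\nClosest lines in the file:\n" + "\n".join(hints[:5])
    return message


print(not_found_error("app.py", "def foo(y):", "def foo(x):\n    return x + 1\n"))
```

Showing the closest real lines gives the agent the actual file content to copy from, which often saves a full `read_file` round trip.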

Q: What is the difference between write_file and edit_file, and when should each be used?

A: write_file replaces the entire file content. Use it only for creating new files - for existing files, it requires the agent to regenerate every line correctly, which is expensive and risky. edit_file makes a surgical change by finding an exact string and replacing it. Use this for modifying existing files - it changes only what needs to change and leaves everything else intact. The practical rule: if the file already exists, always use edit_file. If the file is new, use write_file.

Q: How do you implement sandboxing for the bash tool in a production coding agent?

A: Several layers: (1) Docker container - run the agent process in a container with a non-root user, limited capabilities, and controlled volume mounts; (2) Network isolation - use --network=none if the agent does not need internet access, or a custom Docker network with egress rules; (3) Filesystem limits - mount the repo directory as the only write target; (4) Command filtering - block obviously dangerous patterns (rm -rf /, fork bombs, raw device writes) at the tool layer; (5) Time limits - all commands have a timeout; (6) Output size limits - truncate very large outputs to prevent memory exhaustion. Defense in depth is the right approach: each layer catches what others miss.
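Several of these layers map directly onto `docker run` flags. The sketch below only builds the argument list so it can be inspected or logged before anything runs. `docker_run_argv` is a hypothetical helper, and the user id, memory limit, process limit, and `/workspace` mount point are assumed values rather than recommendations.

```python
def docker_run_argv(
    repo_dir: str,
    image: str,
    command: list[str],
    allow_network: bool = False,
) -> list[str]:
    """Build a docker run command line for one sandboxed tool call."""
    argv = [
        "docker", "run", "--rm",
        "--user", "1000:1000",                  # non-root user (assumed uid:gid)
        "--cap-drop", "ALL",                    # drop Linux capabilities
        "--security-opt", "no-new-privileges",
        "--read-only",                          # root filesystem is read-only
        "--tmpfs", "/tmp",                      # scratch space that is discarded
        "--memory", "1g",                       # assumed memory cap
        "--pids-limit", "256",                  # limits fork bombs
        "-v", f"{repo_dir}:/workspace",         # the repo is the only writable mount
        "-w", "/workspace",
    ]
    if not allow_network:
        argv += ["--network", "none"]
    argv.append(image)
    argv.extend(command)
    return argv


print(" ".join(docker_run_argv("/srv/repo", "python:3.12-slim", ["pytest", "-q"])))
```

The list would be passed to `subprocess.run(argv, timeout=...)`, which adds the time limit from layer (5). Output truncation from layer (6) still happens in the tool wrapper, outside the container.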

© 2026 EngineersOfAI. All rights reserved.