Agentic Code Editing

Writing new code is easy. You start with a blank file, make something up, and it works or it does not.

Editing existing code is different. You are a guest in a codebase that existed before you and will exist after you. The code has patterns, conventions, dependencies, and history. Touching it wrong does not just break the thing you changed - it can break the twenty things that depend on it.

This is where most coding agents struggle. LLMs are excellent at generating plausible-looking code. They are much less reliable at the specific, constrained task of modifying existing code with minimal disruption - reading what is already there, understanding why it is structured that way, making the smallest possible change that solves the problem, and leaving everything else alone.

This lesson explains how to build an agent that edits code correctly.


The Core Principle: Read Before You Write

The most important rule in agentic code editing is so simple it sounds obvious, yet so many implementations ignore it:

Never write to a file you have not read.

This is not a guideline. It is a hard rule. An agent that writes to a file it has not read is making one of several dangerous assumptions:

  1. That it already knows the file's content (it does not - the codebase may have changed)
  2. That writing a plausible-looking file is the same as making a correct edit (it is not)
  3. That the file structure it remembers from context is the current state (it may not be)

Every agentic code editing session follows this sequence:

1. Understand the task
2. Find the relevant files (grep, list directory, repo map)
3. Read the relevant files completely
4. Plan the minimal change
5. Execute the change
6. Verify (run tests)
7. Repeat if needed

Skip step 3 and everything after it becomes unreliable.
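
The rule can be enforced in the tool layer rather than left to the prompt. The following is a minimal sketch, not a reference implementation: `ReadBeforeWriteGuard` is a hypothetical name, and it assumes files are UTF-8 and that all reads and writes in a session go through one guard object. It refuses to overwrite an existing file that was never read, or that changed on disk after it was read.

```python
import hashlib
from pathlib import Path


class ReadBeforeWriteGuard:
    """Refuse writes to existing files that were not read, or that changed since the read."""

    def __init__(self) -> None:
        # resolved path -> SHA-256 of the file content at the time it was last read or written
        self._seen: dict[str, str] = {}

    @staticmethod
    def _digest(data: bytes) -> str:
        return hashlib.sha256(data).hexdigest()

    def read(self, path: str) -> str:
        p = Path(path).resolve()
        data = p.read_bytes()
        self._seen[str(p)] = self._digest(data)
        return data.decode("utf-8")

    def write(self, path: str, content: str) -> str:
        p = Path(path).resolve()
        key = str(p)
        if p.exists():
            if key not in self._seen:
                return f"ERROR: {path} has not been read in this session."
            if self._digest(p.read_bytes()) != self._seen[key]:
                return f"ERROR: {path} changed on disk since it was read. Read it again."
        data = content.encode("utf-8")
        p.write_bytes(data)
        self._seen[key] = self._digest(data)  # the agent now knows the new content
        return f"Wrote {path}"
```

New files pass through unchanged, since there is nothing to read first. Returning error strings instead of raising keeps the failure visible to the model as a tool result it can act on.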


Code Navigation: Finding the Right Files

Before you can read the right file, you need to find it. A coding agent uses several navigation strategies:

Strategy 1: Symbol Search with grep

The fastest way to find where a function, class, or variable is defined:

import re
import subprocess
from dataclasses import dataclass
from typing import Optional


@dataclass
class SearchResult:
    filepath: str
    line_number: int
    line_content: str
    context_before: list[str]
    context_after: list[str]


def search_codebase(
    pattern: str,
    root: str,
    file_extensions: Optional[list[str]] = None,
    context_lines: int = 2,
    max_results: int = 50,
) -> list[SearchResult]:
    """
    Search for a pattern across a codebase using ripgrep (or grep fallback).
    Returns results with context lines.
    """
    if file_extensions is None:
        file_extensions = [".py", ".ts", ".js", ".go", ".rs", ".java"]
    globs = [f"*.{ext.lstrip('.')}" for ext in file_extensions]
    ctx = str(context_lines)

    # Commands are argument lists (no shell), so the pattern is never
    # interpreted as shell syntax.
    # -n: line numbers; -A/-B: context lines after/before each match
    # rg: --no-heading -H prints the file name on every line; --glob filters files
    rg_cmd = ["rg", "-n", "--no-heading", "-H", "-A", ctx, "-B", ctx]
    for glob in globs:
        rg_cmd += ["--glob", glob]
    rg_cmd += ["-e", pattern, root]

    # grep: -r recursive, -E extended regex, --include filters files.
    # Note that ripgrep and grep regex dialects differ in some details.
    grep_cmd = ["grep", "-rnE", "-A", ctx, "-B", ctx]
    grep_cmd += [f"--include={glob}" for glob in globs]
    grep_cmd += ["-e", pattern, root]

    try:
        result = subprocess.run(rg_cmd, capture_output=True, text=True, timeout=30)
    except (subprocess.TimeoutExpired, FileNotFoundError):
        # ripgrep is missing or timed out: fall back to grep
        result = subprocess.run(grep_cmd, capture_output=True, text=True, timeout=30)

    return _parse_grep_output(result.stdout, max_results, context_lines)


# Match line:   filename:lineno:content
# Context line: filename-lineno-content
# The separator must be the same on both sides of the line number. File names
# that themselves contain "-<digits>-" or ":<digits>:" will be mis-split.
_LINE_RE = re.compile(r"^(.+?)([:-])(\d+)\2(.*)$")


def _parse_grep_output(
    output: str, max_results: int, context_lines: int = 2
) -> list[SearchResult]:
    """Parse grep/ripgrep output into SearchResult objects."""
    results: list[SearchResult] = []
    pending_before: list[str] = []  # context lines seen since the last match
    last_match: Optional[SearchResult] = None

    for line in output.splitlines():
        if line == "--":  # separator between non-adjacent context groups
            pending_before, last_match = [], None
            continue

        m = _LINE_RE.match(line)
        if not m:
            continue
        filepath, sep, lineno, content = m.groups()

        if sep == ":":  # a new match
            if len(results) >= max_results:
                break
            last_match = SearchResult(filepath, int(lineno), content, pending_before, [])
            results.append(last_match)
            pending_before = []
        else:  # a context line
            # A line between two nearby matches counts as after-context of
            # the first and before-context of the second.
            if last_match is not None and len(last_match.context_after) < context_lines:
                last_match.context_after.append(content)
            pending_before = (pending_before + [content])[-max(context_lines, 1):]

    return results

Strategy 2: AST-Based Symbol Location

Grep finds text. AST parsing finds semantic structure - the actual function definition, class body, or import statement.

import ast
from dataclasses import dataclass
from pathlib import Path
from typing import Optional


@dataclass
class SymbolLocation:
    filepath: str
    symbol_name: str
    symbol_type: str  # "function", "class", or "method"
    start_line: int
    end_line: int
    parent_class: Optional[str]


def find_symbol_in_file(filepath: str, symbol_name: str) -> list[SymbolLocation]:
    """Find all definitions of a symbol in a Python file."""
    try:
        source = Path(filepath).read_text(encoding="utf-8", errors="replace")
        tree = ast.parse(source)
    except (OSError, SyntaxError, ValueError):
        return []

    locations = []

    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            if node.name == symbol_name:
                # Determine parent class
                parent = _find_parent_class(tree, node)
                locations.append(SymbolLocation(
                    filepath=filepath,
                    symbol_name=symbol_name,
                    symbol_type="method" if parent else "function",
                    start_line=node.lineno,
                    end_line=node.end_lineno or node.lineno,
                    parent_class=parent,
                ))
        elif isinstance(node, ast.ClassDef):
            if node.name == symbol_name:
                locations.append(SymbolLocation(
                    filepath=filepath,
                    symbol_name=symbol_name,
                    symbol_type="class",
                    start_line=node.lineno,
                    end_line=node.end_lineno or node.lineno,
                    parent_class=None,
                ))

    return locations


def _find_parent_class(tree: ast.AST, target_node: ast.AST) -> Optional[str]:
    """Find a class that contains a given node (at any nesting depth)."""
    # AST nodes compare by identity, so this is a containment check.
    # Which enclosing class is reported for nested classes depends on ast.walk order.
    for node in ast.walk(tree):
        if isinstance(node, ast.ClassDef):
            if target_node in ast.walk(node):
                return node.name
    return None


def extract_function_source(filepath: str, function_name: str, class_name: Optional[str] = None) -> Optional[str]:
    """Extract the source code of a specific function or method."""
    locations = find_symbol_in_file(filepath, function_name)

    for loc in locations:
        if class_name and loc.parent_class != class_name:
            continue
        if not class_name and loc.parent_class is not None:
            continue

        # start_line is the "def" line, so decorators above it are not included
        lines = Path(filepath).read_text(encoding="utf-8", errors="replace").splitlines()
        return "\n".join(lines[loc.start_line - 1 : loc.end_line])

    return None

Strategy 3: Import Graph Traversal

If the task involves a class named PaymentProcessor, the agent should look at which files import it - those are the files that use it and may need to change:

import ast
from pathlib import Path


def find_importers(
    symbol_name: str,
    module_path: str,
    root: str,
) -> list[str]:
    """
    Find all Python files that import a specific symbol from a specific module.

    symbol_name: e.g., "PaymentProcessor"
    module_path: e.g., "src.payments.processor" (dot-separated)
    root: repo root
    """
    importers: set[str] = set()
    module_basename = module_path.split(".")[-1]

    for filepath in Path(root).rglob("*.py"):
        try:
            source = filepath.read_text(encoding="utf-8", errors="replace")
            tree = ast.parse(source)
        except (OSError, SyntaxError, ValueError):
            continue

        for node in ast.walk(tree):
            # from src.payments.processor import PaymentProcessor
            if isinstance(node, ast.ImportFrom):
                module = node.module or ""
                # Accept the full dotted path, or a shorter or relative form
                # that ends in the module's own name ("from .processor import ...").
                # This is a heuristic: an unrelated module with the same
                # basename also matches.
                if (
                    module == module_path
                    or module == module_basename
                    or module.endswith("." + module_basename)
                ):
                    # Compare the imported name, not its local alias
                    if any(alias.name == symbol_name for alias in node.names):
                        importers.add(str(filepath))

            # import src.payments.processor
            elif isinstance(node, ast.Import):
                if any(alias.name == module_path for alias in node.names):
                    importers.add(str(filepath))

    return sorted(importers)

Edit Strategies: A Deep Comparison

Once you have found and read the relevant code, you need to modify it. There are four main edit strategies, each with distinct trade-offs.

Strategy 1: Whole-File Replacement

The agent reads the file, generates a new version, and writes the entire file.

Pros: Simple to implement. No matching issues.

Cons:

  • Token-expensive: must regenerate every line
  • High regression risk: small error anywhere corrupts the whole file
  • Loses comments and formatting that the agent "forgets"
  • Makes code review harder - every line shows as changed

When to use: Creating new files only. Never for editing existing files.

from pathlib import Path


# Only appropriate for NEW files
def create_new_file(path: str, content: str) -> str:
    """Create a completely new file."""
    p = Path(path)
    if p.exists():
        return "ERROR: File already exists. Use edit_file to modify it."
    p.parent.mkdir(parents=True, exist_ok=True)
    p.write_text(content, encoding="utf-8")
    return f"Created {path} ({len(content.splitlines())} lines)"

Strategy 2: Unified Diff / Patch

Generate a unified diff (the standard format used by git diff) and apply it with patch.

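import difflib
import os
import subprocess
import tempfile


def apply_unified_diff(filepath: str, original: str, modified: str) -> str:
    """Generate a unified diff from original to modified and apply it with patch."""
    # difflib does not emit the "\ No newline at end of file" marker, so this
    # helper only handles texts that end with a newline.
    for text in (original, modified):
        if text and not text.endswith("\n"):
            return "ERROR: both versions must end with a newline."

    original_lines = original.splitlines(keepends=True)
    modified_lines = modified.splitlines(keepends=True)

    # The lines already carry their newlines (keepends=True), so keep the
    # default lineterm and join with "" to avoid doubled line breaks.
    diff = list(difflib.unified_diff(
        original_lines,
        modified_lines,
        fromfile=f"a/{filepath}",
        tofile=f"b/{filepath}",
    ))

    if not diff:
        return "No changes."

    diff_text = "".join(diff)

    # Apply with patch
    with tempfile.NamedTemporaryFile(
        mode="w", suffix=".patch", delete=False, encoding="utf-8", newline=""
    ) as f:
        f.write(diff_text)
        patch_file = f.name

    try:
        # patch [options] [originalfile [patchfile]]
        result = subprocess.run(
            ["patch", "-p0", filepath, patch_file],
            capture_output=True,
            text=True,
        )
        if result.returncode == 0:
            return f"Patch applied: {len(diff)} lines in diff"
        else:
            return f"Patch failed: {result.stdout}{result.stderr}"
    except FileNotFoundError:
        return "ERROR: the patch command is not installed."
    finally:
        os.unlink(patch_file)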

Pros: Minimal token representation. Standard format.

Cons:

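  • Fragile: patch rejects a hunk when its context lines do not match the file, which happens with whitespace differences, line-ending differences, or edits made to those lines since the diff was generated
  • Difficult for LLMs to generate correctly - hunk headers carry line numbers and line counts that must be exact, and LLMs often miscount them
  • Error messages are cryptic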

When to use: When you control both the reading and writing and can guarantee the diff is applied immediately. Not robust enough for most agent usage.

Strategy 3: Search and Replace

This is the approach used by Claude Code. The agent provides an exact string to find and a replacement string.

from pathlib import Path


def edit_file_search_replace(
    filepath: str,
    old_str: str,
    new_str: str,
    allow_whitespace_mismatch: bool = False,
) -> str:
    """
    Apply a surgical search-and-replace edit.

    If allow_whitespace_mismatch is True, tries normalized matching
    as a fallback (strips trailing whitespace from each line).
    """
    p = Path(filepath)
    if not p.exists():
        return f"ERROR: File not found: {filepath}"
    if not old_str:
        return "ERROR: old_str must not be empty."

    # Decode strictly: with errors="replace", writing the file back would
    # permanently replace any invalid bytes.
    try:
        content = p.read_text(encoding="utf-8")
    except UnicodeDecodeError:
        return f"ERROR: {filepath} is not valid UTF-8; refusing to edit."

    # Try exact match first
    count = content.count(old_str)
    if count > 1:
        # Require uniqueness - ambiguous edits are dangerous
        return (
            f"ERROR: old_str appears {count} times in {filepath}. "
            "Provide more surrounding context to make it unique."
        )
    if count == 1:
        p.write_text(content.replace(old_str, new_str, 1), encoding="utf-8")
        return f"Edit applied successfully to {filepath}."

    # Fallback: match whole lines, ignoring trailing whitespace
    if allow_whitespace_mismatch:
        old_lines = [l.rstrip() for l in old_str.split("\n")]
        content_lines = content.split("\n")
        n = len(old_lines)
        starts = [
            i for i in range(len(content_lines) - n + 1)
            if [l.rstrip() for l in content_lines[i : i + n]] == old_lines
        ]
        if len(starts) > 1:
            return (
                f"ERROR: old_str matches {len(starts)} places in {filepath} "
                "after whitespace normalization. Provide more surrounding context."
            )
        if len(starts) == 1:
            # Splice by line index so the replacement lands on the matched
            # lines rather than on an earlier textual occurrence.
            i = starts[0]
            content_lines[i : i + n] = new_str.split("\n")
            p.write_text("\n".join(content_lines), encoding="utf-8")
            return f"Edit applied (whitespace-normalized match) to {filepath}."

    # Helpful error: find similar lines
    first_line = old_str.split("\n")[0].strip()
    similar = []
    for i, line in enumerate(content.split("\n")):
        if first_line and first_line[:30] in line:
            similar.append(f"  Line {i+1}: {line[:80]}")

    hint = ""
    if similar:
        hint = "\nSimilar lines found:\n" + "\n".join(similar[:5])

    return (
        f"ERROR: old_str not found in {filepath}.{hint}\n"
        "Make sure to use read_file first and copy the exact text including all whitespace."
    )

Pros:

  • Surgical precision - changes only what you intend
  • Exact match requirement prevents accidental changes
  • Easy for LLMs to generate (just copy the current content)
  • Clear error messages

Cons:

  • Requires exact match - whitespace errors cause failures
  • Multiple occurrences require more context to disambiguate

When to use: Default strategy for modifying existing files. Use this unless you have a specific reason to use another approach.

Strategy 4: AST-Based Editing

Instead of string matching, parse the code into an AST, modify the AST, and unparse back to source. This is semantically robust - indentation and whitespace are irrelevant.

import ast
import textwrap
from pathlib import Path
from typing import Optional


class FunctionReplacer(ast.NodeTransformer):
    """Replace a function definition in an AST."""

    def __init__(self, target_name: str, new_source: str, target_class: Optional[str] = None):
        self.target_name = target_name
        self.new_source = new_source
        self.target_class = target_class
        self.replaced = False
        self._class_stack: list[str] = []  # names of the enclosing classes

    def visit_ClassDef(self, node: ast.ClassDef) -> ast.AST:
        # Track every class, not only the target, so that a module-level
        # target (target_class=None) does not also match methods.
        self._class_stack.append(node.name)
        try:
            return self.generic_visit(node)
        finally:
            self._class_stack.pop()

    def visit_FunctionDef(self, node: ast.FunctionDef) -> ast.AST:
        enclosing = self._class_stack[-1] if self._class_stack else None
        if node.name != self.target_name or enclosing != self.target_class:
            # Not the target: leave it, and anything nested inside it, unchanged
            return node

        # Parse the new function
        try:
            new_tree = ast.parse(textwrap.dedent(self.new_source))
        except SyntaxError as e:
            raise ValueError(f"New source has syntax error: {e}") from e

        # Find the function node in the new tree
        for child in new_tree.body:
            if (
                isinstance(child, (ast.FunctionDef, ast.AsyncFunctionDef))
                and child.name == self.target_name
            ):
                self.replaced = True
                return child

        return node

    visit_AsyncFunctionDef = visit_FunctionDef


def replace_function_ast(
    filepath: str,
    function_name: str,
    new_function_source: str,
    class_name: Optional[str] = None,
) -> str:
    """
    Replace a function definition using AST manipulation.

    This is whitespace-independent - it finds the function semantically.
    The trade-off: the whole file is regenerated with ast.unparse()
    (Python 3.9+), which reformats the code and drops all comments.
    """
    p = Path(filepath)
    source = p.read_text(encoding="utf-8")

    try:
        tree = ast.parse(source)
    except SyntaxError as e:
        return f"ERROR: File has syntax error: {e}"

    replacer = FunctionReplacer(
        target_name=function_name,
        new_source=new_function_source,
        target_class=class_name,
    )
    try:
        new_tree = replacer.visit(tree)
    except ValueError as e:
        return f"ERROR: {e}"

    if not replacer.replaced:
        return f"ERROR: Function '{function_name}' not found in {filepath}"

    # Unparse back to source
    try:
        new_source = ast.unparse(new_tree)
        p.write_text(new_source + "\n", encoding="utf-8")
        return f"Replaced function '{function_name}' in {filepath} via AST."
    except Exception as e:
        return f"ERROR unparsing modified AST: {e}"

Pros: Immune to whitespace and formatting differences. Semantically precise.

Cons:

  • ast.unparse() produces valid but reformatted code - it loses the original formatting and drops all comments, because the AST does not store them
  • Does not work for non-Python files
  • Complex to implement correctly for all edge cases
  • May introduce style inconsistencies

When to use: When you need to replace a known function body and exact whitespace matching is unreliable (e.g., generated code, cross-platform files with mixed line endings).


Respecting Code Style

An agent that modifies a file should follow the style already in it. A change that introduces 4-space indentation into a 2-space codebase, or adds trailing semicolons to a file without them, will fail code review.

import re
from pathlib import Path


def detect_code_style(filepath: str) -> dict:
    """
    Detect coding style conventions in a file.

    Returns a dict with detected style properties.
    """
    # Read bytes: text mode translates "\r\n" to "\n", which would hide
    # the original line endings.
    raw = Path(filepath).read_bytes()
    content = raw.decode("utf-8", errors="replace")
    lines = content.splitlines()

    style = {}

    # Detect indentation. This is a rough heuristic: deeply nested lines in
    # a 2-space file also start with four spaces and are counted as "4".
    indent_counts = {"2": 0, "4": 0, "tabs": 0}
    for line in lines:
        if line.startswith("\t"):
            indent_counts["tabs"] += 1
        elif line.startswith("  ") and not line.startswith("    "):
            indent_counts["2"] += 1
        elif line.startswith("    "):
            indent_counts["4"] += 1

    if indent_counts["tabs"] > max(indent_counts["2"], indent_counts["4"]):
        style["indent"] = "\t"
        style["indent_style"] = "tabs"
    elif indent_counts["2"] > indent_counts["4"]:
        style["indent"] = "  "
        style["indent_style"] = "2 spaces"
    else:
        style["indent"] = "    "
        style["indent_style"] = "4 spaces"

    # Detect line endings
    if b"\r\n" in raw:
        style["line_ending"] = "\r\n"
    else:
        style["line_ending"] = "\n"

    # Detect quote style (Python)
    double_quotes = len(re.findall(r'"[^"]*"', content))
    single_quotes = len(re.findall(r"'[^']*'", content))
    style["quotes"] = "double" if double_quotes > single_quotes else "single"

    # Detect trailing newline
    style["trailing_newline"] = content.endswith("\n")

    # Detect average and maximum line length
    non_empty = [len(l) for l in lines if l.strip()]
    style["avg_line_length"] = sum(non_empty) / len(non_empty) if non_empty else 80
    style["max_line_length"] = max(non_empty) if non_empty else 88

    return style


def check_style_consistency(original_file: str, edited_content: str) -> list[str]:
    """
    Check whether edited content follows the original file's style.

    Returns a list of style violation warnings.
    """
    style = detect_code_style(original_file)
    warnings = []

    for i, line in enumerate(edited_content.split("\n"), 1):
        # Check indentation
        if style["indent_style"] == "tabs" and re.match(r"^ ", line):
            warnings.append(f"Line {i}: spaces used instead of tabs")
        elif style["indent_style"] in ("2 spaces", "4 spaces"):
            if re.match(r"^\t", line):
                warnings.append(f"Line {i}: tabs used instead of spaces")

    return warnings

Minimal Diffs: Change as Little as Possible

The principle of minimal diffs is important both for correctness and for developer trust.

An agent that makes 50-line changes when a 3-line change is sufficient:

  • Is harder to review
  • Has a higher chance of introducing regressions
  • Suggests the agent does not truly understand what it is doing

Here is how to enforce minimal diffs in your agent's system prompt:

When making edits:
- Change ONLY the lines that need to change
- Do not reformat, reorder, or "improve" surrounding code
- Do not add comments unless specifically asked
- Do not rename variables unless the rename is the task
- Your edit should be the minimal change that makes the tests pass

And here is a utility to measure diff size for monitoring:

import difflib


def compute_diff_stats(original: str, modified: str) -> dict:
    """Compute statistics about a diff."""
    original_lines = original.splitlines()
    modified_lines = modified.splitlines()

    opcodes = difflib.SequenceMatcher(None, original_lines, modified_lines).get_opcodes()

    lines_added = 0
    lines_removed = 0
    lines_changed = 0

    for tag, i1, i2, j1, j2 in opcodes:
        if tag == "insert":
            lines_added += j2 - j1
        elif tag == "delete":
            lines_removed += i2 - i1
        elif tag == "replace":
            lines_changed += max(i2 - i1, j2 - j1)

    total_original = len(original_lines)
    churn_rate = (lines_added + lines_removed + lines_changed) / max(total_original, 1)

    return {
        "lines_added": lines_added,
        "lines_removed": lines_removed,
        "lines_changed": lines_changed,
        "total_original": total_original,
        "churn_rate": churn_rate,  # fraction of file changed
        "is_minimal": churn_rate < 0.15,  # less than 15% of file changed
    }

Multi-File Edits: Atomic Coordination

Some tasks require changes across multiple files simultaneously. Changing a function signature requires updating every call site. Adding a parameter requires updating the function definition, the tests, and potentially the documentation.

The agent must coordinate these changes atomically - if it changes the function signature but fails to update the call sites, the codebase will be broken.

from dataclasses import dataclass
from pathlib import Path
from typing import Optional
import subprocess


@dataclass
class PlannedEdit:
    filepath: str
    old_str: str
    new_str: str
    description: str
    is_required: bool = True  # if False, this is an optional improvement


def execute_multi_file_edit(
    edits: list[PlannedEdit],
    repo_path: str,
    verify_command: Optional[str] = None,
) -> dict:
    """
    Execute a set of coordinated edits across multiple files.

    If any required edit fails, rolls back all completed edits.
    Relies on edit_file_search_replace from the search-and-replace strategy.
    """
    applied: list[str] = []  # files modified so far, without duplicates
    original_contents: dict[str, str] = {}

    # Save original content for rollback (once per file, before any edit)
    for edit in edits:
        p = Path(edit.filepath)
        if p.exists() and edit.filepath not in original_contents:
            original_contents[edit.filepath] = p.read_text(encoding="utf-8")

    try:
        for edit in edits:
            result = edit_file_search_replace(
                filepath=edit.filepath,
                old_str=edit.old_str,
                new_str=edit.new_str,
            )

            if result.startswith("ERROR") and edit.is_required:
                # Required edit failed - rollback everything
                print(f"Required edit failed: {edit.description}")
                print(f"Error: {result}")
                _rollback(applied, original_contents)
                return {
                    "success": False,
                    "failed_edit": edit.description,
                    "error": result,
                    "rolled_back": len(applied),
                }

            if not result.startswith("ERROR"):
                if edit.filepath not in applied:
                    applied.append(edit.filepath)
                print(f"Applied: {edit.description}")

        # All edits applied - verify if command provided
        if verify_command:
            # verify_command is a trusted, operator-supplied shell string
            verify_result = subprocess.run(
                verify_command,
                shell=True,
                cwd=repo_path,
                capture_output=True,
                text=True,
                timeout=60,
            )
            if verify_result.returncode != 0:
                test_output = verify_result.stdout + verify_result.stderr
                print(f"Verification failed: {test_output}")
                _rollback(applied, original_contents)
                return {
                    "success": False,
                    "error": "Verification failed",
                    "test_output": test_output,
                    "rolled_back": len(applied),
                }

        return {
            "success": True,
            "edits_applied": len(applied),
            "files_modified": applied,
        }

    except Exception as e:
        # Includes subprocess.TimeoutExpired from the verification step
        _rollback(applied, original_contents)
        return {"success": False, "error": str(e), "rolled_back": len(applied)}


def _rollback(applied_files: list[str], original_contents: dict):
    """Restore original file contents."""
    for filepath in applied_files:
        if filepath in original_contents:
            Path(filepath).write_text(original_contents[filepath], encoding="utf-8")
            print(f"Rolled back: {filepath}")

The Edit-Verify-Backtrack Loop

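The complete editing workflow wraps each edit in a verification step and backtracks when tests fail: snapshot the file, apply the edit, run the tests, and on failure restore the snapshot before trying a different approach.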
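
A minimal sketch of that loop for a single file follows. The names are illustrative assumptions rather than part of any library: `edit_verify_backtrack` is hypothetical, `propose_edit` stands in for the LLM call (it receives the original content plus the previous attempt's test output and returns the full candidate content), and `verify_command` is an argument list such as a test runner invocation.

```python
import subprocess
from pathlib import Path
from typing import Callable, Optional


def edit_verify_backtrack(
    filepath: str,
    propose_edit: Callable[[str, Optional[str]], str],
    verify_command: list[str],
    max_attempts: int = 3,
) -> dict:
    """Try candidate edits until verification passes; restore the file after each failure."""
    p = Path(filepath)
    original = p.read_text(encoding="utf-8")  # snapshot for backtracking
    feedback: Optional[str] = None  # output of the previous failed verification

    for attempt in range(1, max_attempts + 1):
        candidate = propose_edit(original, feedback)
        p.write_text(candidate, encoding="utf-8")

        try:
            result = subprocess.run(
                verify_command, capture_output=True, text=True, timeout=120
            )
            passed = result.returncode == 0
            feedback = result.stdout + result.stderr
        except subprocess.TimeoutExpired:
            passed, feedback = False, "verification timed out"

        if passed:
            return {"success": True, "attempts": attempt}

        # Backtrack: the next attempt always starts from the known-good state
        p.write_text(original, encoding="utf-8")

    return {"success": False, "attempts": max_attempts, "last_output": feedback}
```

Restoring the snapshot after every failure means attempts never stack on top of a broken intermediate state, and the attempt cap stops the agent from looping indefinitely on a task it cannot solve.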


:::warning Uniqueness requirement for search-replace
The old_str must appear exactly once in the file. If the same pattern appears multiple times, the agent must provide more surrounding context to make the match unique. A function called process() that appears 3 times in a file is ambiguous - include the class name or surrounding lines to pin down which one.
:::

:::danger Never truncate output in read_file
A common agent bug is truncating long files - only reading the first 500 lines to "save tokens." This leads to edits based on incomplete information. If a function is on line 800 of a 1200-line file, the agent must read all 1200 lines or at least read the relevant section. Truncating without knowing where the relevant code is means potentially missing it.
:::
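
One way to allow partial reads without silent truncation is to report the total line count with every read. This is a sketch under assumptions: the `read_file` signature and header wording below are hypothetical, not a specific tool's API, and the file is assumed to be UTF-8.

```python
from pathlib import Path
from typing import Optional


def read_file(path: str, start_line: int = 1, end_line: Optional[int] = None) -> str:
    """Return numbered lines start_line..end_line (1-indexed, inclusive).

    The header always reports the total line count, so a partial read is
    never mistaken for the whole file.
    """
    lines = Path(path).read_text(encoding="utf-8").splitlines()
    total = len(lines)
    start = max(start_line, 1)
    end = total if end_line is None else min(end_line, total)

    header = f"{path}: lines {start}-{end} of {total}"
    if start > 1 or end < total:
        header += " (PARTIAL: request other ranges before editing outside this window)"

    body = "\n".join(
        f"{number:>6}  {text}"
        for number, text in enumerate(lines[start - 1 : end], start=start)
    )
    return header + "\n" + body
```

With the line count in view, the agent can combine a symbol search with a targeted range read, for example requesting lines 780-840 after grep reports a match on line 800.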


Interview Q&A

Q: Why is the search-and-replace edit strategy preferred over unified diff for coding agents?

A: Unified diff is fragile because every hunk must line up with the file: its header carries line numbers and counts, and its context lines must match the file's current content. patch tolerates small line offsets, but if the lines a hunk touches have changed since the diff was generated, the hunk is rejected. LLMs also struggle to generate valid unified diff syntax consistently. Search-replace works by finding an exact string and replacing it - the agent just copies the existing code (which it read), writes the replacement, and the operation is unambiguous. The main requirement is uniqueness: the old string must appear exactly once, which the agent ensures by including sufficient surrounding context.

Q: What does "minimal diff" mean and why does it matter for coding agents?

A: A minimal diff changes only the lines strictly necessary to fix the problem, leaving all surrounding code untouched. It matters for several reasons: (1) correctness - the fewer lines changed, the lower the risk of accidental regressions; (2) reviewability - developers can quickly verify a small change but struggle with large diffs; (3) signal quality - a minimal diff demonstrates that the agent understood the problem precisely rather than taking a shotgun approach; (4) test sensitivity - tests that pass on minimal changes may fail on large refactors of the same code.

Q: How do coding agents navigate large codebases to find the relevant files?

A: Three complementary strategies: (1) Repo map - a compact index of all files with their class and function signatures, giving the LLM a navigational overview without full file content; (2) Symbol search - using grep or ripgrep to find where a specific function, class, or string appears in the codebase; (3) AST parsing - when you know what symbol you are looking for, AST parsing finds its definition semantically (it ignores text matches inside comments and string literals). The agent typically starts with the repo map to orient itself, then uses grep to narrow to specific files, then reads those files completely before making any edits.
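
A repo map can be sketched with Python's ast module. The function name `build_repo_map`, the output layout, and the `max_files` cap are illustrative assumptions; this version covers Python files only and lists positional parameter names without annotations or defaults.

```python
import ast
from pathlib import Path


def _params(fn) -> str:
    """Positional parameter names only (no *args, keyword-only args, or defaults)."""
    return ", ".join(arg.arg for arg in fn.args.args)


def build_repo_map(root: str, max_files: int = 200) -> str:
    """List each Python file with its top-level classes, their methods, and functions."""
    out: list[str] = []
    for path in sorted(Path(root).rglob("*.py"))[:max_files]:
        try:
            tree = ast.parse(path.read_text(encoding="utf-8", errors="replace"))
        except (SyntaxError, ValueError):
            continue  # skip files that do not parse

        out.append(path.relative_to(root).as_posix())
        for node in tree.body:
            if isinstance(node, ast.ClassDef):
                out.append(f"  class {node.name}")
                for item in node.body:
                    if isinstance(item, (ast.FunctionDef, ast.AsyncFunctionDef)):
                        out.append(f"    def {item.name}({_params(item)})")
            elif isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
                out.append(f"  def {node.name}({_params(node)})")
    return "\n".join(out)
```

The map is far smaller than the source it describes, so it can sit in the prompt as an index while full file contents are read on demand.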

Q: What are the trade-offs between AST-based editing and string-based search-replace?

A: Search-replace is simpler, language-agnostic (works for Python, TypeScript, Go, any text), and preserves the original formatting exactly. The downside is it requires exact string matching - whitespace differences cause failures. AST-based editing finds code semantically regardless of formatting, is immune to whitespace issues, and can make structural changes that are impractical with string matching (like moving a method from one class to another). The downside is that it requires a language-specific parser (Python's ast module handles only Python), and ast.unparse() regenerates the code, which reformats it (potentially inconsistently with the rest of the codebase) and drops comments.

Q: How should a coding agent handle multi-file edits that need to be atomic?

A: The agent should plan all edits before executing any of them, then execute them in a logical order (typically: interface/signature first, then implementations, then tests). It should save the original content of each file before editing, so it can roll back if a later edit fails. After all edits are applied, it runs the test suite to verify the complete change is consistent. If tests fail, it may need to roll back to the original state and try a different approach. The key insight is that a partially-applied multi-file edit is often worse than no edit at all - a broken interface with some call sites updated and others not is harder to debug than the original bug.

© 2026 EngineersOfAI. All rights reserved.