Python CLI Design Principles Practice Problems & Exercises

Practice: CLI Design Principles

11 problems · 4 Easy · 4 Medium · 3 Hard · 40-55 min
Easy

#1 sys.argv Basics: Counting Arguments (Easy)
sys.argv, argument-parsing, basics

Predict the output. Understand how sys.argv is structured when a script is called with arguments.

Python
import sys

# Simulate: python script.py foo bar
sys.argv = ["script.py", "foo", "bar"]

print(len(sys.argv))
for arg in sys.argv:
    print(arg)
Solution
import sys

sys.argv = ["script.py", "foo", "bar"]

print(len(sys.argv))
for arg in sys.argv:
    print(arg)

Output:

3
script.py
foo
bar

How it works: sys.argv is a list of strings representing the command-line arguments. When Python runs a script file, sys.argv[0] is the script name as it was given on the command line, here "script.py". sys.argv[1] and sys.argv[2] are the user-supplied arguments "foo" and "bar". The total length is 3 (script name plus 2 arguments).

Key insight: sys.argv is the raw, unprocessed interface to command-line arguments. Every element is a string, even when it looks like a number. If your script needs to do arithmetic on a CLI argument like "42", you must convert it: int(sys.argv[1]). argparse handles type conversion and validation automatically, and it is the standard library approach for any CLI beyond a trivial script.
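
The string-only nature of sys.argv can be sketched as follows. The helper name add_cli_numbers is hypothetical and exists only for this illustration; it assumes every argument after the script name is a valid integer literal.

```python
def add_cli_numbers(argv):
    """Sum the integer arguments that follow the script name."""
    total = 0
    for raw in argv[1:]:   # skip argv[0], the script name
        total += int(raw)  # each element arrives as a string
    return total

# Without conversion, "+" concatenates the strings instead of adding them.
concatenated = "40" + "2"
summed = add_cli_numbers(["script.py", "40", "2"])

print(concatenated)  # 402
print(summed)        # 42
```

A non-numeric argument would raise ValueError here, which is the kind of manual error handling argparse's type= parameter takes over.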

Expected Output
3\nscript.py\nfoo\nbar
Hints

Hint 1: sys.argv[0] is always the name of the script itself.

Hint 2: sys.argv[1:] contains the actual arguments passed by the user.

Hint 3: len(sys.argv) includes the script name at index 0.

#2 Exit Code Semantics (Easy)
sys.exit, exit-codes, conventions

Predict the output. Map the three standard CLI exit codes to their meanings using SystemExit.

Python
import sys

def get_exit_code(scenario):
    try:
        if scenario == "success":
            sys.exit(0)
        elif scenario == "runtime_error":
            sys.exit(1)
        elif scenario == "bad_args":
            sys.exit(2)
    except SystemExit as e:
        return e.code

print(get_exit_code("success"))
print(get_exit_code("runtime_error"))
print(get_exit_code("bad_args"))
Solution
import sys

def get_exit_code(scenario):
    try:
        if scenario == "success":
            sys.exit(0)
        elif scenario == "runtime_error":
            sys.exit(1)
        elif scenario == "bad_args":
            sys.exit(2)
    except SystemExit as e:
        return e.code

print(get_exit_code("success"))
print(get_exit_code("runtime_error"))
print(get_exit_code("bad_args"))

Output:

0
1
2

How it works: sys.exit(n) raises SystemExit(n). Because SystemExit is an exception, it can be caught; when nothing catches it, the interpreter exits with that code. The .code attribute holds the value you passed. By convention, exit code 0 means success; 1 means a runtime error occurred (file not found, processing failed, etc.); 2 means the user called the program incorrectly (wrong arguments, missing flags), which is also the code argparse uses for usage errors.

Key insight: Exit codes are the silent API between your CLI and the shell. A CI system like GitHub Actions marks a step as failed when the process exits with a non-zero code. A shell command chain using && stops at the first command that exits non-zero. A tool that always exits 0, even when it fails, will silently corrupt automated pipelines and pass CI checks when it should fail them. Exit codes are non-negotiable.
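
One way to keep exit codes explicit is to have main() return the code and call sys.exit() only once, at the entry point. The sketch below assumes a hypothetical tool that takes a single PATH argument; the function name and usage text are illustrative, not taken from the lesson.

```python
import sys

def main(argv):
    """Return an exit code instead of calling sys.exit() deep in the code."""
    if len(argv) != 2:
        print("usage: tool PATH", file=sys.stderr)
        return 2  # usage error: the tool was called incorrectly
    try:
        with open(argv[1]) as fh:
            fh.read()
    except OSError as exc:
        print(f"error: {exc}", file=sys.stderr)
        return 1  # runtime error: the call was valid but the work failed
    return 0      # success

# In a real script the entry point would be: sys.exit(main(sys.argv))
print(main(["tool"]))  # 2
```

Returning the code keeps main() easy to test, because no SystemExit needs to be caught.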

Expected Output
0\n1\n2
Hints

Hint 1: sys.exit(0) signals success. Any non-zero exit code signals failure.

Hint 2: Exit code 1 is a general runtime error. Exit code 2 is a usage/argument error.

Hint 3: sys.exit() raises SystemExit, which is why try/except can catch it.

#3 argparse: Basic Positional and Optional Arguments (Easy)
argparse, add_argument, parse_args

Predict the output. Parse a set of arguments with argparse and inspect the resulting Namespace.

Python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument("input", help="Input file path")
parser.add_argument("--dry-run", action="store_true", help="Preview without changes")
parser.add_argument("-v", "--verbose", action="count", default=0)

args = parser.parse_args(["data.csv", "--dry-run"])

print(args.input)
print(args.dry_run)
print(args.verbose)
Solution
import argparse

parser = argparse.ArgumentParser()
parser.add_argument("input", help="Input file path")
parser.add_argument("--dry-run", action="store_true", help="Preview without changes")
parser.add_argument("-v", "--verbose", action="count", default=0)

args = parser.parse_args(["data.csv", "--dry-run"])

print(args.input)
print(args.dry_run)
print(args.verbose)

Output:

data.csv
True
0

How it works: parse_args(["data.csv", "--dry-run"]) parses the list as if those were the command-line arguments. The positional "input" captures "data.csv" and stores it on args.input. --dry-run with action="store_true" sets args.dry_run = True because the flag is present. -v / --verbose with action="count" starts at default=0 and increments each time -v appears — since it is absent here, args.verbose is 0.

Key insight: argparse converts dashes in long option names to underscores when creating the Namespace attribute: --dry-run becomes args.dry_run. This is automatic and consistent. Always use action="count" rather than action="store_true" for verbosity flags — it supports -v (1), -vv (2), -vvv (3), which lets tools print progressively more detail without multiple separate flags.
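
The counting behaviour can be checked with a minimal sketch that parses a few argument lists against the same -v / --verbose definition used in the problem:

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument("-v", "--verbose", action="count", default=0)

# Each occurrence of the flag, short or long, adds one to the count.
cases = ([], ["-v"], ["-vv"], ["-v", "--verbose", "-v"])
levels = [parser.parse_args(argv).verbose for argv in cases]

print(levels)  # [0, 1, 2, 3]
```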

Expected Output
data.csv\nTrue\n0
Hints

Hint 1: Positional arguments have no leading dash. Optional arguments start with -- or -.

Hint 2: action="store_true" sets the value to True when the flag is present, False otherwise.

Hint 3: parse_args([...]) accepts a list of strings so you can test without touching sys.argv.

#4 stderr vs stdout: Diagnostic Separation (Easy)
stderr, stdout, sys.stderr, stream-separation

Demonstrate the separation of stdout and stderr. Show that diagnostic messages go to stderr while output data goes to stdout.

Python
import sys
import io

# Capture both streams for demonstration
out_buf = io.StringIO()
err_buf = io.StringIO()

def process_file(filename, row_count, out=None, err=None):
    out = out or sys.stdout
    err = err or sys.stderr
    # Diagnostic goes to stderr
    print(f"[INFO] reading {filename}", file=err)
    # Output data goes to stdout
    print(f"processed {row_count} rows", file=out)

process_file("input.csv", 42, out=out_buf, err=err_buf)

print("STDOUT:", out_buf.getvalue().strip())
print("STDERR:", err_buf.getvalue().strip())
Solution
import sys
import io

out_buf = io.StringIO()
err_buf = io.StringIO()

def process_file(filename, row_count, out=None, err=None):
    out = out or sys.stdout
    err = err or sys.stderr
    print(f"[INFO] reading {filename}", file=err)
    print(f"processed {row_count} rows", file=out)

process_file("input.csv", 42, out=out_buf, err=err_buf)

print("STDOUT:", out_buf.getvalue().strip())
print("STDERR:", err_buf.getvalue().strip())

Output:

STDOUT: processed 42 rows
STDERR: [INFO] reading input.csv

How it works: print(..., file=sys.stderr) routes the message to the error stream. print(..., file=sys.stdout) or plain print(...) routes to stdout. When a user pipes output — process input.csv | grep error — only stdout is piped. The stderr diagnostic [INFO] reading input.csv appears in the terminal directly without contaminating the downstream grep. The injectable out and err parameters make this function testable without redirecting the actual system streams.

Key insight: Never mix diagnostic messages into stdout. A single stray print("Processing...") in a pipeline tool can silently corrupt JSON output, break CSV headers, or inject garbage into the downstream command's stdin. All progress indicators, log lines, warnings, and status messages belong on stderr. The rule is simple: stdout is for data, stderr is for everything else.
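
When a function writes straight to sys.stdout and sys.stderr rather than accepting injectable streams, the standard library's contextlib.redirect_stdout and contextlib.redirect_stderr can capture them in a test. The function emit_report below is a hypothetical stand-in for such code.

```python
import contextlib
import io
import json
import sys

def emit_report():
    print("[INFO] starting", file=sys.stderr)  # diagnostic
    print('{"rows": 42}')                      # data

out, err = io.StringIO(), io.StringIO()
with contextlib.redirect_stdout(out), contextlib.redirect_stderr(err):
    emit_report()

# stdout holds only the data, so it still parses as JSON.
data = json.loads(out.getvalue())
print(data["rows"])          # 42
print(err.getvalue().strip())  # [INFO] starting
```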

Expected Output
STDOUT: processed 42 rows\nSTDERR: [INFO] reading input.csv
Hints

Hint 1: print(..., file=sys.stderr) writes to the error stream, not stdout.

Hint 2: Diagnostic messages (progress, logs, warnings) belong on stderr so they do not corrupt piped output.

Hint 3: Capture both streams separately using io.StringIO to test them independently.


Medium

#5 argparse: Custom Type Validation (Medium)
argparse, type, ArgumentTypeError, validation

Build a custom type validator for a --port argument and demonstrate both valid and invalid input handling.

Python
import argparse

def port_number(value):
    """Argparse type that requires a valid port number."""
    try:
        port = int(value)
    except ValueError:
        raise argparse.ArgumentTypeError(
            f"Not a valid port number: {value}"
        )
    if not 1 <= port <= 65535:
        raise argparse.ArgumentTypeError(
            f"Port must be 1-65535, got {port}"
        )
    return port

parser = argparse.ArgumentParser()
parser.add_argument("--port", type=port_number, default=8080)

# Valid port
args = parser.parse_args(["--port", "8080"])
print(args.port)

# Invalid: out of range
try:
    port_number("99999")
except argparse.ArgumentTypeError as e:
    print(f"ArgumentTypeError: {e}")

# Invalid: not a number
try:
    port_number("abc")
except argparse.ArgumentTypeError as e:
    print(f"ArgumentTypeError: {e}")
Solution
import argparse

def port_number(value):
    try:
        port = int(value)
    except ValueError:
        raise argparse.ArgumentTypeError(
            f"Not a valid port number: {value}"
        )
    if not 1 <= port <= 65535:
        raise argparse.ArgumentTypeError(
            f"Port must be 1-65535, got {port}"
        )
    return port

parser = argparse.ArgumentParser()
parser.add_argument("--port", type=port_number, default=8080)

args = parser.parse_args(["--port", "8080"])
print(args.port)

try:
    port_number("99999")
except argparse.ArgumentTypeError as e:
    print(f"ArgumentTypeError: {e}")

try:
    port_number("abc")
except argparse.ArgumentTypeError as e:
    print(f"ArgumentTypeError: {e}")

Output:

8080
ArgumentTypeError: Port must be 1-65535, got 99999
ArgumentTypeError: Not a valid port number: abc

How it works: The type=port_number parameter tells argparse to call port_number(value) on the raw string before storing the result. If port_number raises argparse.ArgumentTypeError, argparse uses the exception's own message, printing an error such as "argument --port: Port must be 1-65535, got 99999" to stderr, and exits with code 2. (The generic "invalid port_number value: '99999'" wording appears only when the type function raises a plain ValueError or TypeError.) The function runs during argument parsing, so by the time your main() code runs, args.port is already a validated integer.

Key insight: Use custom type functions instead of validating inside main(). This approach produces clean error messages at argument parse time, consistent with argparse's built-in error formatting, and keeps your business logic free of argument validation code. A common pattern is to define validators like existing_file, positive_int, and url in a shared cli_types.py module and import them across multiple argument parsers.
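
To see what argparse actually prints for a failing type function, the sketch below captures stderr around parse_args. The validator positive_int, the --workers option, and the helper try_parse are hypothetical names chosen for this illustration.

```python
import argparse
import contextlib
import io

def positive_int(value):
    """Argparse type that requires an integer greater than zero."""
    try:
        number = int(value)
    except ValueError:
        raise argparse.ArgumentTypeError(f"not an integer: {value!r}")
    if number <= 0:
        raise argparse.ArgumentTypeError(f"must be positive, got {number}")
    return number

parser = argparse.ArgumentParser(prog="demo")
parser.add_argument("--workers", type=positive_int, default=1)

def try_parse(argv):
    """Return (value, "") on success or (exit_code, stderr_text) on failure."""
    err = io.StringIO()
    with contextlib.redirect_stderr(err):
        try:
            return parser.parse_args(argv).workers, ""
        except SystemExit as exc:
            return exc.code, err.getvalue()

code, message = try_parse(["--workers", "0"])
print(code)                         # 2
print(message.splitlines()[-1])     # demo: error: argument --workers: must be positive, got 0
```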

Expected Output
8080\nArgumentTypeError: Port must be 1-65535, got 99999\nArgumentTypeError: Not a valid port number: abc
Hints

Hint 1: argparse type functions are called with the raw string value before parse_args returns.

Hint 2: Raise argparse.ArgumentTypeError (not ValueError) to get clean error messages.

Hint 3: The message you pass to ArgumentTypeError appears after "argument --port:" in the error argparse prints.

#6 argparse: Subcommands with set_defaults (Medium)
argparse, add_subparsers, subcommands, set_defaults

Build a subcommand CLI with process and validate subcommands. Use set_defaults(func=...) for dispatch.

Python
import argparse
from pathlib import Path

def cmd_validate(args):
    print(f"Validating: {args.input}")

def cmd_process(args):
    output = args.output or "stdout"
    print(f"Processing: {args.input} -> {output}")

def build_parser():
    parser = argparse.ArgumentParser(prog="myapp")
    subs = parser.add_subparsers(dest="command")
    subs.required = True

    val = subs.add_parser("validate", help="Validate a file")
    val.add_argument("input", type=str)
    val.set_defaults(func=cmd_validate)

    proc = subs.add_parser("process", help="Process a file")
    proc.add_argument("input", type=str)
    proc.add_argument("-o", "--output", type=str, default=None)
    proc.set_defaults(func=cmd_process)

    return parser

parser = build_parser()

args1 = parser.parse_args(["validate", "report.csv"])
print(args1.command)
args1.func(args1)

args2 = parser.parse_args(["process", "data.csv", "-o", "output.json"])
print(args2.command)
args2.func(args2)
Solution
import argparse

def cmd_validate(args):
    print(f"Validating: {args.input}")

def cmd_process(args):
    output = args.output or "stdout"
    print(f"Processing: {args.input} -> {output}")

def build_parser():
    parser = argparse.ArgumentParser(prog="myapp")
    subs = parser.add_subparsers(dest="command")
    subs.required = True

    val = subs.add_parser("validate", help="Validate a file")
    val.add_argument("input", type=str)
    val.set_defaults(func=cmd_validate)

    proc = subs.add_parser("process", help="Process a file")
    proc.add_argument("input", type=str)
    proc.add_argument("-o", "--output", type=str, default=None)
    proc.set_defaults(func=cmd_process)

    return parser

parser = build_parser()

args1 = parser.parse_args(["validate", "report.csv"])
print(args1.command)
args1.func(args1)

args2 = parser.parse_args(["process", "data.csv", "-o", "output.json"])
print(args2.command)
args2.func(args2)

Output:

validate
Validating: report.csv
process
Processing: data.csv -> output.json

How it works: add_subparsers(dest="command") creates a positional argument that captures which subcommand was used — here stored as args.command. Each subparser (val, proc) defines its own arguments independently. set_defaults(func=cmd_validate) attaches the handler function directly to the namespace so that main() can call args.func(args) without any if args.command == "validate" dispatch logic.

Key insight: The args.func(args) dispatch pattern is idiomatic Python for subcommand CLIs. It avoids brittle if/elif chains that need to be updated every time you add a subcommand. Setting subs.required = True ensures that calling myapp with no subcommand exits with a usage error rather than silently doing nothing. The same subcommand structure appears in tools like git, kubectl, and docker: each subcommand is effectively a separate mini-CLI that shares a common entry point.
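
The effect of a required subcommand can be sketched with a small parser. This version passes required=True directly to add_subparsers (available since Python 3.7), which is equivalent to setting subs.required afterwards. The hello subcommand and the run helper are hypothetical.

```python
import argparse

def cmd_hello(args):
    return f"hello {args.name}"

parser = argparse.ArgumentParser(prog="myapp")
subs = parser.add_subparsers(dest="command", required=True)
hello = subs.add_parser("hello")
hello.add_argument("name")
hello.set_defaults(func=cmd_hello)

def run(argv):
    """Dispatch to the handler, or return the exit code on a usage error."""
    try:
        args = parser.parse_args(argv)
    except SystemExit as exc:
        return exc.code
    return args.func(args)

print(run(["hello", "ada"]))  # hello ada
print(run([]))                # 2 (argparse also prints a usage error to stderr)
```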

Expected Output
validate\nValidating: report.csv\nprocess\nProcessing: data.csv -> output.json
Hints

Hint 1: add_subparsers() creates a special argument that captures the subcommand name via dest.

Hint 2: set_defaults(func=handler) attaches a callable to each subparser for dispatch.

Hint 3: Call args.func(args) in main() to dispatch to the correct handler without if/elif chains.

#7 stdin Detection: File or Pipe (Medium)
sys.stdin, isatty, stdin, piping

Implement an input router that reads from a file argument if provided, falls back to stdin if piped, and exits with code 2 if neither is available.

Python
import sys
import io

def count_lines(text):
    return len(text.strip().splitlines())

def read_input(filepath, stdin_stream):
    """Return input text from the file argument, or from stdin_stream if it's a pipe.

    In this exercise filepath stands in for the file's contents (no disk I/O).
    """
    if filepath is not None:
        return filepath
    if not stdin_stream.isatty():
        return stdin_stream.read()
    return None

# Scenario 1: file argument provided
file_data = "line1\nline2\nline3\nline4\nline5\nline6\nline7\nline8\nline9\nline10"
fake_stdin_tty = type('FakeTTY', (), {'isatty': lambda self: True, 'read': lambda self: ''})()
result = read_input(file_data, fake_stdin_tty)
print(f"file mode: {count_lines(result)} rows")

# Scenario 2: stdin is a pipe (isatty returns False)
pipe_data = "a\nb\nc\nd\ne"
fake_stdin_pipe = io.StringIO(pipe_data)
fake_stdin_pipe.isatty = lambda: False
result = read_input(None, fake_stdin_pipe)
print(f"pipe mode: {count_lines(result)} rows")

# Scenario 3: no file and stdin is a terminal (interactive, no data)
fake_stdin_tty2 = type('FakeTTY', (), {'isatty': lambda self: True, 'read': lambda self: ''})()
result = read_input(None, fake_stdin_tty2)
if result is None:
    print("no input: exit code 2")
Solution
import sys
import io

def count_lines(text):
    return len(text.strip().splitlines())

def read_input(filepath, stdin_stream):
    if filepath is not None:
        return filepath
    if not stdin_stream.isatty():
        return stdin_stream.read()
    return None

file_data = "line1\nline2\nline3\nline4\nline5\nline6\nline7\nline8\nline9\nline10"
fake_stdin_tty = type('FakeTTY', (), {'isatty': lambda self: True, 'read': lambda self: ''})()
result = read_input(file_data, fake_stdin_tty)
print(f"file mode: {count_lines(result)} rows")

pipe_data = "a\nb\nc\nd\ne"
fake_stdin_pipe = io.StringIO(pipe_data)
fake_stdin_pipe.isatty = lambda: False
result = read_input(None, fake_stdin_pipe)
print(f"pipe mode: {count_lines(result)} rows")

fake_stdin_tty2 = type('FakeTTY', (), {'isatty': lambda self: True, 'read': lambda self: ''})()
result = read_input(None, fake_stdin_tty2)
if result is None:
    print("no input: exit code 2")

Output:

file mode: 10 rows
pipe mode: 5 rows
no input: exit code 2

How it works: stdin_stream.isatty() returns True when stdin is connected to an interactive terminal — meaning the user is typing directly, not piping data. When isatty() is False, stdin is receiving data from a pipe (cat file | myapp) or a redirect (myapp < file). The function checks the file argument first, falls back to stdin if it is a pipe, and returns None if neither is available — which signals the caller to exit with code 2.

Key insight: The isatty() check is a common idiom for "is stdin a terminal, or is data being piped or redirected in?". The classic Unix tool pattern is: if a file argument is given, process it; otherwise read from stdin. This makes tools composable with the entire Unix ecosystem without any special flags. cat, grep, sort, wc, and awk all read stdin when no file argument is given. A CLI tool that forces users to explicitly pass --stdin misses this composability entirely.
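
The exercise passes the file's text in place of a path. A variant that reads a real file might look like the sketch below; the signature with an optional stdin parameter is an assumption made here for testability. Note that io.StringIO.isatty() already returns False, so a StringIO can stand in for a pipe without patching.

```python
import io
import sys
import tempfile
from pathlib import Path

def read_input(path, stdin=None):
    """Read text from path if given, else from stdin when it is not a terminal."""
    stdin = stdin if stdin is not None else sys.stdin
    if path is not None:
        return Path(path).read_text()
    if not stdin.isatty():
        return stdin.read()
    return None  # caller should report a usage error and exit with code 2

with tempfile.TemporaryDirectory() as tmpdir:
    sample = Path(tmpdir) / "sample.txt"
    sample.write_text("a\nb\n")
    from_file = read_input(sample)

from_pipe = read_input(None, io.StringIO("x\ny\nz\n"))

print(len(from_file.splitlines()))  # 2
print(len(from_pipe.splitlines()))  # 3
```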

Expected Output
file mode: 10 rows\npipe mode: 5 rows\nno input: exit code 2
Hints

Hint 1: sys.stdin.isatty() returns True when stdin is connected to a terminal (interactive), False when piped.

Hint 2: When isatty() is False, stdin has data from a pipe or redirect.

Hint 3: A tool that supports both file and stdin arguments is composable with cat, curl, and other Unix tools.

#8 The --dry-run Flag: Preview Before Mutation (Medium)
dry-run, argparse, destructive-operations, preview

Implement a batch renamer with a --dry-run flag that previews the renames without making any changes.

Python
import argparse

def rename_files(pairs, dry_run):
    """Rename (src, dst) pairs. With dry_run=True, only print what would happen."""
    for src, dst in pairs:
        if dry_run:
            print(f"[DRY RUN] would rename: {src} -> {dst}")
        else:
            print(f"Renamed: {src} -> {dst}")

def build_parser():
    parser = argparse.ArgumentParser(description="Batch rename files")
    parser.add_argument(
        "--dry-run",
        action="store_true",
        help="Preview renames without making changes",
    )
    return parser

pairs = [("old.csv", "new.csv"), ("data.csv", "data_backup.csv")]

# Dry run — preview only
args_dry = build_parser().parse_args(["--dry-run"])
rename_files(pairs, args_dry.dry_run)

# Real run — execute
args_real = build_parser().parse_args([])
rename_files(pairs, args_real.dry_run)
Solution
import argparse

def rename_files(pairs, dry_run):
    for src, dst in pairs:
        if dry_run:
            print(f"[DRY RUN] would rename: {src} -> {dst}")
        else:
            print(f"Renamed: {src} -> {dst}")

def build_parser():
    parser = argparse.ArgumentParser(description="Batch rename files")
    parser.add_argument(
        "--dry-run",
        action="store_true",
        help="Preview renames without making changes",
    )
    return parser

pairs = [("old.csv", "new.csv"), ("data.csv", "data_backup.csv")]

args_dry = build_parser().parse_args(["--dry-run"])
rename_files(pairs, args_dry.dry_run)

args_real = build_parser().parse_args([])
rename_files(pairs, args_real.dry_run)

Output:

[DRY RUN] would rename: old.csv -> new.csv
[DRY RUN] would rename: data.csv -> data_backup.csv
Renamed: old.csv -> new.csv
Renamed: data.csv -> data_backup.csv

How it works: action="store_true" sets args.dry_run = True when --dry-run is passed, and False by default. The rename_files function follows the same iteration logic in both modes — only the final action changes. This ensures the dry-run output faithfully represents what the real run will do. Without dry-run, args.dry_run is False and the function executes normally.

Key insight: --dry-run is one of the most important flags for any command that mutates files, sends requests, or changes databases. It lets engineers verify what will happen before committing to irreversible changes. The implementation rule is: dry-run and real execution must traverse the same code path. A dry-run that skips validation or pre-flight checks gives false confidence. Write the loop once, branch only at the mutation point.
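
The exercise only prints messages in both modes. A sketch that performs a real rename, with a pre-flight check shared by both paths, could look like this; returning the messages as a list and the specific check are choices made for this illustration.

```python
import tempfile
from pathlib import Path

def rename_files(pairs, dry_run):
    """Rename (src, dst) pairs; with dry_run=True only report what would happen."""
    messages = []
    for src, dst in pairs:
        src, dst = Path(src), Path(dst)
        if not src.is_file():  # pre-flight check runs in both modes
            raise FileNotFoundError(src)
        if dry_run:
            messages.append(f"[DRY RUN] would rename: {src.name} -> {dst.name}")
        else:
            src.rename(dst)    # the only line that mutates the filesystem
            messages.append(f"Renamed: {src.name} -> {dst.name}")
    return messages

with tempfile.TemporaryDirectory() as tmpdir:
    old, new = Path(tmpdir) / "old.csv", Path(tmpdir) / "new.csv"
    old.touch()
    preview = rename_files([(old, new)], dry_run=True)
    untouched = old.exists() and not new.exists()
    applied = rename_files([(old, new)], dry_run=False)
    moved = new.exists() and not old.exists()

print(preview[0])        # [DRY RUN] would rename: old.csv -> new.csv
print(untouched, moved)  # True True
```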

Expected Output
[DRY RUN] would rename: old.csv -> new.csv\n[DRY RUN] would rename: data.csv -> data_backup.csv\nRenamed: old.csv -> new.csv\nRenamed: data.csv -> data_backup.csv
Hints

Hint 1: --dry-run should print what WOULD happen without actually doing it.

Hint 2: The dry-run path and the real execution path should follow the same logic — only the final action differs.

Hint 3: Use action="store_true" for boolean flags like --dry-run.


Hard

#9 argparse choices and nargs: Multi-Value Arguments (Hard)
argparse, choices, nargs, multiple-values

Predict the output of an argparse configuration that uses choices to restrict a format option and nargs="+" to accept multiple column names.

Python
import argparse

def build_parser():
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--format",
        choices=["csv", "json", "tsv"],
        default="csv",
    )
    parser.add_argument(
        "--columns",
        nargs="+",
        metavar="COL",
        default=None,
    )
    return parser

parser = build_parser()

# Case 1: explicit format and multiple columns
args = parser.parse_args(["--format", "json", "--columns", "name", "age", "email"])
print(args.format)
print(args.columns)

# Case 2: default format, no columns
args2 = parser.parse_args([])
print(args2.format)
print(args2.columns)

# Case 3: inspect the valid choices
print(parser._option_string_actions["--format"].choices)
Solution
import argparse

def build_parser():
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--format",
        choices=["csv", "json", "tsv"],
        default="csv",
    )
    parser.add_argument(
        "--columns",
        nargs="+",
        metavar="COL",
        default=None,
    )
    return parser

parser = build_parser()

args = parser.parse_args(["--format", "json", "--columns", "name", "age", "email"])
print(args.format)
print(args.columns)

args2 = parser.parse_args([])
print(args2.format)
print(args2.columns)

print(parser._option_string_actions["--format"].choices)

Output:

json
['name', 'age', 'email']
csv
None
['csv', 'json', 'tsv']

How it works:

  1. --format json is valid because "json" is in the choices list. args.format is the string "json". If you passed --format xml, argparse would print "argument --format: invalid choice: 'xml'" and exit with code 2.

  2. --columns name age email uses nargs="+" which consumes all following non-flag tokens into a list. args.columns is ['name', 'age', 'email'].

  3. With no arguments, args.format falls back to the default="csv" and args.columns is None (no list created).

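  4. The choices list is stored on the action object and is also displayed in the --help output. Note that _option_string_actions is a private argparse attribute, used here only to inspect the parser; it is not part of the documented API.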

Key insight: choices performs both validation and self-documentation — the valid values appear in the --help output automatically as {csv,json,tsv}. For nargs, the difference between "+" (one or more), "*" (zero or more), "?" (zero or one), and an integer is critical. nargs="+" is the most common for column-name-style arguments because requiring at least one value avoids the confusing case of --columns with no values following it.
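
The nargs variants can be compared side by side with a small sketch. The helper parse_with is hypothetical and builds a fresh parser for each nargs value.

```python
import argparse

def parse_with(nargs, argv):
    """Parse argv with a --columns option that uses the given nargs."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--columns", nargs=nargs, default=None)
    return parser.parse_args(argv).columns

one_or_more = parse_with("+", ["--columns", "name"])
zero_or_more = parse_with("*", ["--columns"])
zero_or_one = parse_with("?", ["--columns"])
exactly_two = parse_with(2, ["--columns", "a", "b"])

print(one_or_more)   # ['name']
print(zero_or_more)  # []
print(zero_or_one)   # None (the flag is present without a value, so const, which defaults to None, is used)
print(exactly_two)   # ['a', 'b']
```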

Expected Output
json\n['name', 'age', 'email']\ncsv\nNone\n['csv', 'json', 'tsv']
Hints

Hint 1: choices= restricts values to a list; argparse produces an error for any value not in the list.

Hint 2: nargs="+" requires at least one value. nargs="*" allows zero or more.

Hint 3: When nargs="+" is used, the parsed result is always a list — even if only one value is given.

#10 Complete CLI: parser.error and Validated File Arguments (Hard)
argparse, parser.error, type-validation, exit-codes, ArgumentTypeError

Build a validated CLI parser that uses a custom file-existence type, choices, and parser.error(). Demonstrate both successful parsing and two error paths.

Python
import argparse
import sys
from pathlib import Path
import tempfile, os

def existing_file(value):
    p = Path(value)
    if not p.is_file():
        raise argparse.ArgumentTypeError(f"File not found: {value}")
    return p

def build_parser():
    parser = argparse.ArgumentParser(prog="convert")
    parser.add_argument("input", type=existing_file)
    parser.add_argument("-o", "--output", type=Path, required=True)
    parser.add_argument(
        "--format",
        choices=["csv", "json", "tsv"],
        default="json",
    )
    return parser

# Success: create a real file named input.csv in a temporary directory
# (a NamedTemporaryFile would get a random name, not "input.csv")
with tempfile.TemporaryDirectory() as tmpdir:
    tmp = os.path.join(tmpdir, "input.csv")
    Path(tmp).touch()
    parser = build_parser()
    args = parser.parse_args([tmp, "-o", "/tmp/out.json", "--format", "json"])
    print(args.input.name)
    print(str(args.output))
    print(args.format)

# Error 1: file does not exist
try:
    build_parser().parse_args(["missing.csv", "-o", "/tmp/x.json"])
except SystemExit as e:
    print(f"UsageError({e.code}): argument input: File not found: missing.csv")

# Error 2: invalid format choice
try:
    with tempfile.NamedTemporaryFile(suffix=".csv", delete=False) as f:
        tmp2 = f.name
    build_parser().parse_args([tmp2, "-o", "/tmp/x.json", "--format", "xml"])
except SystemExit as e:
    print(f"UsageError({e.code}): argument --format: invalid choice: 'xml'")
finally:
    os.unlink(tmp2)
Solution
import argparse
import sys
from pathlib import Path
import tempfile, os

def existing_file(value):
    p = Path(value)
    if not p.is_file():
        raise argparse.ArgumentTypeError(f"File not found: {value}")
    return p

def build_parser():
    parser = argparse.ArgumentParser(prog="convert")
    parser.add_argument("input", type=existing_file)
    parser.add_argument("-o", "--output", type=Path, required=True)
    parser.add_argument(
        "--format",
        choices=["csv", "json", "tsv"],
        default="json",
    )
    return parser

# Create a real file named input.csv so args.input.name is predictable
with tempfile.TemporaryDirectory() as tmpdir:
    tmp = os.path.join(tmpdir, "input.csv")
    Path(tmp).touch()
    parser = build_parser()
    args = parser.parse_args([tmp, "-o", "/tmp/out.json", "--format", "json"])
    print(args.input.name)
    print(str(args.output))
    print(args.format)

try:
    build_parser().parse_args(["missing.csv", "-o", "/tmp/x.json"])
except SystemExit as e:
    print(f"UsageError({e.code}): argument input: File not found: missing.csv")

try:
    with tempfile.NamedTemporaryFile(suffix=".csv", delete=False) as f:
        tmp2 = f.name
    build_parser().parse_args([tmp2, "-o", "/tmp/x.json", "--format", "xml"])
except SystemExit as e:
    print(f"UsageError({e.code}): argument --format: invalid choice: 'xml'")
finally:
    os.unlink(tmp2)

Output:

input.csv
/tmp/out.json
json
UsageError(2): argument input: File not found: missing.csv
UsageError(2): argument --format: invalid choice: 'xml'

How it works:

  1. existing_file is a custom type function — argparse calls it on the raw string tmp. If the file exists, it returns a Path object. If not, it raises ArgumentTypeError, which argparse catches and converts into a formatted error message followed by sys.exit(2).

  2. On success, args.input is already a Path object (the conversion happened inside the type function). args.input.name returns just the filename component.

  3. For "missing.csv", the type function raises ArgumentTypeError because Path("missing.csv").is_file() is False. Argparse exits with code 2.

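  4. For --format xml, argparse's built-in choices validation rejects the value and exits with code 2. --format has no custom type function here; when an argument has both, argparse converts the value with the type function first and then checks the result against choices.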

Key insight: parser.error(msg) and ArgumentTypeError both produce exit code 2 — the usage error code. This is a contract: code 2 always means "the user called the tool wrong", which is different from code 1 (runtime failure). Scripts and CI systems can distinguish between a logic failure (1) and a misconfiguration (2), allowing different retry and alerting behavior. Always validate argument values at parse time rather than in your business logic.
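
The problem code relies on argparse calling its error handling internally. Calling parser.error() directly is useful for checks that span several arguments, which a single type function cannot express. The --start and --end options and both helper functions below are hypothetical.

```python
import argparse

def parse(argv):
    parser = argparse.ArgumentParser(prog="convert")
    parser.add_argument("--start", type=int, default=0)
    parser.add_argument("--end", type=int, default=10)
    args = parser.parse_args(argv)
    # Cross-argument validation: report it the same way argparse reports its own errors.
    if args.start > args.end:
        parser.error("--start must not exceed --end")
    return args

def exit_code_for(argv):
    """Return 0 when parsing succeeds, else the SystemExit code."""
    try:
        parse(argv)
    except SystemExit as exc:
        return exc.code
    return 0

print(exit_code_for(["--start", "1", "--end", "5"]))  # 0
print(exit_code_for(["--start", "5", "--end", "1"]))  # 2 (usage and error message go to stderr)
```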

Expected Output
input.csv\n/tmp/out.json\njson\nUsageError(2): argument input: File not found: missing.csv\nUsageError(2): argument --format: invalid choice: 'xml'
Hints

Hint 1: parser.error(msg) prints the usage line + error message, then exits with code 2.

Hint 2: A custom type function receives the raw string and must raise ArgumentTypeError for invalid input.

Hint 3: Catching SystemExit lets you test parse errors without terminating the process.

#11 Build a Composable Pipeline Tool (Hard)
stdin, stdout, argparse, exit-codes, pipeline, composability

Implement a composable CSV column extractor that reads CSV from a string or stream, extracts specified columns, writes to stdout, and chains correctly in a simulated pipeline.

Python
import sys
import io
import csv
import argparse

def extract_columns(input_text, columns):
    """Extract named columns from CSV text. Return CSV string."""
    reader = csv.DictReader(io.StringIO(input_text))
    rows = list(reader)
    if not rows:
        return ""
    missing = [c for c in columns if c not in rows[0]]
    if missing:
        print(f"Error: columns not found: {missing}", file=sys.stderr)
        sys.exit(1)
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=columns, extrasaction="ignore")
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()

# Input CSV
csv_data = "name,age,city\nalice,30,NYC\nbob,25,LA\n"

# Step 1: extract name and age columns
result1 = extract_columns(csv_data, ["name", "age"])
# Print first two data lines (skip header)
lines = result1.strip().splitlines()
for line in lines[1:]:
    print(line)

print("---")

# Step 2: chain — extract just the name column from step 1's output
result2 = extract_columns(result1, ["name"])
lines2 = result2.strip().splitlines()
for line in lines2[1:]:
    print(line)

print("---")

# Verify exit code behavior: missing column triggers sys.exit(1)
stderr_capture = io.StringIO()
old_stderr = sys.stderr
sys.stderr = stderr_capture
try:
    extract_columns(csv_data, ["nonexistent"])
except SystemExit as e:
    print(e.code == 1)
finally:
    sys.stderr = old_stderr

# Verify the error message went to stderr, not stdout
print("nonexistent" in stderr_capture.getvalue())
Solution
import sys
import io
import csv
import argparse

def extract_columns(input_text, columns):
    reader = csv.DictReader(io.StringIO(input_text))
    rows = list(reader)
    if not rows:
        return ""
    missing = [c for c in columns if c not in rows[0]]
    if missing:
        print(f"Error: columns not found: {missing}", file=sys.stderr)
        sys.exit(1)
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=columns, extrasaction="ignore")
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()

csv_data = "name,age,city\nalice,30,NYC\nbob,25,LA\n"

result1 = extract_columns(csv_data, ["name", "age"])
lines = result1.strip().splitlines()
for line in lines[1:]:
    print(line)

print("---")

result2 = extract_columns(result1, ["name"])
lines2 = result2.strip().splitlines()
for line in lines2[1:]:
    print(line)

print("---")

stderr_capture = io.StringIO()
old_stderr = sys.stderr
sys.stderr = stderr_capture
try:
    extract_columns(csv_data, ["nonexistent"])
except SystemExit as e:
    print(e.code == 1)
finally:
    sys.stderr = old_stderr

print("nonexistent" in stderr_capture.getvalue())

Output:

alice,30
bob,25
---
alice
bob
---
True
True

How it works:

  1. extract_columns reads CSV text using csv.DictReader, validates that all requested columns exist, then writes a new CSV with only those columns. The output is a string — making it composable: the output of one call is a valid input to the next.

  2. In step 1, extracting ["name", "age"] from the full CSV drops the city column. In step 2, the output of step 1 is passed directly to another call that extracts only ["name"] — this is pipeline chaining.

  3. When "nonexistent" is requested, the function prints an error to sys.stderr and calls sys.exit(1). The SystemExit exception's .code is 1. The error message goes to stderr, not stdout, so a downstream command in a pipeline would receive no data on its stdin, and the non-zero exit code tells the shell or calling script that the step failed.

  4. The stderr capture confirms the error message went to the error stream, leaving stdout clean.

Key insight: A composable CLI tool has three properties: (1) it reads from a file or stdin, (2) it writes data only to stdout, (3) it writes diagnostics only to stderr. This trio allows tools to be chained without any modification: extract --columns name,age data.csv | sort | uniq -c. The exit code contract completes the picture: command chains joined with && stop at the first non-zero exit, and in shells that support it, set -o pipefail makes a pipeline report failure when any stage fails, so errors propagate rather than silently continuing with bad data.

Expected Output
alice,30\nbob,25\n---\nalice\nbob\n---\nTrue\nTrue
Hints

Hint 1: A composable tool reads from stdin or a file, writes data to stdout, and sends diagnostics to stderr.

Hint 2: sys.stdout.isatty() lets you detect whether stdout is a terminal. When it is not (piped or redirected), skip decorative formatting such as colors.

Hint 3: Chain the tool by passing output of one invocation as input of the next using io.StringIO.

© 2026 EngineersOfAI. All rights reserved.