Project Structure - Organizing Python Projects Like a Senior Engineer

Reading time: ~25 minutes | Level: Foundation → Engineering

# Two projects. Same logic. Wildly different outcomes.

# Project A - six months later:
from utils import helper          # which utils? there are three
from helpers import utils         # circular import, again
import config                     # is this the test config or prod config?
# File: everything.py - 2,847 lines

# Project B - six months later:
from myapp.services.auth import authenticate_user
from myapp.models.user import User
from myapp.config import settings
# Every import tells a story. Every file has one job.

Project structure is the first decision you make and the last thing you want to refactor. Get it right at the start, and the codebase grows predictably. Get it wrong, and every new feature fights the structure you chose on day one.

This lesson covers how senior engineers think about structure - not as a style preference, but as an architectural decision that determines how hard the next six months of development will be.

What You Will Learn

The three project sizes and which structure fits each
The src layout and why it prevents an entire class of subtle bugs
How to write a complete pyproject.toml from scratch
What belongs in __init__.py and what absolutely does not
How to define clean module boundaries so circular imports never happen
When and how to split a growing module into a sub-package
How to make a package runnable with python -m mypackage
Virtual environment best practices and requirements management

Prerequisites

Comfortable writing Python functions and classes across multiple files
Basic understanding of import statements and how Python finds modules
Familiarity with the terminal and running Python scripts

The Three Project Sizes

Before picking a structure, identify which size project you have. Over-engineering a script is waste. Under-engineering an application is debt.

Size 1: The Script

A script is a single file that does one job. It is not imported by anything else. It is run directly.

project/
├── process_csv.py
├── requirements.txt   # optional
└── README.md

When to use it: Data processing one-offs, automation tasks, personal utilities. If the entire logic fits in one file and you never need to import it from another project, keep it a script.

The trap: Scripts grow. "I'll just add one more function" is how you end up with a 1,200-line script that nobody can understand. When a script exceeds roughly 200 lines or you start thinking "I need to test this," it's time to promote it to a package.

Size 2: The Package

A package is importable. It has an __init__.py. It can be installed with pip. It is tested with pytest.

project/
├── mypackage/
│   ├── __init__.py
│   ├── core.py
│   └── utils.py
├── tests/
│   └── test_core.py
├── pyproject.toml
└── README.md

When to use it: Libraries, shared utilities, tools you want to distribute or reuse across projects.

Size 3: The Application

An application has a user-facing interface (CLI, API, web), configuration management, external dependencies, and usually a deployment story.

project/
├── src/
│   └── myapp/
│       ├── __init__.py
│       ├── models.py
│       ├── services.py
│       ├── routes.py
│       └── config.py
├── tests/
├── docs/
├── pyproject.toml
├── .env.example
└── README.md

When to use it: APIs, web applications, complex CLI tools, anything with more than one interface layer.

The src Layout - The Modern Standard

The most important structural decision for any package or application is whether to use the src layout.

The Problem With Root-Level Packages

The naive approach puts your package at the project root:

myproject/
├── mypackage/        # package at root
│   ├── __init__.py
│   └── core.py
├── tests/
└── pyproject.toml

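This looks fine but has a critical flaw. When you start python or run python -m pytest from myproject/, Python puts the current directory on sys.path, and plain pytest often has the same effect because its default import mode inserts test and conftest.py base directories into sys.path. This means Python finds your mypackage/ directory directly - without installing it.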

The consequence: your tests pass against the uninstalled source tree, not against the installed package. If you have a bug in your pyproject.toml that causes a file to be excluded from the installed package, your tests still pass because they never use the installed version. You discover the bug only when a user installs your package and it breaks.

The src Layout Solution

myproject/
├── src/
│   └── mypackage/    # package inside src/
│       ├── __init__.py
│       └── core.py
├── tests/
└── pyproject.toml

With this layout, src/ is NOT automatically on sys.path. The only way Python can find mypackage is if you have installed it - even in editable mode with pip install -e .. This guarantees your tests run against the installed package, exactly as your users will experience it.

# First-time setup with src layout
python -m venv .venv
source .venv/bin/activate        # or .venv\Scripts\activate on Windows
pip install -e ".[dev]"          # install in editable mode with dev extras

# Now mypackage is importable from anywhere
python -c "import mypackage; print(mypackage.__version__)"
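To see which copy of a package the interpreter would actually load, you can ask the import system where a module resolves to. The sketch below is a minimal illustration, not part of any tool: import_origin and loaded_from are hypothetical helper names, and json stands in for your own package.

```python
import importlib.util
from pathlib import Path
from typing import Optional


def import_origin(module_name: str) -> Optional[Path]:
    """Return the file a top-level module would be loaded from, or None."""
    spec = importlib.util.find_spec(module_name)
    if spec is None or not spec.has_location:
        return None  # not importable, built-in/frozen, or a namespace package
    return Path(spec.origin).resolve()


def loaded_from(module_name: str, directory: Path) -> bool:
    """True if the module resolves to a file somewhere under `directory`."""
    origin = import_origin(module_name)
    return origin is not None and directory.resolve() in origin.parents


if __name__ == "__main__":
    # "json" is a stand-in; substitute your own package name.
    print(import_origin("json"))
    print(loaded_from("json", Path.cwd() / "src"))
```

With a regular install the path points into site-packages; with a setuptools editable install it typically points back into src/. If it points at a package directory in the project root, you are testing the uninstalled tree.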

Full Application Structure

Here is the complete structure for a production Python application:

myproject/
├── src/
│   └── myapp/
│       ├── __init__.py          # version, public API
│       ├── __main__.py          # python -m myapp entry point
│       ├── config.py            # settings from env vars / config files
│       ├── models.py            # data structures (dataclasses, Pydantic)
│       ├── services.py          # business logic
│       ├── routes.py            # CLI or HTTP interface layer
│       ├── utils.py             # generic, stateless helpers
│       └── exceptions.py        # custom exception hierarchy
├── tests/
│   ├── __init__.py
│   ├── conftest.py              # pytest fixtures
│   ├── test_models.py
│   ├── test_services.py
│   └── integration/
│       └── test_end_to_end.py
├── docs/
│   ├── index.md
│   └── api.md
├── scripts/
│   └── seed_data.py             # one-off scripts, not part of the package
├── pyproject.toml               # single source of truth for all config
├── .env.example                 # template for environment variables
├── .gitignore
└── README.md

pyproject.toml - The Complete Picture

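pyproject.toml replaces setup.py and setup.cfg, and it can also hold the configuration that used to live in separate files for tools such as pytest, mypy, and tox. It is the single source of truth for your project configuration; pinned requirements files remain useful as lockfiles (see Requirements Management below).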

Build System

[build-system]
requires = ["setuptools>=68", "setuptools-scm"]
build-backend = "setuptools.build_meta"

This tells pip which tool to use to build your package. setuptools is the default. Alternatives include flit-core (simpler) and hatchling (more features).

Project Metadata

[project]
name = "myapp"
version = "0.1.0"
description = "A professional Python application"
readme = "README.md"
license = { file = "LICENSE" }
authors = [
    { name = "Your Name", email = "you@example.com" },  # placeholder address
]
requires-python = ">=3.11"
dependencies = [
    "httpx>=0.27",
    "pydantic>=2.0",
    "click>=8.1",
]

[project.optional-dependencies]
dev = [
    "pytest>=8.0",
    "pytest-cov",
    "mypy>=1.8",
    "ruff>=0.3",
    "black>=24.0",
]
docs = [
    "mkdocs>=1.5",
    "mkdocs-material",
]

Key decisions in dependencies:

Pin a minimum version (>=0.27), not an exact version (==0.27.0). Exact pinning in a library forces conflicts on users.
Use optional-dependencies for dev tools. Users should not install pytest when they pip install myapp.
For applications (not libraries), you CAN pin exact versions for reproducibility, but manage this through a lockfile (see Requirements Management below).

Entry Points - Making Your Package a Command

[project.scripts]
myapp = "myapp.routes:cli"
myapp-admin = "myapp.admin:admin_main"

After pip install myapp, the user can run myapp on the command line and Python calls cli() from myapp/routes.py. This is cleaner than shipping shell scripts or requiring python -m myapp.
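A module:function reference is just an import followed by an attribute lookup. The sketch below shows roughly how such a reference is resolved; resolve_entry_point is a hypothetical helper written for illustration (the real launcher is generated by the installer), and json:dumps stands in for your own CLI function.

```python
import importlib
from typing import Any


def resolve_entry_point(reference: str) -> Any:
    """Resolve a 'package.module:object' reference to the object it names."""
    module_name, _, attr_path = reference.partition(":")
    if not module_name or not attr_path:
        raise ValueError(f"expected 'module:object', got {reference!r}")
    target: Any = importlib.import_module(module_name)
    for attr in attr_path.split("."):
        target = getattr(target, attr)  # AttributeError if the name is missing
    return target


if __name__ == "__main__":
    # Stand-in target from the standard library.
    dumps = resolve_entry_point("json:dumps")
    print(dumps({"ok": True}))
```

This is why a typo in the function name only surfaces when the command is run: nothing imports the target until then.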

Tool Configuration

[tool.setuptools.packages.find]
where = ["src"]                  # tells setuptools where packages live

[tool.black]
line-length = 88
target-version = ["py311", "py312"]

[tool.ruff]
line-length = 88
target-version = "py311"

[tool.ruff.lint]
select = ["E", "F", "I", "N", "UP", "B"]
ignore = ["E501"]

[tool.mypy]
python_version = "3.11"
strict = true
ignore_missing_imports = false

[tool.pytest.ini_options]
testpaths = ["tests"]
addopts = "-v --tb=short --cov=src/myapp --cov-report=term-missing"
filterwarnings = [
    "error",
    "ignore::DeprecationWarning:pkg_resources",
]

[tool.coverage.run]
source = ["src/myapp"]
omit = ["*/tests/*", "*/__main__.py"]

Having all configuration in one file means:

New team members run pip install -e ".[dev]" and everything works
CI has one file to read, not six
Tool settings are version-controlled alongside the code they govern

`__init__.py` - Defining Your Public API

__init__.py is what makes a directory a Python package. But it does something more important: it defines your public API - what users of your package are supposed to import.

What Belongs in `__init__.py`

# src/myapp/__init__.py

"""
myapp - A professional Python application.

Public API:
    - MyModel: the primary data model
    - process: the main processing function
    - MyAppError: base exception class
"""

__version__ = "0.1.0"
__author__ = "Your Name"

# Re-export the public interface
from myapp.models import MyModel
from myapp.services import process
from myapp.exceptions import MyAppError

__all__ = ["MyModel", "process", "MyAppError", "__version__"]

By re-exporting from __init__.py, users can write:

from myapp import MyModel, process
# instead of:
from myapp.models import MyModel
from myapp.services import process

The internal module organization is an implementation detail. The public API is stable.

What Does NOT Belong in `__init__.py`

# BAD - never do this in __init__.py

# Don't put business logic here
def process_data(data):
    ...  # 50 lines of logic in __init__.py

# Don't put expensive imports at module level (every importer of the package pays for them)
import tensorflow as tf          # slow import blocks every caller
import numpy as np

# Don't execute code with side effects at import time
database.connect()               # nightmare for testing

# Don't use star imports - they hide what you're actually using
from myapp.services import *

The rule: __init__.py should contain declarations (version, __all__, re-exports), never computation.
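The reason for the rule is that a package's __init__.py executes before any of its submodules, the first time anything in the package is imported. The sketch below demonstrates this with a throwaway package written to a temporary directory; init_demo_pkg and import_log are hypothetical names used only for the demonstration.

```python
import importlib
import sys
import tempfile
from pathlib import Path

PACKAGE = "init_demo_pkg"  # hypothetical throwaway package name


def import_log() -> list[str]:
    """Build a tiny package on disk, import a submodule twice, return its log."""
    with tempfile.TemporaryDirectory() as tmp:
        pkg = Path(tmp) / PACKAGE
        pkg.mkdir()
        (pkg / "__init__.py").write_text('LOG = ["__init__ executed"]\n')
        (pkg / "core.py").write_text(
            f'from {PACKAGE} import LOG\nLOG.append("core executed")\n'
        )
        sys.path.insert(0, tmp)
        importlib.invalidate_caches()
        try:
            importlib.import_module(f"{PACKAGE}.core")  # runs __init__, then core
            importlib.import_module(f"{PACKAGE}.core")  # cached: nothing re-runs
            return list(sys.modules[PACKAGE].LOG)
        finally:
            # Leave the interpreter as we found it.
            sys.path.remove(tmp)
            for name in [m for m in sys.modules if m.split(".")[0] == PACKAGE]:
                del sys.modules[name]


if __name__ == "__main__":
    print(import_log())
```

Importing only the core submodule still ran __init__.py first, which is why a slow import or a database connection placed there affects every caller, including tests that only need one small module.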

Nested Packages: Sub-package `init.py`

# src/myapp/auth/__init__.py
from myapp.auth.jwt import create_token, verify_token
from myapp.auth.password import hash_password, check_password

__all__ = ["create_token", "verify_token", "hash_password", "check_password"]

Module Boundaries - One Responsibility Per Module

The single most important structural rule: each module does one thing.

The Standard Module Roles

myapp/
├── models.py       # Data structures - what does the data look like?
├── services.py     # Business logic - what can you do with the data?
├── routes.py       # Interface layer - how does the outside world call services?
├── config.py       # Configuration - what environment does the app run in?
├── utils.py        # Generic helpers - pure functions with no domain knowledge
└── exceptions.py   # Error hierarchy - what can go wrong?

models.py - data shapes, nothing else:

# models.py
from dataclasses import dataclass, field
from datetime import datetime

@dataclass
class User:
    id: int
    email: str
    name: str
    created_at: datetime = field(default_factory=datetime.now)
    is_active: bool = True

@dataclass
class Order:
    id: int
    user_id: int
    items: list[str]
    total: float
    created_at: datetime = field(default_factory=datetime.now)

Notice: no database calls, no validation logic, no HTTP requests. Pure data.

services.py - business logic, imports from models.py:

# services.py
from dataclasses import replace
from itertools import count

from myapp.exceptions import UserNotFoundError
from myapp.models import Order, User

# Placeholder ID source for this example; a real application would
# typically get IDs from its database.
_order_ids = count(1)

def _next_id() -> int:
    return next(_order_ids)

def create_order(user: User, items: list[str], prices: dict[str, float]) -> Order:
    if not user.is_active:
        raise UserNotFoundError(f"User {user.id} is not active")

    total = sum(prices[item] for item in items)
    return Order(id=_next_id(), user_id=user.id, items=items, total=total)

def apply_discount(order: Order, discount_pct: float) -> Order:
    if not 0 <= discount_pct <= 100:
        raise ValueError(f"Invalid discount: {discount_pct}")
    discounted_total = order.total * (1 - discount_pct / 100)
    # replace() copies the dataclass with only the named field changed
    return replace(order, total=discounted_total)

routes.py - CLI or HTTP handlers, thin layer over services:

# routes.py - a Click CLI interface
import click
from myapp.services import create_order, apply_discount
from myapp.models import User

@click.group()
def cli():
    """Order management tool."""

@cli.command()
@click.argument("user_id", type=int)
@click.argument("items", nargs=-1)
def order(user_id: int, items: tuple[str, ...]):
    """Create a new order."""
    # _load_user and _load_prices are data-access helpers, omitted here for brevity
    user = _load_user(user_id)
    new_order = create_order(user, list(items), _load_prices())
    click.echo(f"Order {new_order.id} created. Total: ${new_order.total:.2f}")

utils.py - generic helpers with zero domain knowledge:

# utils.py
import hashlib
import re
from pathlib import Path

def slugify(text: str) -> str:
    """Convert a string to a URL-safe slug."""
    text = text.lower().strip()
    text = re.sub(r"[^\w\s-]", "", text)
    return re.sub(r"[-\s]+", "-", text)

def sha256_file(path: Path) -> str:
    """Compute the SHA-256 hash of a file."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()

def chunk(lst: list, size: int) -> list[list]:
    """Split a list into chunks of at most `size` elements."""
    return [lst[i:i + size] for i in range(0, len(lst), size)]

Notice: no imports from myapp itself. utils.py is completely portable.

Circular Imports: The Design Smell

Circular imports happen when module A imports from module B, and module B imports from module A.

# BAD: circular import
# models.py
from myapp.services import validate_user    # models importing from services!

# services.py
from myapp.models import User               # services importing from models

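This typically fails with ImportError: cannot import name 'X' from partially initialized module. With plain import statements the failure can be delayed instead: you get an AttributeError the first time code touches a name that the half-initialized module has not defined yet.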
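The failure is easy to reproduce in isolation. The sketch below writes two throwaway modules that import names from each other into a temporary directory and captures the resulting error; cycle_models, cycle_services, and circular_import_error are hypothetical names for the demonstration.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Two hypothetical modules that import names from each other.
MODULES = {
    "cycle_models": (
        "from cycle_services import validate_user\n\n"
        "class User:\n    pass\n"
    ),
    "cycle_services": (
        "from cycle_models import User\n\n"
        "def validate_user(user):\n    return True\n"
    ),
}


def circular_import_error() -> str:
    """Import the cyclic pair and return the ImportError message ('' if none)."""
    with tempfile.TemporaryDirectory() as tmp:
        for name, source in MODULES.items():
            (Path(tmp) / f"{name}.py").write_text(source)
        sys.path.insert(0, tmp)
        importlib.invalidate_caches()
        try:
            importlib.import_module("cycle_models")
        except ImportError as exc:
            return str(exc)
        finally:
            # Leave the interpreter as we found it.
            sys.path.remove(tmp)
            for name in MODULES:
                sys.modules.pop(name, None)
    return ""


if __name__ == "__main__":
    print(circular_import_error())
```

The error names User because cycle_models had only executed its first line when cycle_services asked for the class; the module existed in sys.modules but was not finished.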

The fix is structural, not syntactic. If two modules need each other, one of them has too many responsibilities. The standard fix:

Move the shared logic to a third module that both can import
Restructure so imports point in one direction only: routes imports services, services imports models, and never the reverse
Use lazy imports inside functions (last resort - it hides the design problem)

The dependency flow in a well-structured project is always one-directional (read each arrow as "imports from"):

config.py     → (nothing)
exceptions.py → (nothing)
utils.py      → (nothing)
models.py     → exceptions
services.py   → models, exceptions, config
routes.py     → services, models
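The import table above is a directed graph, so "no circular imports" is the same as "the graph has no cycle". A minimal sketch of that check using depth-first search follows; IMPORTS mirrors the table, and find_cycle is a hypothetical helper, not an existing tool.

```python
from typing import Optional

# Each key imports from the modules in its list (mirrors the table above).
IMPORTS = {
    "config": [],
    "exceptions": [],
    "utils": [],
    "models": ["exceptions"],
    "services": ["models", "exceptions", "config"],
    "routes": ["services", "models"],
}


def find_cycle(graph: dict[str, list[str]]) -> Optional[list[str]]:
    """Return one import cycle as a list of module names, or None if acyclic."""
    visiting: list[str] = []  # modules on the current DFS path
    done: set[str] = set()    # modules already fully explored

    def visit(node: str) -> Optional[list[str]]:
        if node in done:
            return None
        if node in visiting:
            # We came back to a module on the current path: that is a cycle.
            return visiting[visiting.index(node):] + [node]
        visiting.append(node)
        for dependency in graph.get(node, []):
            cycle = visit(dependency)
            if cycle:
                return cycle
        visiting.pop()
        done.add(node)
        return None

    for module in graph:
        cycle = visit(module)
        if cycle:
            return cycle
    return None


if __name__ == "__main__":
    print(find_cycle(IMPORTS))
    print(find_cycle({"models": ["services"], "services": ["models"]}))
```

The first call finds nothing for the layered layout, while the second reports the models/services loop from the bad example.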

When to Split Into Sub-packages

A module has grown too large when:

It exceeds roughly 500 lines
It has more than 3 distinct responsibilities
You find yourself writing separator comments like # --- AUTH SECTION --- inside a single file

Convert a module to a sub-package by turning it into a directory:

# Before: myapp/services.py (800 lines)

# After:
myapp/services/
├── __init__.py       # re-exports the public API
├── auth.py           # authentication logic
├── orders.py         # order processing logic
└── notifications.py  # email/SMS notification logic

# myapp/services/__init__.py
from myapp.services.auth import authenticate, create_session
from myapp.services.orders import create_order, cancel_order
from myapp.services.notifications import send_confirmation

__all__ = [
    "authenticate", "create_session",
    "create_order", "cancel_order",
    "send_confirmation",
]

The key: callers do not need to change. Code that did from myapp.services import create_order still works exactly the same. The split is an internal refactor with a stable external API.
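A cheap guard for this kind of refactor is to confirm that every name promised in __all__ still exists after the split. The sketch below is one way to do that; missing_exports is a hypothetical helper, and the in-memory module stands in for myapp.services.

```python
import types


def missing_exports(module: types.ModuleType) -> list[str]:
    """Names listed in module.__all__ that the module does not actually define."""
    declared = getattr(module, "__all__", [])
    return [name for name in declared if not hasattr(module, name)]


if __name__ == "__main__":
    # Stand-in for myapp.services after the split, built in memory for the demo.
    services = types.ModuleType("services")
    services.create_order = lambda *args: None
    services.__all__ = ["create_order", "cancel_order"]
    print(missing_exports(services))  # cancel_order was never re-exported
```

Running a check like this in a test catches the common mistake of moving a function into a new file and forgetting the re-export.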

`__main__.py` - Making a Package Runnable

# src/myapp/__main__.py

"""
Entry point for `python -m myapp`.

Usage:
    python -m myapp                    # show help
    python -m myapp process input.csv
    python -m myapp --version
"""

from myapp.routes import cli

if __name__ == "__main__":
    cli()

With this file, users can run:

python -m myapp process input.csv
# equivalent to the installed script:
myapp process input.csv

This is particularly useful for:

Development before installing the entry point
Running the package from a Docker container or CI environment
Allowing python -m as an alternative invocation to the installed command
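The standard-library runpy module exposes the same lookup that python -m performs, which makes the mechanism easy to observe from within Python. The sketch below builds a throwaway package with a __main__.py and runs it that way; runnable_demo and run_as_main are hypothetical names for the demonstration.

```python
import contextlib
import importlib
import io
import runpy
import sys
import tempfile
from pathlib import Path

PACKAGE = "runnable_demo"  # hypothetical throwaway package name

FILES = {
    "__init__.py": "",
    "routes.py": 'def cli():\n    print("cli() was called")\n',
    "__main__.py": (
        f"from {PACKAGE}.routes import cli\n\n"
        'if __name__ == "__main__":\n    cli()\n'
    ),
}


def run_as_main() -> str:
    """Run the package much as `python -m` would and return what it printed."""
    with tempfile.TemporaryDirectory() as tmp:
        pkg = Path(tmp) / PACKAGE
        pkg.mkdir()
        for filename, source in FILES.items():
            (pkg / filename).write_text(source)
        sys.path.insert(0, tmp)
        importlib.invalidate_caches()
        captured = io.StringIO()
        try:
            with contextlib.redirect_stdout(captured):
                # For a package name, runpy executes its __main__ submodule.
                runpy.run_module(PACKAGE, run_name="__main__")
        finally:
            # Leave the interpreter as we found it.
            sys.path.remove(tmp)
            for name in [m for m in sys.modules if m.split(".")[0] == PACKAGE]:
                del sys.modules[name]
        return captured.getvalue()


if __name__ == "__main__":
    print(run_as_main(), end="")
```

Because __main__.py only imports and calls cli(), the same function can be tested directly without going through the entry point.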

`.gitignore` for Python Projects

# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class

# Distribution / packaging
dist/
build/
*.egg-info/
*.egg
MANIFEST

# Virtual environments
.venv/
venv/
env/
ENV/

# Environment variables - NEVER commit secrets
.env
.env.local
.env.*.local

# Testing
.pytest_cache/
.coverage
htmlcov/
.tox/

# Type checking
.mypy_cache/

# IDEs
.idea/
.vscode/
*.swp
*.swo

# OS-specific
.DS_Store
Thumbs.db

The most critical entries: .env (contains secrets) and .venv/ (never commit your virtual environment - it contains absolute paths and platform-specific binaries).

Virtual Environments - Best Practices

Creating and Activating

# Create a virtual environment in the project root
python -m venv .venv

# Activate (macOS/Linux)
source .venv/bin/activate

# Activate (Windows)
.venv\Scripts\activate

# Verify you're in the right environment
which python        # should show .venv/bin/python
python --version    # confirm the Python version

# Deactivate when done
deactivate
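The same verification can be done from inside Python, which is handy on Windows where which is not available. This is a small sketch; in_virtualenv is a hypothetical helper name, and the comparison relies on sys.prefix differing from sys.base_prefix inside a venv.

```python
import sys


def in_virtualenv() -> bool:
    """True when the interpreter is running inside a venv-style environment."""
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print("interpreter:", sys.executable)
    print("virtual environment active:", in_virtualenv())
```

If the interpreter path printed here is not inside .venv, the environment was not activated in the current shell.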

Why .venv in the project root (and not ~/.virtualenvs/)?

It's discoverable - any developer cloning the repo knows where to look
Most editors (VS Code, PyCharm) auto-detect it
It's easy to delete and recreate: rm -rf .venv && python -m venv .venv

Development Setup - The Full Workflow

# Clone repo
git clone https://github.com/org/myapp
cd myapp

# Create environment
python -m venv .venv
source .venv/bin/activate

# Install everything (package + dev tools)
pip install -e ".[dev]"

# Verify installation
myapp --version          # entry point works
pytest                   # all tests pass
mypy src/                # no type errors

Requirements Management

There are three approaches, each with trade-offs:

Approach 1: `requirements.txt` (legacy, still common)

# requirements.txt - for applications that need exact pinned versions
httpx==0.27.0
pydantic==2.6.1
click==8.1.7

# requirements-dev.txt - dev tools
-r requirements.txt
pytest==8.1.1
mypy==1.9.0
black==24.2.0

Problem: You manually maintain versions. When you upgrade one package, its transitive dependencies might conflict. This is error-prone at scale.

Approach 2: `pyproject.toml` with ranges (modern)

[project]
dependencies = [
    "httpx>=0.27",
    "pydantic>=2.0,<3.0",
    "click>=8.1",
]

Ranges allow flexibility but don't guarantee reproducibility. Two developers might get different versions depending on when they installed.

Approach 3: `pip-compile` - The Best of Both Worlds

pip install pip-tools

# requirements.in - human-managed, just your direct deps
# httpx>=0.27
# pydantic>=2.0
# click>=8.1

# Generate the lockfile
pip-compile requirements.in -o requirements.txt

The generated requirements.txt is fully pinned with all transitive dependencies resolved. Commit both files: requirements.in (intent) and requirements.txt (lockfile). Upgrade with:

pip-compile --upgrade requirements.in
# then commit the updated requirements.txt
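To confirm that a committed requirements.txt really is a lockfile, you can scan it for lines that are not pinned. The sketch below is deliberately naive, assuming one requirement per line and treating any line containing == as pinned; unpinned_requirements is a hypothetical helper, not a pip-tools feature.

```python
def unpinned_requirements(text: str) -> list[str]:
    """Return requirement lines that are not pinned with '=='."""
    loose = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()   # drop comments and whitespace
        if not line or line.startswith("-"):  # skip blanks and options like -r
            continue
        if "==" not in line:
            loose.append(line)
    return loose


if __name__ == "__main__":
    lockfile = "httpx==0.27.0\npydantic==2.6.1\nclick==8.1.7\n"
    intent = "httpx>=0.27\npydantic>=2.0\nclick>=8.1\n"
    print(unpinned_requirements(lockfile))
    print(unpinned_requirements(intent))
```

An empty result for requirements.txt and a non-empty one for requirements.in is the expected shape: ranges express intent, pins record what was resolved.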

Modern alternative: Use uv (Rust-based, extremely fast) for the entire workflow:

# uv can stand in for pip, pip-tools, and virtualenv
uv venv                                              # create .venv
uv pip install -e ".[dev]"                           # install deps (typically much faster than pip)
uv pip compile pyproject.toml -o requirements.txt    # write a pinned lockfile

Putting It All Together - A Complete Example

invoicegen/
├── src/
│   └── invoicegen/
│       ├── __init__.py
│       ├── __main__.py
│       ├── config.py
│       ├── exceptions.py
│       ├── models.py
│       ├── services.py
│       ├── routes.py
│       └── utils.py
├── tests/
│   ├── __init__.py
│   ├── conftest.py
│   ├── test_models.py
│   ├── test_services.py
│   └── test_utils.py
├── docs/
│   └── index.md
├── pyproject.toml
├── .env.example
├── .gitignore
└── README.md

# src/invoicegen/__init__.py
"""invoicegen - Generate professional PDF invoices from the command line."""

__version__ = "1.0.0"

from invoicegen.models import Invoice, LineItem
from invoicegen.services import generate_invoice, calculate_total
from invoicegen.exceptions import InvoiceError

__all__ = ["Invoice", "LineItem", "generate_invoice", "calculate_total", "InvoiceError"]

# pyproject.toml
[build-system]
requires = ["setuptools>=68"]
build-backend = "setuptools.build_meta"

[project]
name = "invoicegen"
version = "1.0.0"
description = "Generate professional PDF invoices from the command line"
requires-python = ">=3.11"
dependencies = [
    "click>=8.1",
    "reportlab>=4.0",
    "pydantic>=2.0",
]

[project.optional-dependencies]
dev = ["pytest>=8.0", "pytest-cov", "mypy>=1.8", "ruff>=0.3"]

[project.scripts]
invoicegen = "invoicegen.routes:cli"

[tool.setuptools.packages.find]
where = ["src"]

[tool.pytest.ini_options]
testpaths = ["tests"]
addopts = "--tb=short --cov=src/invoicegen"

[tool.mypy]
strict = true

Interview Questions

Q1: What is the src layout and why do senior engineers prefer it?

Answer: The src layout places your package inside a src/ directory rather than at the project root. Without it, running Python or pytest from the project root usually puts that directory on sys.path, which means your tests can import the package directly from the source tree - even if the installed package is broken or missing files. With the src layout, the only way Python can find your package is through the installed version (even editable installs via pip install -e .). This guarantees your tests validate the same artifact your users will install. It also enforces that your pyproject.toml is correct, because installation is required before anything works. The trade-off is a slightly more complex initial setup, which is why many small scripts and tutorials omit it.

Q2: What should and should not go in `__init__.py`?

Answer: __init__.py should contain version metadata (__version__), the __all__ list, and re-exports that define your package's public API. By re-exporting key classes and functions, you decouple the internal module organization from the public interface - callers use from mypackage import UserModel regardless of whether UserModel lives in models.py or models/user.py. What should NOT be in __init__.py: business logic, expensive computations, database connections, or any code with side effects. Everything in __init__.py runs the first time any part of your package is imported in a process, so every importer pays for it and side effects there become hidden traps. Star imports (from module import *) should also be avoided - use explicit re-exports instead.

Q3: How do you diagnose and fix a circular import error?

Answer: Circular imports occur when module A imports from B and B imports from A, creating a dependency cycle. Python handles this by handing the second module a partially executed copy of the first, which often results in ImportError: cannot import name 'X' from partially initialized module or, more subtly, an AttributeError later when code touches a name that module had not defined yet. The fix is architectural: identify which module has too many responsibilities. Usually one module is doing both data definition AND business logic. Move shared types to a third module that both can import without cycles. The ideal dependency flow is strictly one-directional: utils and exceptions at the base, then models, then services, then routes. If you find yourself needing a two-way dependency, that's a signal to extract a shared abstraction. Lazy imports inside functions can break the cycle mechanically but should be treated as a temporary fix while you redesign the modules.

Q4: When should you split a module into a sub-package?

Answer: The practical triggers are: the module exceeds roughly 500 lines, it has three or more distinct responsibilities, or you find yourself adding section-separator comments inside a single file. When these conditions appear, the module is doing too much. To split, create a directory with the same name as the module, move the logical sections into separate files within it, and add an __init__.py that re-exports the original public interface. This last step is critical: callers should not need to change their imports. The public API remains stable while the internal structure improves. This is an example of the open/closed principle applied at the module level - open for extension (add new files), closed for modification (callers see no change).

Q5: What is the difference between `dependencies` and `optional-dependencies` in `pyproject.toml`?

Answer: dependencies lists packages that every user of your package needs - they are installed automatically when someone runs pip install mypackage. optional-dependencies groups packages that only specific users need - typically dev (testing and linting tools), docs (documentation generators), or feature-specific extras like [postgres]. Users install optional groups with pip install "mypackage[dev]" (the quotes stop some shells from interpreting the brackets). The key engineering principle is to never include development tools in core dependencies. If a user installs your library and gets pytest, mypy, and black pulled in as transitive dependencies, that's a significant annoyance and potential source of version conflicts. Optional dependencies let developers get full tooling while production deployments stay lean.

Q6: What is `__main__.py` and why does it matter for packaging?

Answer: __main__.py is executed when you run a package with python -m mypackage. It provides an entry point that works before an installed script entry point exists - useful during development, in Docker containers, and in environments where you cannot install the package system-wide. It also serves as documentation: a new developer reading the project immediately knows they can run python -m mypackage to try the tool. The common pattern is to keep __main__.py minimal - one or two lines that import and call the main CLI function. This keeps the real logic testable and the entry point trivially simple. The installed script entry point (defined in pyproject.toml under [project.scripts]) and __main__.py should call the same underlying function.

Practice Challenges

Beginner - Create a Well-Structured Package

Take a single-file script that reads a CSV, computes statistics, and prints a report. Restructure it as a proper src-layout package with src/csvstats/__init__.py, models.py, services.py, utils.py, a pyproject.toml, and a basic test file.

Solution

csvstats/
├── src/
│   └── csvstats/
│       ├── __init__.py
│       ├── models.py
│       ├── services.py
│       └── utils.py
├── tests/
│   └── test_services.py
└── pyproject.toml

# src/csvstats/models.py
from dataclasses import dataclass

@dataclass
class ColumnStats:
    name: str
    count: int
    mean: float
    min: float
    max: float
    std: float

# src/csvstats/services.py
import csv
import statistics
from pathlib import Path
from csvstats.models import ColumnStats

def compute_stats(path: Path) -> list[ColumnStats]:
    with path.open(newline="") as f:  # newline="" is what the csv module expects
        rows = list(csv.DictReader(f))

    if not rows:
        return []

    results = []
    for col in rows[0].keys():
        values = [float(r[col]) for r in rows if r[col].strip()]
        results.append(ColumnStats(
            name=col,
            count=len(values),
            mean=statistics.mean(values),
            min=min(values),
            max=max(values),
            std=statistics.stdev(values) if len(values) > 1 else 0.0,
        ))
    return results

# src/csvstats/__init__.py
__version__ = "0.1.0"
from csvstats.models import ColumnStats
from csvstats.services import compute_stats
__all__ = ["ColumnStats", "compute_stats", "__version__"]

# pyproject.toml
[build-system]
requires = ["setuptools>=68"]
build-backend = "setuptools.build_meta"

[project]
name = "csvstats"
version = "0.1.0"
requires-python = ">=3.11"
dependencies = []

[project.optional-dependencies]
dev = ["pytest>=8.0"]

[tool.setuptools.packages.find]
where = ["src"]

# tests/test_services.py
from pathlib import Path
from csvstats.services import compute_stats

def test_compute_stats_basic(tmp_path):
    f = tmp_path / "data.csv"
    f.write_text("a,b\n1,10\n2,20\n3,30\n")
    stats = compute_stats(f)
    assert len(stats) == 2
    assert stats[0].name == "a"
    assert stats[0].mean == 2.0
    assert stats[0].min == 1.0
    assert stats[0].max == 3.0

Intermediate - Fix Circular Imports

You have three modules with a circular dependency. Refactor them so the dependency graph is acyclic:

# user.py - imports from orders (creates a cycle)
from orders import get_order_count

# orders.py - imports from user (creates a cycle)
from user import User

# services.py
from user import User
from orders import Order

Solution

Extract the data-only types into a models.py so that user.py and orders.py no longer need to reference each other.

# models.py - no dependency on user.py or orders.py
from dataclasses import dataclass

@dataclass
class User:
    id: int
    email: str

@dataclass
class Order:
    id: int
    user_id: int
    amount: float

# user.py - imports only from models
from myapp.models import User

def get_user(user_id: int) -> User:
    ...  # fetch from DB

# orders.py - imports only from models
from myapp.models import Order

def get_orders_for_user(user_id: int) -> list[Order]:
    ...  # fetch from DB

def get_order_count(user_id: int) -> int:
    return len(get_orders_for_user(user_id))

# services.py - the only module that imports both
from myapp.models import User, Order
from myapp.user import get_user
from myapp.orders import get_orders_for_user, get_order_count

def user_summary(user_id: int) -> dict:
    user = get_user(user_id)
    count = get_order_count(user_id)
    return {"user": user, "order_count": count}

Import direction: services imports user and orders, and both of those import only models. No cycles.

Advanced - Full Application Scaffold Generator

Write a Python script that generates a complete src-layout project scaffold when given a project name. It should create the full directory structure, generate a valid pyproject.toml, a .gitignore, __init__.py, __main__.py, and stub modules, then print a summary of what was created.

Solution

#!/usr/bin/env python3
"""
scaffold.py - Generate a complete src-layout Python project scaffold.
Usage: python scaffold.py myprojectname
"""
import sys
from pathlib import Path

PYPROJECT_TEMPLATE = """\
[build-system]
requires = ["setuptools>=68"]
build-backend = "setuptools.build_meta"

[project]
name = "{name}"
version = "0.1.0"
description = "A short description of {name}"
requires-python = ">=3.11"
dependencies = []

[project.optional-dependencies]
dev = [
    "pytest>=8.0",
    "pytest-cov",
    "mypy>=1.8",
    "ruff>=0.3",
]

[project.scripts]
{name} = "{name}.routes:cli"

[tool.setuptools.packages.find]
where = ["src"]

[tool.pytest.ini_options]
testpaths = ["tests"]
addopts = "--tb=short"

[tool.mypy]
strict = true
"""

GITIGNORE = """\
__pycache__/
*.py[cod]
dist/
build/
*.egg-info/
.venv/
.env
.pytest_cache/
.mypy_cache/
.coverage
htmlcov/
.DS_Store
"""

INIT_TEMPLATE = """\
\"\"\"{name} - short description.\"\"\"

__version__ = "0.1.0"
"""

MAIN_TEMPLATE = """\
\"\"\"Entry point for `python -m {name}`.\"\"\"
from {name}.routes import cli

if __name__ == "__main__":
    cli()
"""

ROUTES_TEMPLATE = """\
\"\"\"CLI interface for {name}.\"\"\"
import sys

def cli() -> None:
    print("{name} is running")
    sys.exit(0)
"""


def scaffold(name: str) -> None:
    root = Path(name)

    dirs = [root / "src" / name, root / "tests", root / "docs"]
    for d in dirs:
        d.mkdir(parents=True, exist_ok=True)
        print(f"  created  {d}/")

    files = {
        root / "pyproject.toml": PYPROJECT_TEMPLATE.format(name=name),
        root / ".gitignore": GITIGNORE,
        root / "README.md": f"# {name}\n\nA short description.\n",
        root / "src" / name / "__init__.py": INIT_TEMPLATE.format(name=name),
        root / "src" / name / "__main__.py": MAIN_TEMPLATE.format(name=name),
        root / "src" / name / "models.py": "# Data structures\n",
        root / "src" / name / "services.py": "# Business logic\n",
        root / "src" / name / "routes.py": ROUTES_TEMPLATE.format(name=name),
        root / "src" / name / "utils.py": "# Generic helpers\n",
        root / "src" / name / "exceptions.py": (
            f"class {name.title()}Error(Exception):\n    pass\n"
        ),
        root / "tests" / "__init__.py": "",
        root / "tests" / "conftest.py": "# pytest fixtures\nimport pytest\n",
        root / "tests" / f"test_{name}.py": "def test_placeholder():\n    pass\n",
    }

    for path, content in files.items():
        path.write_text(content, encoding="utf-8")
        print(f"  created  {path}")

    print(f"\nProject '{name}' scaffolded successfully.")
    print("\nNext steps:")
    print(f"  cd {name}")
    print("  python -m venv .venv")
    print("  source .venv/bin/activate  # Windows: .venv\\Scripts\\activate")
    print("  pip install -e '.[dev]'")
    print("  pytest")


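if __name__ == "__main__":
    if len(sys.argv) != 2:
        print(f"Usage: python {sys.argv[0]} <project-name>")
        sys.exit(2)
    # The name is written into import statements, so it must be a valid identifier.
    if not sys.argv[1].isidentifier():
        print(f"Error: {sys.argv[1]!r} is not a valid Python package name")
        sys.exit(2)
    scaffold(sys.argv[1])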
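
The scaffold writes the project name directly into import statements and the [project.scripts] entry, so a name such as my-project would produce files that cannot be imported. One possible way to handle this is to normalize the name first. The helper below is a minimal sketch; to_package_name is a hypothetical function that is not part of the solution above, and its normalization rules (lowercase, with hyphens, dots, and whitespace mapped to underscores) are an assumption rather than a packaging requirement.

```python
import keyword
import re


def to_package_name(project_name: str) -> str:
    """Turn a project name like 'My-Project' into an importable package name."""
    # Collapse runs of hyphens, dots, and whitespace into a single underscore.
    candidate = re.sub(r"[-.\s]+", "_", project_name.strip()).lower()
    # Reject anything Python could not import, including reserved words.
    if not candidate.isidentifier() or keyword.iskeyword(candidate):
        raise ValueError(f"{project_name!r} cannot be used as a package name")
    return candidate


if __name__ == "__main__":
    for raw in ("My-Project", "data.tools", "2fast"):
        try:
            print(f"{raw!r} -> {to_package_name(raw)!r}")
        except ValueError as exc:
            print(f"rejected: {exc}")
```

Calling scaffold(to_package_name(sys.argv[1])) would then be one way to accept friendlier names on the command line while keeping the generated package importable.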

Quick Reference

Concept	Recommendation
Package layout	Use src layout for packages and applications
Configuration	Single `pyproject.toml` for all tool config
`__init__.py`	Version, `__all__`, re-exports only - no logic
Module size limit	~500 lines before splitting into sub-package
Virtual environment	`python -m venv .venv` in project root
Install for dev	`pip install -e ".[dev]"`
Dependency pinning	`pyproject.toml` ranges + `pip-compile` lockfile
Circular imports	Always a design problem - restructure, don't work around
Entry points	Define in `[project.scripts]` + add `__main__.py`
Secrets	Never commit `.env` - commit `.env.example` instead

Key Takeaways

The src layout is the modern standard because it forces tests to run against the installed package, not the raw source tree - catching a whole class of packaging bugs before users do.
pyproject.toml is the single source of truth for your project: metadata, dependencies, and all tool configuration live there, version-controlled alongside the code.
__init__.py defines your public API through explicit re-exports. It should contain declarations only - no logic, no expensive imports, no side effects.
Module boundaries should follow the data flow: models defines shapes, services operates on them, routes exposes them, utils helps everyone. Dependencies flow in one direction.
Circular imports are always a design smell indicating a module has too many responsibilities. The fix is always structural, not syntactic.
Virtual environments belong in .venv/ at the project root, never committed to git. Your .gitignore must exclude .venv/, .env, dist/, and __pycache__/.
Project structure is an early decision with long-lasting consequences. A few minutes of planning at the start prevents weeks of refactoring later.
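
The "declarations only" rule for __init__.py can be approximated with a small check. The sketch below uses the standard-library ast module to flag top-level statements that are not a docstring, an import, or an assignment. The ALLOWED set and the function name init_violations are assumptions made for illustration, not an established convention or tool.

```python
import ast

# Statement types this sketch treats as "declarations" (an assumption).
ALLOWED = (ast.Import, ast.ImportFrom, ast.Assign, ast.AnnAssign)


def init_violations(source: str) -> list[int]:
    """Return line numbers of top-level statements that are not declarations."""
    violations = []
    for node in ast.parse(source).body:
        # A bare string expression at module level is a docstring; allow it.
        is_docstring = (
            isinstance(node, ast.Expr)
            and isinstance(node.value, ast.Constant)
            and isinstance(node.value.value, str)
        )
        if not (is_docstring or isinstance(node, ALLOWED)):
            violations.append(node.lineno)
    return violations


if __name__ == "__main__":
    clean = '"""mypkg."""\n__version__ = "0.1.0"\nfrom .core import run\n'
    messy = 'import os\nprint("loading")\nfor i in range(3):\n    pass\n'
    print(init_violations(clean))
    print(init_violations(messy))
```

This check is deliberately shallow: an assignment whose right-hand side calls a function (for example, opening a connection) still passes, so it complements code review rather than replacing it.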
