Module 03 - Python Internals

Reading time: ~10 minutes | Level: Intermediate → Engineering

Before reading further, predict the exact output of these statements, typed one at a time into a CPython REPL:

a = 256
b = 256
print(a is b) # ?

a = 257
b = 257
print(a is b) # ?

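Even developers who know Python well tend to hesitate on the second one. Entered line by line at a CPython REPL, the output is:

True
False

Both names bound to 256 refer to the same object. The two names bound to 257 refer to two different objects with the same value. (Run the same lines as a single script and CPython typically prints True twice, because equal constants compiled together into one code object are stored only once.)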

This is not a bug. It is a deliberate implementation decision in CPython: integers from -5 to 256 are pre-allocated at interpreter startup and reused for every reference to those values. This is the integer cache. Every time you write a = 256, CPython points a at the same pre-allocated PyLongObject in memory.
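A minimal sketch for probing the cache, assuming a standard CPython build. The helper name fresh is hypothetical; it computes the integer at run time so that the compiler cannot hand back a shared constant. The second half shows why the same lines can behave differently when compiled together as one unit.

```python
def fresh(n):
    """Build the int n at run time so no compile-time constant is reused."""
    return (n - 1) + 1


# Compare values on both sides of the cache boundaries.
for n in (-6, -5, 256, 257):
    a, b = fresh(n), fresh(n)
    print(n, "equal:", a == b, "same object:", a is b)

# Equal literals compiled together are stored once in the code object,
# so both names can end up bound to one shared 257.
namespace = {}
exec(compile("a = 257\nb = 257\n", "<one-unit>", "exec"), namespace)
same_unit = namespace["a"] is namespace["b"]
print("compiled together:", same_unit)
```

On CPython, the identity check is expected to succeed only for -5 and 256 in the loop, while the values compiled together share one object. All of this is implementation behaviour, which is exactly why production code should compare integers with == and never with is.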

Most Python developers never know this exists. Engineers who understand CPython know exactly why it exists, when it matters, and - critically - why you should never write code that depends on it.

That is the difference this module builds.

What This Module Covers

Every Python program you write runs inside CPython - a C program that tokenises your source, compiles it to bytecode, and executes that bytecode instruction by instruction in a tight eval loop. Every list you create, every function you call, every with statement you write - all of it maps to specific operations in CPython's memory model and execution engine.

This module opens the interpreter. You will see what a Python object actually looks like in memory, how the bytecode compiler turns your source into instructions, how the eval loop executes them, why the GIL exists and what it actually prevents, how CPython tracks and reclaims memory, and how the import system finds and loads modules.
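The pipeline described above can be walked by hand with built-ins that expose each stage. This is only a sketch: the source string and the file name "<pipeline-demo>" are made up for illustration, and the exact opcodes printed vary between CPython versions.

```python
import dis

# Stage 1: source text (a made-up one-line program).
source = "total = 2 + 3 * x\n"

# Stage 2: compile the source to a code object holding bytecode.
code = compile(source, "<pipeline-demo>", "exec")
print(type(code).__name__, "-", len(code.co_code), "bytes of bytecode")

# Inspect the instructions the eval loop will execute.
dis.dis(code)

# Stage 3: execute the bytecode in a namespace that supplies x.
namespace = {"x": 4}
exec(code, namespace)
print(namespace["total"])
```

The same three stages run for every .py file; compile() and exec() simply make them visible one at a time.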

This is not academic. It is directly useful for:

  • Debugging memory leaks - understanding reference counting lets you find the object keeping a reference alive
  • Performance profiling - dis, sys, and memory profilers reveal exactly where time and memory are going
  • Writing extensions - C extensions, Cython, and cffi all require understanding PyObject layout
  • Understanding async - the GIL and the event loop interact in specific, documentable ways
  • Reading error messages - CPython's tracebacks reference frame objects, code objects, and line tables you will now recognise

By the end of this module, you will look at a Python traceback and see the frame stack. You will read a dis.dis() output and understand every instruction. You will know why import is expensive the first time and nearly free the second time (a lookup in sys.modules). You will be able to explain the GIL to a colleague in two minutes and be right.
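The import-cost claim is easy to check. The sketch below uses colorsys purely as an arbitrary small standard-library module, and timed_import is a hypothetical helper; the timings depend on the machine and are printed only for comparison.

```python
import importlib
import sys
import time


def timed_import(name):
    """Import a module by name and return it with the elapsed seconds."""
    start = time.perf_counter()
    module = importlib.import_module(name)
    return module, time.perf_counter() - start


# Drop any cached copy so the first call performs a real import.
sys.modules.pop("colorsys", None)

first, cold = timed_import("colorsys")   # finder + loader + module execution
second, warm = timed_import("colorsys")  # found in sys.modules

print("same module object:", first is second)
print(f"cold: {cold * 1e6:.0f} us, warm: {warm * 1e6:.0f} us")
```

The second call returns the object already stored in sys.modules, so no file is located, read, or executed again.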

Module Topics

# | Lesson | What It Covers
01 | CPython Architecture | The reference implementation, the execution pipeline from .py to result, the eval loop, PyObject layout in memory, the integer cache, string interning, and the small object allocator
02 | Bytecode Inspection | The code object and all its attributes, how .pyc files work, reading bytecode with marshal, the line number table, and how closures and decorators look in bytecode
03 | Disassembly with dis | The dis module API, reading disassembly output, key opcodes explained with examples, value stack evolution, comparing equivalent Python patterns at the bytecode level
04 | GIL Explained | What the GIL is, why CPython has it, what it actually protects, how it affects threads vs processes, the GIL release in I/O and C extensions, and when it matters in practice
05 | Reference Counting | ob_refcnt, sys.getrefcount(), how CPython tracks object lifetimes, Py_INCREF/Py_DECREF, cycles and why refcounting alone is insufficient
06 | Garbage Collection | The cyclic garbage collector, generational collection, gc module, gc.collect(), __del__ and finalizers, weak references, and how to find and fix reference cycles
07 | Memory Profiling | tracemalloc, memory_profiler, objgraph, sys.getsizeof(), reading memory profiles, finding leaks in long-running servers, and memory-efficient patterns
08 | sys and inspect | sys module for interpreter state, inspect for live introspection - functions, frames, source, signatures - and how debuggers and test frameworks use these
09 | Import System | The import machinery, sys.modules cache, finders and loaders, importlib, __init__.py, namespace packages, and writing custom import hooks
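As a preview of how lessons 05 and 06 connect, here is a small sketch of why reference counting alone is insufficient. The Node class is hypothetical, and the automatic collector is switched off temporarily only so the demonstration is deterministic.

```python
import gc
import weakref


class Node:
    """Hypothetical object that can point at a partner."""

    def __init__(self):
        self.partner = None


gc_was_enabled = gc.isenabled()
gc.disable()  # keep the automatic collector out of the demo

left, right = Node(), Node()
left.partner, right.partner = right, left  # build a reference cycle
probe = weakref.ref(left)                  # observe left without owning it

del left, right
# Each node still holds a reference to the other, so neither count hits zero.
alive_after_del = probe() is not None

unreachable = gc.collect()                 # the cyclic collector finds the pair
alive_after_collect = probe() is not None

if gc_was_enabled:
    gc.enable()

print(alive_after_del, alive_after_collect, unreachable)
```

Deleting the names leaves the pair alive; only the cyclic garbage collector reclaims them.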

Module Projects

Project | Core Skills
Bytecode Visualizer | dis, code object inspection, formatting bytecode output, comparing Python versions, building a readable disassembly report
Mini Profiler Tool | tracemalloc, sys, inspect, frame walking, measuring memory and call counts, producing a flame-graph-style text report
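To give a feel for the raw material the Mini Profiler Tool works with, the following is a rough sketch of a tracemalloc snapshot comparison. It is not the project solution; the payload list is an invented stand-in for whatever a real program allocates.

```python
import tracemalloc

tracemalloc.start()
baseline = tracemalloc.take_snapshot()

# Invented workload: roughly 1 MB of bytes objects kept alive.
payload = [bytes(1000) for _ in range(1000)]

current = tracemalloc.take_snapshot()
tracemalloc.stop()

# Group allocations by source line and rank them by growth since the baseline.
growth = current.compare_to(baseline, "lineno")
for stat in growth[:3]:
    print(stat)

biggest = growth[0]
print("largest growth in bytes:", biggest.size_diff)
```

The top entry should point at the line that built payload, which is the same signal you would look for when hunting a leak between two points in a long-running process.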

Prerequisites

  • Python Foundation course complete (or equivalent depth)
  • Module 01 - Object-Oriented Programming (particularly __dunder__ methods and object lifecycle)
  • Module 02 - Functional Programming (generators and closures make more sense once you understand frame objects)
  • Comfortable reading Python tracebacks

How to Use This Module

Lessons build on each other but are also independently useful as reference material.

CPython Architecture (01) establishes the mental model everything else builds on - read it first. Bytecode Inspection (02) and Disassembly (03) go together; read them back to back. The GIL (04), Reference Counting (05), and Garbage Collection (06) form a connected trio on object lifetimes and memory management. Memory Profiling (07) applies those concepts to real debugging. sys and inspect (08) give you the runtime introspection tools. The Import System (09) closes the module by showing how Python finds and loads code.

The opening puzzle of every lesson is not decorative. Attempt a genuine prediction before reading. If you are wrong - and you will be wrong on several of them - that is the signal. You have found the precise gap the lesson closes.

After every lesson: open a Python REPL and probe the concepts. Use dis.dis() on your own functions. Check sys.getrefcount() on objects you create. Read the .pyc files in your project's __pycache__. Reading examples builds familiarity. Running them yourself builds understanding.
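Two of these probes can be sketched in a few lines. The sample source and file names below are invented, and the 16-byte header size is an assumption that holds for CPython 3.7 and later; absolute sys.getrefcount() values differ between versions, so only the change is printed.

```python
import importlib.util
import marshal
import py_compile
import sys
import tempfile
from pathlib import Path

# Probe 1: watch a reference count move as new references are created.
obj = object()
count_before = sys.getrefcount(obj)
holders = [obj, obj]                 # the list stores two more references
count_after = sys.getrefcount(obj)
print("refcount change:", count_after - count_before)

# Probe 2: compile a tiny (invented) module and read its .pyc by hand.
with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "sample.py"
    src.write_text("def double(x):\n    return x * 2\n")
    pyc_path = py_compile.compile(str(src), cfile=str(Path(tmp) / "sample.pyc"))
    raw = Path(pyc_path).read_bytes()

header, body = raw[:16], raw[16:]    # assumed header layout: CPython 3.7+
magic_matches = header[:4] == importlib.util.MAGIC_NUMBER
module_code = marshal.loads(body)    # the module's code object
print("magic matches:", magic_matches)
print("names defined:", module_code.co_names)
```

The object recovered by marshal.loads() is the same kind of code object that compile() returns, so it can be passed straight to dis.dis().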

The Engineering Standard

This module is written for engineers who want to understand what Python is actually doing when their code runs.

The dis module is how you verify that a refactoring genuinely eliminated unnecessary attribute lookups. tracemalloc is how you find the leak that only manifests after 48 hours of production traffic. The GIL is why adding threads to a CPU-bound task made it slower, not faster. Reference cycles are why that background worker process kept growing in memory until it was killed.
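As one concrete illustration of the first point, the sketch below counts LOAD_ATTR instructions before and after a refactoring. The two functions and the helper attribute_loads are hypothetical, and the opcode name is the one current CPython releases use for plain attribute access.

```python
import dis


def repeated_lookup(point):
    # Looks up point.x three separate times.
    return point.x * point.x + point.x


def hoisted_lookup(point):
    # Looks up point.x once and reuses the local variable.
    x = point.x
    return x * x + x


def attribute_loads(func):
    """Count the LOAD_ATTR instructions in a function's bytecode."""
    return sum(
        1 for instruction in dis.get_instructions(func)
        if instruction.opname == "LOAD_ATTR"
    )


print("before:", attribute_loads(repeated_lookup))
print("after: ", attribute_loads(hoisted_lookup))
```

Counting instructions confirms what changed in the bytecode; whether the change matters for run time still has to be measured with a profiler or timeit.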

These are not edge cases. They are the class of problems that appear in every sufficiently large Python codebase and that take hours to debug without this knowledge and minutes with it.

Understanding Python at this depth also makes you a better reader of Python source. When you open Django's ORM, SQLAlchemy's session, or asyncio's event loop, you will recognise the patterns: the frame walking, the code object inspection, the import hooks, the reference cycle management. You will read them as mechanisms, not magic.

That is the standard this module is written to.

© 2026 EngineersOfAI. All rights reserved.