for Loops Internals - The Iterator Protocol and How Python Iteration Really Works

Reading time: ~24 minutes | Level: Foundation → Engineering

Before reading further, predict the output of this code:

data = [10, 20, 30]
it = iter(data)

print(next(it))
print(next(it))

for x in it:
    print(x)

for x in it:
    print(x, "second loop")

Most developers expect the second for loop to print 10, 20, 30 again. The actual output is:

10
20
30

The second loop prints nothing. There is no "second loop" output. Why? Because iter(data) returns an iterator object that maintains its own position. After the first next(it) and next(it) calls consume 10 and 20, the first for loop exhausts the remaining 30. When the second for loop calls iter(it) - which is what every for loop does internally - it gets back the same exhausted iterator. There are no more items. This reveals something fundamental: a for loop is not a simple repetition construct. It is a consumer of the iterator protocol. Understanding that protocol is what separates surface-level Python users from engineers.

What You Will Learn

  • What for x in iterable: actually executes at the interpreter level
  • The iterator protocol: __iter__ and __next__, what they must return, and why the distinction between an iterable and an iterator matters
  • range() internals: why it is a lazy sequence, not a list, and what that means for memory
  • enumerate(): implementation concept, the start parameter, and why it is always preferred over range(len(...))
  • zip(): lazy evaluation, silent truncation behavior, and zip_longest
  • Iterating over dicts, files, and other built-in iterables
  • reversed() and sorted() in loop context
  • List comprehensions as syntactic sugar: when they improve over explicit loops
  • The else clause on for loops: a preview
  • Implementing a custom iterator class with __iter__ and __next__
  • Generator functions as iterators - the yield mechanism
  • Performance: why for x in list beats for i in range(len(list)): list[i]
  • Pitfalls that cause silent data corruption and hard-to-find bugs

Prerequisites

You should be comfortable with:

  • Python functions, classes, and basic object-oriented concepts
  • if/else branching and basic exceptions
  • Lists, dicts, tuples, and strings as data structures
  • Defining classes with __init__ and instance attributes

What a for Loop Actually Does

When Python executes for x in some_object:, it does not simply access items by index. It performs a two-phase protocol:

Phase 1: Call iter(some_object). This calls some_object.__iter__() and returns an iterator object.

Phase 2: Repeatedly call next(iterator). Each call invokes iterator.__next__(), which returns the next value. When the iterator is exhausted, __next__() raises StopIteration. Python catches this exception and exits the loop body.

This is, in effect, what the Python interpreter does. The for loop is syntactic sugar for this try/except pattern with iter() and next(). You can replicate any for loop manually:

numbers = [1, 2, 3]

# Roughly what Python does internally for: for x in numbers: print(x)
_iter = iter(numbers)
while True:
    try:
        x = next(_iter)
    except StopIteration:
        break  # iterator exhausted: leave the loop
    print(x)  # the loop body runs outside the try, so only next() is guarded

Running both forms produces identical output. The for loop is simply a cleaner syntax for this exact pattern.

The Iterator Protocol: __iter__ and __next__

The iterator protocol has two roles: the iterable and the iterator. Many beginners conflate these.

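An iterable is any object that implements __iter__() (as a legacy fallback, Python also accepts objects that define only __getitem__() with integer indexes starting at 0). Calling iter() on it returns an iterator. Lists, tuples, dicts, sets, strings, and files are all iterables. An iterable that is not itself an iterator, such as a list, can produce fresh iterators any number of times.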

An iterator is an object that implements both __iter__() (returning itself) and __next__() (returning the next value or raising StopIteration). An iterator maintains its own position. Once exhausted, it stays exhausted - you cannot reset it.

# A list is an iterable - iter() gives a fresh iterator each time
numbers = [1, 2, 3]
print(type(numbers)) # <class 'list'>
print(type(iter(numbers))) # <class 'list_iterator'>

# The list itself is NOT an iterator - it has __iter__ but not __next__
try:
    next(numbers)
except TypeError as e:
    print(e) # 'list' object is not an iterator

# But the iterator IS an iterator - and it's also iterable (returns itself)
it = iter(numbers)
print(iter(it) is it) # True - __iter__ on an iterator returns self
print(next(it)) # 1
print(next(it)) # 2
print(next(it)) # 3
try:
    next(it)
except StopIteration:
    print("Exhausted") # Exhausted

The key implication: when you pass an iterator to a for loop, you resume from where it left off. When you pass an iterable (like a list), the loop calls iter() to create a fresh iterator. This is why in the opening example, the second for loop over the exhausted iterator produced no output.

note

Every iterator must be an iterable (it must implement __iter__ returning self). Not every iterable is an iterator. This asymmetry is intentional: it allows iterables to be used in multiple loops while iterators maintain single-use traversal state.
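
One way to check which role an object plays is to test it against the abstract base classes in collections.abc. The sketch below is only illustrative: classify is a hypothetical helper name, and the check does not detect legacy iterables that define only __getitem__.

```python
from collections.abc import Iterable, Iterator


def classify(obj):
    """Return 'iterator', 'iterable', or 'neither' for obj (hypothetical helper)."""
    if isinstance(obj, Iterator):   # defines both __iter__ and __next__
        return "iterator"
    if isinstance(obj, Iterable):   # defines __iter__ only
        return "iterable"
    return "neither"


numbers = [1, 2, 3]
print(classify(numbers))        # iterable
print(classify(iter(numbers)))  # iterator
print(classify(42))             # neither
```

The order of the checks matters: every iterator is also an iterable, so the more specific test has to come first.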

range() Internals: Lazy Sequences

Many beginners assume range(10) creates a list [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]. It does not. range() returns a range object - a lazy sequence that computes values on demand.

r = range(10)
print(type(r)) # <class 'range'>
print(r) # range(0, 10)

# You can index it, slice it, check membership - all without materializing a list
print(r[5]) # 5
print(5 in r) # True
print(len(r)) # 10

# Memory usage: range(10**9) uses the same memory as range(10)
import sys
print(sys.getsizeof(range(10))) # 48 bytes on 64-bit CPython
print(sys.getsizeof(range(10**9))) # 48 bytes - same!
print(sys.getsizeof(list(range(10)))) # larger, and it grows with length (exact figure varies by CPython version)

The range object stores only three integers: start, stop, and step. From those three values it can compute any element by index as start + index * step. There is no list in memory. This is why range(10**9) works fine - it never allocates a billion integers.
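
The arithmetic described above can be sketched in a few lines. LazyRange below is a hypothetical, simplified stand-in written for illustration; it is not how CPython implements range (the real type is written in C and also supports slicing, negative indexes, and more).

```python
class LazyRange:
    """Hypothetical, simplified stand-in for range: stores three integers only."""

    def __init__(self, start, stop, step=1):
        if step == 0:
            raise ValueError("step must not be zero")
        self.start, self.stop, self.step = start, stop, step

    def __len__(self):
        # Distance to cover in the direction of travel
        span = self.stop - self.start if self.step > 0 else self.start - self.stop
        if span <= 0:
            return 0
        return (span + abs(self.step) - 1) // abs(self.step)  # ceiling division

    def __getitem__(self, index):
        if not 0 <= index < len(self):
            raise IndexError("LazyRange index out of range")
        return self.start + index * self.step  # computed on demand

    def __contains__(self, value):
        offset = value - self.start
        return offset % self.step == 0 and 0 <= offset // self.step < len(self)


r = LazyRange(2, 10, 3)
print(list(r))  # [2, 5, 8]
print(len(r))   # 3
print(r[2])     # 8
print(7 in r)   # False
```

Because the class defines __getitem__ and raises IndexError past the end, list(r) and for loops work on it through Python's legacy sequence fallback, even though no __iter__ is defined.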

warning

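list(range(10**6)) allocates roughly 8 MB for the list's pointer array alone, plus about 28 bytes for each integer object it references, so tens of megabytes in total on 64-bit CPython. range(10**6) allocates 48 bytes. In performance-sensitive code, never convert a range to a list unless you actually need list operations on the result.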

range() accepts up to three arguments: range(start, stop, step). All three must be integers. The step may be negative for counting backward, but it must not be zero (range raises ValueError if it is):

for i in range(10, 0, -2):
    print(i) # 10, 8, 6, 4, 2

enumerate(): The Right Way to Track Index and Value

A common pattern in code written by developers coming from other languages:

data = ["alpha", "beta", "gamma"]

# Anti-pattern: range(len(...))
for i in range(len(data)):
    print(f"Index {i}: {data[i]}")

This works but is un-Pythonic and fragile. It requires the container to support both len() and indexing (generators support neither), it performs an extra index lookup on every iteration, and the intent is obscured. The Pythonic alternative is enumerate():

for i, value in enumerate(data):
    print(f"Index {i}: {value}")

enumerate() wraps any iterable and yields (index, value) pairs. It works with any iterable - including generators and file objects - not just sequences. The start parameter sets the starting index:

for i, value in enumerate(data, start=1):
    print(f"Item {i}: {value}")
# Item 1: alpha
# Item 2: beta
# Item 3: gamma

Conceptually, enumerate() is implemented as a generator:

def my_enumerate(iterable, start=0):
    index = start
    for item in iterable:
        yield index, item
        index += 1

tip

Use enumerate() whenever you need the index alongside the value. Use range(len(...)) only when you need the index without the value, or when you need to iterate over multiple sequences by the same index simultaneously (though zip() is usually better for that too).

zip(): Iterating Multiple Sequences in Parallel

zip() takes multiple iterables and yields tuples of corresponding elements:

names = ["alice", "bob", "carol"]
scores = [95, 87, 92]

for name, score in zip(names, scores):
    print(f"{name}: {score}")
# alice: 95
# bob: 87
# carol: 92

zip() is lazy: it creates a zip object that yields one tuple per iteration without creating intermediate lists. Like range(), it is memory-efficient regardless of input size.

Critical behavior: zip() silently truncates at the shortest iterable.

a = [1, 2, 3, 4, 5]
b = ["x", "y"]

for pair in zip(a, b):
    print(pair)
# (1, 'x')
# (2, 'y')
# Items 3, 4, 5 from 'a' are silently ignored

This silent truncation is a common source of bugs when the lengths are expected to match. To get all pairs and fill missing values, use itertools.zip_longest:

from itertools import zip_longest

for pair in zip_longest(a, b, fillvalue=None):
    print(pair)
# (1, 'x')
# (2, 'y')
# (3, None)
# (4, None)
# (5, None)

You can also use zip() to unzip: given a list of pairs, zip(*pairs) transposes it:

pairs = [(1, "a"), (2, "b"), (3, "c")]
numbers, letters = zip(*pairs)
print(numbers) # (1, 2, 3)
print(letters) # ('a', 'b', 'c')

Iterating Over Dicts

Python dicts are iterable. Iterating directly yields keys:

config = {"host": "localhost", "port": 8080, "debug": True}

for key in config:
    print(key) # host, port, debug

For keys and values together, use .items():

for key, value in config.items():
    print(f"{key} = {value}")

For only values, use .values(). For only keys (explicitly), use .keys(), though iterating the dict directly is equivalent.

Dict ordering guarantee (Python 3.7+): Dicts maintain insertion order. Iterating a dict always yields keys in the order they were inserted. This is a language specification guarantee, not an implementation detail - it is safe to rely on in all Python 3.7+ code.

warning

Never add or remove keys in a dict while iterating over it. Doing so typically raises RuntimeError: dictionary changed size during iteration. Assigning a new value to an existing key is safe, because it does not change the dict's size. If you need to add or remove keys during iteration, collect the changes in a separate dict or list and apply them after the loop completes.
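
The collect-then-apply approach can look like the following minimal sketch. The stock dict and its contents are made up for illustration.

```python
stock = {"apples": 3, "pears": 0, "plums": 7, "kiwis": 0}

# Phase 1: iterate without changing the dict's size, recording what to remove
sold_out = [name for name, count in stock.items() if count == 0]

# Phase 2: apply the removals after iteration has finished
for name in sold_out:
    del stock[name]

print(stock)  # {'apples': 3, 'plums': 7}

# Updating the value of an existing key does not resize the dict, so it is allowed
for name in stock:
    stock[name] += 1

print(stock)  # {'apples': 4, 'plums': 8}
```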

Iterating Over Files

A file object in Python is itself an iterator - not just an iterable, but a stateful iterator. Each call to next() on a file returns the next line. This makes iterating over large files extremely memory-efficient:

with open("data.txt") as f:
    for line in f:
        line = line.rstrip("\n")
        process(line)

This reads and processes one line at a time. The entire file is never loaded into memory. For a 10 GB log file, this is the only viable approach. Calling f.readlines() would allocate all 10 GB at once.

Because a file is an iterator (not just an iterable), once you have read to the end, subsequent for loops over the same file object yield nothing. To restart, call f.seek(0) to rewind, or reopen the file.
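
The exhaustion and rewind behavior can be observed with a small throwaway file. This is a sketch: the temporary file and its three lines exist only to keep the example self-contained.

```python
import os
import tempfile

# Create a small temporary file so the example does not depend on existing data
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("first\nsecond\nthird\n")
    path = tmp.name

with open(path) as f:
    first_pass = [line.rstrip("\n") for line in f]
    second_pass = [line.rstrip("\n") for line in f]  # file iterator is exhausted
    f.seek(0)                                         # rewind to the beginning
    third_pass = [line.rstrip("\n") for line in f]

os.remove(path)

print(first_pass)   # ['first', 'second', 'third']
print(second_pass)  # []
print(third_pass)   # ['first', 'second', 'third']
```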

reversed() and sorted() in Loops

reversed() returns a reverse iterator over any sequence that supports __reversed__ or __len__ and __getitem__. For lists, this is efficient: it does not create a reversed copy.

data = [1, 2, 3, 4, 5]

for x in reversed(data):
    print(x) # 5, 4, 3, 2, 1

sorted() always returns a new sorted list. It accepts any iterable, including generators:

data = {3, 1, 4, 1, 5, 9}

for x in sorted(data):
    print(x) # 1, 3, 4, 5, 9

For sorting with a key function:

words = ["banana", "apple", "cherry"]
for word in sorted(words, key=len):
    print(word) # apple, banana, cherry

List Comprehensions: Syntactic Sugar for For Loops

A list comprehension is syntactic sugar for a for loop that builds a list:

# Explicit for loop
squares = []
for x in range(10):
    if x % 2 == 0:
        squares.append(x ** 2)

# Equivalent list comprehension
squares = [x ** 2 for x in range(10) if x % 2 == 0]

Both produce the same result. The comprehension is not just shorter - it is usually also faster. CPython compiles the comprehension's append step to a dedicated LIST_APPEND bytecode instruction, instead of looking up the .append attribute and making a general method call on every iteration. For large lists, comprehensions are measurably faster, though the exact margin depends on the CPython version.

Use a comprehension when: the transformation is a single expression per element, the result is a list that will be used further, and the logic is simple enough to read in one line.

Use an explicit for loop when: the body is complex (multiple statements, early continues, exception handling), you are not building a list, or readability suffers from compression.

Generator expressions (using () instead of []) produce a lazy iterator instead of a list. The generator object itself has a small, fixed size no matter how many values it will eventually produce:

# Generator expression - lazy, no list allocated
total = sum(x ** 2 for x in range(10**6))
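
To make the memory difference concrete, the sketch below compares the container sizes reported by sys.getsizeof. The exact byte counts vary by CPython version, so only the comparison is printed.

```python
import sys

n = 100_000
squares_list = [x ** 2 for x in range(n)]  # every result is stored at once
squares_gen = (x ** 2 for x in range(n))   # nothing has been computed yet

print(sys.getsizeof(squares_list) > sys.getsizeof(squares_gen))  # True
print(sum(squares_list) == sum(squares_gen))                     # True
print(sum(squares_gen))  # 0 - the generator was consumed by the previous sum()
```

The last line is a reminder of the trade-off: the generator expression saves memory, but it can only be traversed once.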

The else Clause on for Loops

Python's for loop has an else clause that runs when the loop completes without hitting a break:

for item in data:
    if condition(item):
        result = item
        break
else:
    result = None # No break occurred - item not found

This is covered in depth in the next lesson. For now: the else runs on natural loop completion. If break fires, else does not run.
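
A runnable version of the same search pattern, as a minimal sketch. The function name first_negative and the sample inputs are chosen only for illustration.

```python
def first_negative(values):
    """Return the first negative number in values, or None if there is none."""
    for value in values:
        if value < 0:
            result = value
            break          # found: the else block is skipped
    else:
        result = None      # the loop ran to completion without break
    return result


print(first_negative([3, 8, -2, 5, -7]))  # -2
print(first_negative([3, 8, 5]))          # None
print(first_negative([]))                 # None - an empty loop also reaches else
```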

Implementing a Custom Iterator

Any class can be made an iterator by implementing __iter__ and __next__. This is the same protocol that list iterators, range iterators, and files implement internally.

class Countdown:
    """An iterator that counts down from n to 1."""

    def __init__(self, start):
        self.current = start

    def __iter__(self):
        # Return self because this object IS the iterator
        return self

    def __next__(self):
        if self.current <= 0:
            raise StopIteration
        value = self.current
        self.current -= 1
        return value

# Usage
for n in Countdown(5):
    print(n)
# 5, 4, 3, 2, 1

# Manual protocol usage
cd = Countdown(3)
print(next(cd)) # 3
print(next(cd)) # 2
print(next(cd)) # 1
try:
    next(cd)
except StopIteration:
    print("Done") # Done

Notice that Countdown implements both __iter__ and __next__, so it is simultaneously an iterable and an iterator. This is the standard pattern for custom iterators. The alternative is a two-class design: an iterable class whose __iter__ creates and returns a separate stateful iterator instance. The two-class design allows multiple independent iterators over the same data (like a list), while the single-class design produces an iterator that can only be traversed once.
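
The two-class design mentioned above might look like the following sketch. ReusableCountdown and CountdownIterator are hypothetical names introduced here for illustration.

```python
class CountdownIterator:
    """Iterator: holds the traversal state for a single pass."""

    def __init__(self, start):
        self._current = start

    def __iter__(self):
        return self

    def __next__(self):
        if self._current <= 0:
            raise StopIteration
        value = self._current
        self._current -= 1
        return value


class ReusableCountdown:
    """Iterable only: every __iter__ call hands out a fresh iterator."""

    def __init__(self, start):
        self._start = start

    def __iter__(self):
        return CountdownIterator(self._start)


cd = ReusableCountdown(3)
print(list(cd))  # [3, 2, 1]
print(list(cd))  # [3, 2, 1] - a second pass works
print([(a, b) for a in cd for b in cd][:4])  # [(3, 3), (3, 2), (3, 1), (2, 3)]
```

The nested comprehension on the last line only works because the inner and outer loops each receive their own iterator. With the single-class Countdown, the inner loop would consume the shared state.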

Generator Functions as Iterators

Writing __iter__ and __next__ manually is verbose. Python's yield keyword provides a concise alternative. A function containing yield is a generator function; calling it returns a generator object that implements the iterator protocol automatically:

def countdown(start):
    while start > 0:
        yield start
        start -= 1

for n in countdown(5):
    print(n)
# 5, 4, 3, 2, 1

When countdown(5) is called, no code executes yet. The generator is created in a suspended state. Each next() call resumes execution until the next yield, returns the yielded value, and suspends again. When the function reaches its end (or return), StopIteration is raised automatically.
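
The suspend-and-resume behavior can be made visible by recording events from inside the generator body. This is a sketch: countdown_traced and the events list are illustrative names, not part of the lesson's countdown function.

```python
events = []


def countdown_traced(start):
    events.append("body started")
    while start > 0:
        events.append(f"about to yield {start}")
        yield start
        start -= 1
    events.append("body finished")


gen = countdown_traced(2)
print(events)     # [] - calling the function ran none of its body
print(next(gen))  # 2
print(events)     # ['body started', 'about to yield 2']
print(next(gen))  # 1

try:
    next(gen)     # resumes after the last yield and runs off the end
except StopIteration:
    events.append("StopIteration raised")

print(events[-2:])  # ['body finished', 'StopIteration raised']
```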

Generators are lazy: they produce values on demand without storing them all in memory. A generator that produces 10 million values uses the same memory as one that produces 10.

def integers_from(n):
    """Infinite generator - produces integers starting from n."""
    while True:
        yield n
        n += 1

gen = integers_from(1)
for _ in range(5):
    print(next(gen)) # 1, 2, 3, 4, 5

# The generator is still alive and can produce more values
print(next(gen)) # 6

Performance: Iteration vs Indexing

Direct iteration (for x in lst) is faster than index-based access (for i in range(len(lst)): lst[i]). Here is why.

Index-based access requires: loading range, calling len(), creating a range object, calling __next__ on the range iterator to get i, then calling __getitem__ on the list with i. That is two levels of iterator protocol plus an index lookup per iteration.

Direct iteration requires: calling __next__ on the list iterator directly. The list iterator maintains an internal C-level index and advances it itself, so no integer has to be produced by a range iterator and then passed back into a separate subscript operation on the list.

For most code, the difference is negligible. In tight inner loops processing millions of items, it can matter. The idiomatic pattern is always direct iteration unless you have a specific reason to need the index.

import timeit

data = list(range(10000))

# Direct iteration
t1 = timeit.timeit(
    "for x in data: pass",
    globals={"data": data},
    number=10000
)

# Index-based
t2 = timeit.timeit(
    "for i in range(len(data)): data[i]",
    globals={"data": data},
    number=10000
)

print(f"Direct: {t1:.3f}s")
print(f"Indexed: {t2:.3f}s")
# Direct iteration is usually noticeably faster; the exact margin depends on the CPython version and hardware

Pitfalls

Pitfall 1: Mutating a list while iterating over it.

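This does not raise an error for lists, but the results are surprising - elements can be silently skipped when you remove items, or visited again when you insert items before the current position: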

items = [1, 2, 3, 4, 5]

# WRONG: removing elements while iterating
for item in items:
    if item % 2 == 0:
        items.remove(item) # Modifies the list mid-iteration

print(items) # [1, 3, 5] - looks correct, but only by luck: 3 and 5 were never examined

The list iterator uses an internal index. When you remove 2 (index 1), the element at index 2 (3) slides to index 1. The iterator then moves to index 2, which is now 4 - 3 was never seen. Fix: build a new list or iterate over a copy:

# CORRECT: filter to a new list
items = [x for x in items if x % 2 != 0]

# OR: iterate over a copy
for item in items.copy():
    if item % 2 == 0:
        items.remove(item)
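
To see the skipping concretely, the sketch below records which elements the buggy loop actually visits, then repeats it on an input where the skipped elements change the result. The visited list is added purely for illustration.

```python
items = [1, 2, 3, 4, 5]
visited = []

for item in items:
    visited.append(item)      # record what the loop actually sees
    if item % 2 == 0:
        items.remove(item)

print(visited)  # [1, 2, 4] - 3 and 5 were never examined
print(items)    # [1, 3, 5]

# Consecutive even numbers expose the bug in the final result
evens = [2, 4, 6, 8]
for item in evens:
    if item % 2 == 0:
        evens.remove(item)

print(evens)    # [4, 8] - half of the even numbers survive
```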

Pitfall 2: range(len(x)) anti-pattern.

# Anti-pattern
for i in range(len(names)):
    print(names[i])

# Pythonic
for name in names:
    print(name)

# Need index too? Use enumerate, not range(len)
for i, name in enumerate(names):
    print(i, name)

Pitfall 3: zip() silently truncating.

expected_pairs = 5
a = list(range(5))
b = list(range(3)) # Shorter by mistake

# zip silently produces only 3 pairs - no warning, no error
for x, y in zip(a, b):
    print(x, y)

# If lengths must match, assert before zipping:
assert len(a) == len(b), f"Length mismatch: {len(a)} vs {len(b)}"
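
An assert on len() only works for inputs that have a length. For generators and other iterators, one option is a small wrapper built on zip_longest with a private sentinel, sketched below. zip_strict is a hypothetical helper name. On Python 3.10 and later, the built-in zip(a, b, strict=True) raises ValueError on a length mismatch and makes a wrapper like this unnecessary.

```python
from itertools import zip_longest

_MISSING = object()  # unique sentinel that cannot appear in caller data


def zip_strict(*iterables):
    """Like zip(), but raise ValueError if the inputs have different lengths."""
    for combo in zip_longest(*iterables, fillvalue=_MISSING):
        if any(item is _MISSING for item in combo):
            raise ValueError("zip_strict() arguments have different lengths")
        yield combo


print(list(zip_strict([1, 2], "ab")))  # [(1, 'a'), (2, 'b')]

try:
    list(zip_strict(range(5), range(3)))
except ValueError as exc:
    print(exc)  # zip_strict() arguments have different lengths
```

Because zip_strict is itself a generator, the error surfaces only when iteration reaches the point of mismatch, not when the function is called.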

Pitfall 4: Assuming a generator can be replayed.

Generators and other iterators are single-use. If you need to iterate multiple times, either store results in a list or recreate the generator.
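
A short sketch of the failure mode and both remedies. The squares generator is an illustrative example, not taken from the lesson's earlier code.

```python
def squares(n):
    for i in range(n):
        yield i * i


gen = squares(4)
print(sum(gen))  # 14 (0 + 1 + 4 + 9)
print(sum(gen))  # 0 - already exhausted, and no error is raised

# Option 1: materialize once, if the data fits in memory
values = list(squares(4))
print(sum(values), max(values))  # 14 9

# Option 2: call the generator function again for each pass
print(sum(squares(4)), max(squares(4)))  # 14 9
```

The second sum() returning 0 instead of failing is what makes this pitfall hard to spot: the code runs, but on empty data.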

Interview Questions and Detailed Answers

Q1: Explain the Python iterator protocol. What methods must an iterator implement?

The iterator protocol consists of two methods. __iter__() must return the iterator object itself. __next__() must return the next value in the sequence or raise StopIteration when the sequence is exhausted. An iterable (as opposed to an iterator) only needs __iter__() - it can return a fresh iterator object each time, allowing the iterable to be traversed multiple times. The for loop calls iter() on whatever it is given (triggering __iter__), then calls next() on the resulting iterator (triggering __next__) in a loop, catching StopIteration to terminate.

Q2: What does StopIteration signal and how does the for loop handle it?

StopIteration is the sentinel exception that signals an iterator has no more values to produce. The for loop internally wraps its next() calls in a try/except StopIteration block. When StopIteration is raised, the loop exits cleanly - it is not an error condition, it is the normal termination signal. In generator functions, a return statement (with or without a value) causes the generator to raise StopIteration automatically. From Python 3.7 onward (PEP 479), if a StopIteration propagates out of a generator body, it is converted to a RuntimeError - this prevents subtle bugs where an inner iterator's exhaustion accidentally terminated an outer generator.
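
The PEP 479 conversion can be reproduced with a generator that calls next() on an inner iterator without guarding it. This is a minimal sketch and first_two is a hypothetical function written for the demonstration.

```python
def first_two(iterable):
    it = iter(iterable)
    yield next(it)  # raises StopIteration inside the generator if 'it' is empty
    yield next(it)


print(list(first_two("ab")))  # ['a', 'b']

try:
    list(first_two("a"))      # the second next(it) has nothing left to return
except RuntimeError as exc:
    print(type(exc).__name__, "-", exc)  # RuntimeError - generator raised StopIteration
```

Before PEP 479 took effect, the second call would have quietly returned ['a'], hiding the fact that the input was shorter than expected.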

Q3: Why is range(10**9) memory-efficient while list(range(10**9)) is not?

range() is a lazy sequence: it stores only three integers (start, stop, step) and computes any element on demand using integer arithmetic: element = start + index * step. No matter how large the range, it occupies a fixed 48 bytes on 64-bit CPython. list(range(10**9)) materializes all one billion integers: the list needs about 8 GB for its pointers alone, and each integer object it points to takes another 28 bytes, for roughly 36 GB in total. The range object supports the read-only sequence operations - indexing, slicing, len(), in membership testing - through direct computation, making it fully functional without ever allocating the integers.

Q4: Why should you prefer enumerate() over range(len(...))?

enumerate() is preferred for three reasons. First, it works with any iterable, not just sequences - you can enumerate() a generator or file object, neither of which supports len() or indexing. Second, it is more readable: for i, value in enumerate(items) directly expresses the intent (iterate with index), while for i in range(len(items)): items[i] requires a reader to parse two operations. Third, it is slightly faster in CPython because it avoids the overhead of computing the range length, creating a range object, and doing list indexing on each iteration. The start parameter also makes 1-based indexing trivial: enumerate(items, start=1).

Q5: What is zip()'s behavior when iterables have different lengths?

zip() terminates when the shortest iterable is exhausted. Extra elements from longer iterables are silently discarded with no warning or error. This is intentional for cases like pairing keys and values that are known to match in length. When lengths may differ and you need all elements, itertools.zip_longest fills missing values with a fillvalue (default None). If you expect lengths to match and a mismatch indicates a bug, assert equal lengths before zipping or, on Python 3.10 and later, pass strict=True so that zip() raises ValueError on a mismatch. The silent truncation is one of the most common sources of subtle bugs when working with parallel sequences.

Q6: How do you implement a custom iterator class? What is the difference between making a class an iterable vs making it an iterator?

A class becomes an iterable by implementing __iter__() that returns an iterator object (possibly a separate class instance). A class becomes an iterator by implementing both __iter__() (returning self) and __next__() (returning the next value or raising StopIteration). The distinction matters for reusability: an iterable can create fresh iterators each time it is passed to a for loop, so it can be iterated multiple times. An iterator maintains a single traversal position, so it can only be used once. Lists are iterables; list_iterator objects are iterators. For custom classes, use the two-class design (separate iterable and iterator) when multiple simultaneous traversals must be supported. Use the single-class design (iterable that is also its own iterator) for single-use data sources like files or network streams.

Graded Practice Challenges

Level 1 - Predict the Output

a = [1, 2, 3]
b = [4, 5]
c = [6, 7, 8, 9]

for x, y, z in zip(a, b, c):
    print(x + y + z)

print("---")

for i, val in enumerate("ABC", start=10):
    print(i, val)

print("---")

r = range(2, 10, 3)
print(list(r))
print(len(r))
print(7 in r)

What is the complete output? Trace without running.

Answer

11
14
---
10 A
11 B
12 C
---
[2, 5, 8]
3
False

zip(a, b, c): The shortest iterable is b with 2 elements, so only 2 tuples are produced: (1, 4, 6) and (2, 5, 7). Sums: 1+4+6 = 11, 2+5+7 = 14.

enumerate("ABC", start=10): Yields (10, 'A'), (11, 'B'), (12, 'C').

range(2, 10, 3): Generates 2, 5, 8 (2 + 0*3 = 2, 2 + 1*3 = 5, 2 + 2*3 = 8; 2 + 3*3 = 11 is past the stop value of 10). Length is 3. 7 in r is False - 7 is not generated by this range (2, 5, 8 are the only members).

Level 2 - Debug the Iterator Bug

This function is supposed to process all names in both lists. Find and fix the bug.

def process_all_names(primary_names, fallback_names):
    all_iter = iter(primary_names + fallback_names)
    results = []

    for name in all_iter:
        results.append(name.upper())

    # Process again with a prefix
    for name in all_iter:
        results.append("PREFIX_" + name)

    return results

names_a = ["alice", "bob"]
names_b = ["carol", "dave"]
output = process_all_names(names_a, names_b)
print(len(output)) # Expected: 8. Actual: ?

Answer

The bug: all_iter is created once from iter(primary_names + fallback_names). The first for loop exhausts all 4 names, so all_iter is at the end. When the second for loop starts, it calls iter(all_iter) which returns the same exhausted iterator - there are no more elements, so the second loop produces nothing. len(output) is 4, not 8.

Fix option 1: Recreate the iterator for the second loop.

def process_all_names(primary_names, fallback_names):
    combined = primary_names + fallback_names
    results = []

    for name in combined:
        results.append(name.upper())

    for name in combined:
        results.append("PREFIX_" + name)

    return results

Fix option 2: If a single-pass approach is required, process both transformations in one loop.

def process_all_names(primary_names, fallback_names):
    combined = primary_names + fallback_names
    results = []
    for name in combined:
        results.append(name.upper())
        results.append("PREFIX_" + name)
    return results

The root cause is confusing an iterator (single-use, stateful) with an iterable (reusable). Always iterate over the iterable (combined) rather than an iterator (all_iter) if multiple passes are needed.

Level 3 - Implement a Custom Cycle Iterator

Implement a Cycle class that takes an iterable and loops through it forever. Cycle([1, 2, 3]) should yield 1, 2, 3, 1, 2, 3, 1, 2, 3, ... indefinitely. Requirements:

  1. Implement __iter__ and __next__ (no itertools.cycle allowed).
  2. Handle empty iterables gracefully (raise StopIteration immediately).
  3. Write a take(n, iterator) function that returns the first n items from any iterator.
  4. Demonstrate it works with a non-list iterable (e.g., a string).

Answer

class Cycle:
    """
    An iterator that cycles through an iterable forever.
    The iterable is fully consumed once on construction and stored internally.
    Subsequent cycles replay from the stored copy.
    """

    def __init__(self, iterable):
        # Consume the iterable once and store as a list.
        # We need to replay it, so we must materialize it.
        self._data = list(iterable)
        if not self._data:
            # An empty iterable would cycle nothing - treat as exhausted
            self._exhausted = True
        else:
            self._exhausted = False
        self._index = 0

    def __iter__(self):
        return self

    def __next__(self):
        if self._exhausted:
            raise StopIteration
        value = self._data[self._index]
        self._index = (self._index + 1) % len(self._data)
        return value


def take(n, iterator):
    """Return a list of the first n items from iterator."""
    if n < 0:
        raise ValueError(f"n must be non-negative, got {n}")
    result = []
    for _ in range(n):
        try:
            result.append(next(iterator))
        except StopIteration:
            break
    return result


# Test with a list
c1 = Cycle([1, 2, 3])
print(take(8, c1)) # [1, 2, 3, 1, 2, 3, 1, 2]

# Test with a string (non-list iterable)
c2 = Cycle("AB")
print(take(6, c2)) # ['A', 'B', 'A', 'B', 'A', 'B']

# Test with empty iterable
c3 = Cycle([])
print(take(5, c3)) # []

# Test that it is an iterator (iter returns self)
c4 = Cycle([10, 20])
print(iter(c4) is c4) # True

# Test resuming from where it left off
print(next(c4)) # 10
print(next(c4)) # 20
print(next(c4)) # 10 - wrapped around
for val in take(3, c4):
    print(val) # 20, 10, 20

Key design decisions: the input iterable is fully consumed once during __init__ and stored as a list. This is necessary because we need to replay it. If the input is already a list, list() creates a copy (avoiding modification issues). The modulo arithmetic (self._index + 1) % len(self._data) wraps the index back to 0 when it reaches the end, producing the infinite cycle without any conditional logic.

Quick Reference Cheatsheet

| Concept | Key Detail | Example |
|---|---|---|
| for x in obj | Calls iter(obj), then next() in a loop | - |
| __iter__() | Must return an iterator (self if iterator) | return self |
| __next__() | Returns next value or raises StopIteration | - |
| iter(x) | Returns iterator from iterable | it = iter([1,2,3]) |
| next(it) | Returns next item from iterator | next(it) → 1 |
| range(n) | Lazy sequence, not a list - 48 bytes always (64-bit CPython) | range(10**9) is fine |
| enumerate(x, start=0) | Yields (index, value) pairs | for i, v in enumerate(lst) |
| zip(a, b) | Yields pairs, stops at shortest | Silent truncation! |
| zip_longest(a, b) | Yields pairs, fills shorter with fillvalue | from itertools import zip_longest |
| .items() | Iterate dict as (key, value) pairs | for k, v in d.items() |
| File iteration | File IS an iterator - reads line by line | for line in f: |
| Generator function | yield creates lazy iterator | def gen(): yield 1 |
| List comprehension | Syntactic sugar for loop + append, faster | [x**2 for x in range(10)] |
| Generator expression | Lazy version of comprehension | sum(x**2 for x in range(10)) |
| Mutating while iterating | Elements silently skipped or revisited | Iterate over copy instead |

Key Takeaways

  • A for loop in Python is not a counter loop. It is an iterator protocol engine: it calls iter() once to get an iterator, then calls next() repeatedly until StopIteration is raised.
  • An iterable produces fresh iterators each time via __iter__. An iterator is stateful, single-use, and must return self from its own __iter__.
  • range() is a lazy sequence object that stores only three integers and computes values on demand - range(10**9) uses 48 bytes, not gigabytes.
  • Prefer enumerate() over range(len(...)): it works with any iterable, is more readable, and is usually slightly faster.
  • zip() silently truncates at the shortest iterable - this is a common source of bugs. Use zip_longest when lengths may differ, or assert equal lengths when they must match.
  • Dict iteration is insertion-ordered in Python 3.7+ - this is a language guarantee, not an implementation detail.
  • File objects are iterators, not just iterables: they read line by line on demand, making them memory-efficient for arbitrarily large files.
  • Any class implementing __iter__ and __next__ participates in the iterator protocol and works with for loops, next(), zip(), enumerate(), and all other iterator consumers.
  • Generator functions with yield are the concise alternative to manual iterator classes - they implement the full protocol automatically.
  • Never mutate a list while iterating over it; iterate over a copy or build a new list using a comprehension.
© 2026 EngineersOfAI. All rights reserved.