Practice: The Iterator Protocol — How Python's for Loop Really Works
Easy
This problem is observational — run the manual desugaring of a for loop and confirm it produces the same output as for value in data: print(value).
Solution
data = [10, 20, 30, 40, 50]
it = iter(data)
while True:
    try:
        value = next(it)
        print(value)
    except StopIteration:
        break
Explanation: This loop is semantically equivalent to a for loop. (CPython compiles a for loop to dedicated GET_ITER and FOR_ITER instructions rather than a literal try/except, but the observable behaviour is the same.) Python calls iter(data) once to get the iterator, then calls next(it) on each pass. When StopIteration is raised, the loop exits. The desugaring also explains why a second for loop over the same iterator produces nothing: the iterator is already exhausted, so StopIteration is raised immediately.
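The exhaustion behaviour described above can be checked directly. The following is a minimal sketch; the variable names first_pass, second_pass, and third_pass are illustrative only.

```python
data = [10, 20, 30]
it = iter(data)

# The first pass consumes every element of the iterator.
first_pass = [value for value in it]
# The second pass sees an exhausted iterator and gets nothing.
second_pass = [value for value in it]

# Looping over the list itself works every time, because each loop
# calls iter(data) and receives a fresh iterator.
third_pass = [value for value in data]

print(first_pass)   # [10, 20, 30]
print(second_pass)  # []
print(third_pass)   # [10, 20, 30]
```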
# Manually iterate over a list using iter() and next()
# without using a for loop.
data = [10, 20, 30, 40, 50]
it = iter(data)
while True:
    try:
        value = next(it)
        print(value)
    except StopIteration:
        break
Expected Output
10
20
30
40
50
Hints
Hint 1: iter(obj) calls obj.__iter__() and returns the iterator.
Hint 2: next(it) calls it.__next__() and raises StopIteration when exhausted.
Hint 3: This is exactly what a for loop does under the hood.
Run the code to confirm that next(it, default) returns the default value instead of raising StopIteration when the iterator is exhausted.
Solution
data = [1, 2, 3]
it = iter(data)
print(next(it, 'EMPTY')) # 1
print(next(it, 'EMPTY')) # 2
print(next(it, 'EMPTY')) # 3
print(next(it, 'EMPTY')) # EMPTY
print(next(it, 'EMPTY')) # EMPTY
Explanation: The two-argument form next(it, default) is equivalent to a try/except StopIteration that returns default. It is commonly used when you want the "first matching element or None" pattern: next((x for x in items if condition), None).
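As a small sketch of the "first matching element or None" pattern mentioned above, the helper below wraps the generator expression in a function. The name first_match and its parameters are hypothetical, chosen only for this illustration.

```python
def first_match(items, predicate, default=None):
    """Return the first item for which predicate(item) is true, else default."""
    # The generator expression is lazy, so scanning stops at the first match.
    return next((x for x in items if predicate(x)), default)


numbers = [3, 8, 15, 22]
first_even = first_match(numbers, lambda n: n % 2 == 0)
first_big = first_match(numbers, lambda n: n > 100)

print(first_even)  # 8
print(first_big)   # None
```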
data = [1, 2, 3]
it = iter(data)
print(next(it, 'EMPTY')) # 1
print(next(it, 'EMPTY')) # 2
print(next(it, 'EMPTY')) # 3
print(next(it, 'EMPTY')) # EMPTY (exhausted — no exception)
print(next(it, 'EMPTY')) # EMPTY (still no exception)
Expected Output
1
2
3
EMPTY
EMPTY
Hints
Hint 1: next(iterator, default) returns default instead of raising StopIteration.
Hint 2: This is the safe form — useful when you want optional elements.
Run the code and observe that (a) two calls to iter(lst) produce independent iterators, and (b) iter(iterator) is iterator.
Solution
lst = [1, 2, 3]
it1 = iter(lst)
it2 = iter(lst)
print(next(it1)) # 1
print(next(it1)) # 2
print(next(it2)) # 1 — independent
print(iter(it1) is it1) # True
Explanation: An iterable implements __iter__() and produces a fresh iterator each call. A list is an iterable but not an iterator — it has no __next__. An iterator implements both __iter__() (returning self) and __next__(). This distinction matters: you can loop over a list multiple times, but a generator (an iterator) is one-shot.
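The difference between a re-iterable list and a one-shot generator can be observed with a short check like the one below. It is a sketch only; the variable names are illustrative.

```python
lst = [1, 2, 3]
gen = (x * 10 for x in lst)  # a generator object is an iterator

# A list has __iter__ but no __next__; a generator has both.
list_has_next = hasattr(lst, '__next__')
gen_has_next = hasattr(gen, '__next__')

# The list can be consumed repeatedly, the generator only once.
lst_first, lst_second = list(lst), list(lst)
gen_first, gen_second = list(gen), list(gen)

print(list_has_next, gen_has_next)  # False True
print(lst_first, lst_second)        # [1, 2, 3] [1, 2, 3]
print(gen_first, gen_second)        # [10, 20, 30] []
```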
# A list is an ITERABLE (has __iter__) but NOT an iterator (no __next__)
# iter(list) produces a fresh iterator each time
lst = [1, 2, 3]
it1 = iter(lst)
it2 = iter(lst) # independent iterator
print(next(it1)) # 1
print(next(it1)) # 2
print(next(it2)) # should be 1, not 3
# An iterator IS its own iterable — iter(iterator) returns itself
print(iter(it1) is it1) # True
Expected Output
1
2
1
True
Hints
Hint 1: Lists are iterables — they produce a new iterator each time iter() is called.
Hint 2: Iterators are stateful — two calls to iter(list) give two independent iterators.
Hint 3: Calling iter() on an iterator returns the same object (it is its own iterator).
Use the two-argument iter(callable, sentinel) form to collect all die rolls until (but not including) the first roll of 6.
Solution
import random
random.seed(42)
rolls = list(iter(lambda: random.randint(1, 6), 6))
print(rolls)
Explanation: iter(callable, sentinel) creates an iterator that calls callable() on each next() and stops (raises StopIteration) when the returned value equals sentinel. This form is useful for reading chunks from a file (iter(lambda: f.read(8192), b'')) or any "read until end marker" pattern without writing a custom loop.
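The chunked-read idiom mentioned above can be tried without a real file. This sketch uses io.BytesIO as a stand-in for a binary file object and a chunk size of 4 purely for illustration; functools.partial plays the same role as the lambda.

```python
import io
from functools import partial

# io.BytesIO stands in for a file opened in binary mode (an assumption
# made so the example is self-contained).
stream = io.BytesIO(b'abcdefghij')

# stream.read(4) is called repeatedly until it returns b'' at end of data.
chunks = list(iter(partial(stream.read, 4), b''))

print(chunks)  # [b'abcd', b'efgh', b'ij']
```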
# iter(callable, sentinel) calls callable() repeatedly
# and stops when the return value equals sentinel.
import random
random.seed(42)
# Simulate rolling a die until we get a 6
rolls = list(iter(lambda: random.randint(1, 6), 6))
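print(rolls) # all values before the first 6
Expected Output
[]
(With random.seed(42) on CPython 3, the first roll is already a 6, so nothing is collected. Try a different seed to see a non-empty list.)
Hints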
Hint 1: iter(callable, sentinel) is the two-argument form of iter().
Hint 2: The callable is called with no arguments on each iteration.
Hint 3: Iteration stops when callable() returns a value equal to sentinel.
Medium
Implement MyRange as a class-based iterator with __iter__ and __next__. It should behave like range(start, stop, step) for positive steps.
Solution
class MyRange:
    def __init__(self, start, stop, step=1):
        self._current = start
        self._stop = stop
        self._step = step

    def __iter__(self):
        return self

    def __next__(self):
        if self._current >= self._stop:
            raise StopIteration
        value = self._current
        self._current += self._step
        return value

for n in MyRange(0, 10, 2):
    print(n, end=' ')
print()
r = MyRange(1, 4)
it = iter(r)
print(next(it), next(it), next(it))
Explanation: The iterator protocol requires __iter__ (returning self) and __next__ (returning the next value or raising StopIteration). This design choice — the iterator is its own iterable — means a class-based iterator is single-use: once exhausted, you need a new instance to iterate again. Separate iterable + iterator classes allow multiple independent iterations.
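To make the single-use consequence concrete, here is a minimal sketch with a separate, hypothetical Countdown class (not part of the exercise) that follows the same "iterator is its own iterable" design.

```python
class Countdown:
    """Hypothetical single-class iterator that yields n, n-1, ..., 1."""

    def __init__(self, n):
        self._current = n

    def __iter__(self):
        # The iterator is its own iterable, so state is shared across loops.
        return self

    def __next__(self):
        if self._current <= 0:
            raise StopIteration
        value = self._current
        self._current -= 1
        return value


c = Countdown(3)
first = list(c)             # consumes the instance
second = list(c)            # same instance, already exhausted
fresh = list(Countdown(3))  # a new instance starts over

print(first, second, fresh)  # [3, 2, 1] [] [3, 2, 1]
```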
class MyRange:
    """A custom iterator mimicking range(start, stop, step)."""
    def __init__(self, start, stop, step=1):
        pass

    def __iter__(self):
        pass

    def __next__(self):
        pass

for n in MyRange(0, 10, 2):
    print(n, end=' ')
print()
# Should also support iter() returning self
r = MyRange(1, 4)
it = iter(r)
print(next(it), next(it), next(it))
Expected Output
0 2 4 6 8
1 2 3
Hints
Hint 1: __iter__ should return self (the iterator IS its own iterable).
Hint 2: __next__ should return the current value and advance the internal pointer.
Hint 3: Raise StopIteration when the current value is out of range.
Hint 4: Store current, stop, and step as instance attributes in __init__.
Implement a separate NumberSequence (iterable) and NumberIterator (iterator) so the sequence can be iterated multiple times independently.
Solution
class NumberSequence:
    def __init__(self, data):
        self._data = data

    def __iter__(self):
        return NumberIterator(self._data)

class NumberIterator:
    def __init__(self, data):
        self._data = data
        self._index = 0

    def __iter__(self):
        return self

    def __next__(self):
        if self._index >= len(self._data):
            raise StopIteration
        value = self._data[self._index]
        self._index += 1
        return value

seq = NumberSequence([10, 20, 30])
print(list(seq)) # [10, 20, 30]
print(list(seq)) # [10, 20, 30]
Explanation: Returning a fresh NumberIterator from NumberSequence.__iter__ is the key difference from the previous problem. Each for loop (or list() call) calls __iter__ on seq, creating a new NumberIterator starting at index 0. Lists, tuples, and sets all use this pattern — they are iterables, not iterators.
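A common shortcut for the same "fresh iterator per call" behaviour is to write __iter__ as a generator function, which returns a new generator object every time it is called. The sketch below is an alternative to the two-class solution, not a replacement for it; the class name NumberSequenceGen is hypothetical.

```python
class NumberSequenceGen:
    """Hypothetical variant: the iterable builds its iterator with yield."""

    def __init__(self, data):
        self._data = data

    def __iter__(self):
        # Calling a generator function creates a new, independent generator.
        for value in self._data:
            yield value


seq = NumberSequenceGen([10, 20, 30])
it1 = iter(seq)
it2 = iter(seq)

a = next(it1)  # 10
b = next(it1)  # 20
c = next(it2)  # 10, because it2 has its own position

print(a, b, c)                # 10 20 10
print(list(seq), list(seq))   # [10, 20, 30] [10, 20, 30]
```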
class NumberSequence:
    """An iterable (not an iterator) that can be looped multiple times."""
    def __init__(self, data):
        self._data = data

    def __iter__(self):
        # Return a NEW iterator each time
        pass

class NumberIterator:
    """The actual iterator: stateful, single-use."""
    def __init__(self, data):
        self._data = data
        self._index = 0

    def __iter__(self):
        pass

    def __next__(self):
        pass

seq = NumberSequence([10, 20, 30])
print(list(seq)) # [10, 20, 30]
print(list(seq)) # [10, 20, 30] <- works again, unlike a generator
Expected Output
[10, 20, 30]
[10, 20, 30]
Hints
Hint 1: NumberSequence.__iter__ should return NumberIterator(self._data) — a fresh one each call.
Hint 2: NumberIterator.__iter__ returns self.
Hint 3: NumberIterator.__next__ advances _index and raises StopIteration at the end.
Use itertools.chain to merge three different sequence types lazily, then use chain.from_iterable to flatten a list of lists.
Solution
import itertools
a = [1, 2, 3]
b = (4, 5, 6)
c = range(7, 10)
combined = itertools.chain(a, b, c)
print(list(combined)) # [1, 2, 3, 4, 5, 6, 7, 8, 9]
nested = [[1, 2], [3, 4], [5, 6]]
flat = list(itertools.chain.from_iterable(nested))
print(flat) # [1, 2, 3, 4, 5, 6]
Explanation: itertools.chain advances through each iterable in turn — when one is exhausted, it moves to the next. It works with any combination of types (list, tuple, range, generator) because all are iterables. chain.from_iterable accepts a single iterable of iterables, making it ideal for flattening without reduce.
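The laziness of chain can be made visible by logging when each source is actually consumed. This is a sketch under that assumption; the numbered generator and the consumed list are hypothetical helpers invented for the demonstration.

```python
import itertools

consumed = []  # records every element at the moment it is produced


def numbered(label, n):
    """Hypothetical generator that logs each element as it is yielded."""
    for i in range(n):
        item = f'{label}{i}'
        consumed.append(item)
        yield item


combined = itertools.chain(numbered('a', 2), numbered('b', 2))
before = list(consumed)  # nothing has been pulled yet

# Take only three elements; the fourth is never produced.
first_three = list(itertools.islice(combined, 3))
after = list(consumed)

print(before)       # []
print(first_three)  # ['a0', 'a1', 'b0']
print(after)        # ['a0', 'a1', 'b0']
```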
import itertools
a = [1, 2, 3]
b = (4, 5, 6)
c = range(7, 10)
# Combine a, b, c into one lazy sequence without creating a new list
combined = itertools.chain(a, b, c)
print(list(combined))
# Also show chain.from_iterable for a list of iterables
nested = [[1, 2], [3, 4], [5, 6]]
flat = list(itertools.chain.from_iterable(nested))
print(flat)
Expected Output
[1, 2, 3, 4, 5, 6, 7, 8, 9]
[1, 2, 3, 4, 5, 6]
Hints
Hint 1: itertools.chain(*iterables) returns a lazy iterator over them in sequence.
Hint 2: chain.from_iterable(iterable_of_iterables) is the flattening form.
Hint 3: Neither creates an intermediate list.
Use itertools.groupby to group words by their first letter. Note that groupby requires the input to be sorted by the grouping key.
Solution
import itertools
words = ['apple', 'avocado', 'banana', 'blueberry', 'cherry', 'cranberry']
for letter, group in itertools.groupby(words, key=lambda w: w[0]):
    print(letter, list(group))
Explanation: groupby produces a new group whenever the key function returns a different value. It does NOT sort — it only groups consecutive runs. If words were ['apple', 'banana', 'avocado'], you would get three groups: a, b, a. The group value is a lazy sub-iterator; always materialise it with list() before advancing to the next (letter, group) pair.
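The unsorted case described above, and the usual fix of sorting by the same key first, can be compared side by side. This is a minimal sketch; the helper name first_letter is illustrative.

```python
import itertools


def first_letter(word):
    return word[0]


words = ['apple', 'banana', 'avocado']

# Unsorted input: 'a' appears as two separate consecutive runs.
unsorted_groups = [(k, list(g)) for k, g in itertools.groupby(words, key=first_letter)]

# Sorting by the grouping key first merges the runs into one group per key.
by_letter = sorted(words, key=first_letter)
sorted_groups = [(k, list(g)) for k, g in itertools.groupby(by_letter, key=first_letter)]

print(unsorted_groups)  # [('a', ['apple']), ('b', ['banana']), ('a', ['avocado'])]
print(sorted_groups)    # [('a', ['apple', 'avocado']), ('b', ['banana'])]
```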
import itertools
# Group a list of words by their first letter
words = ['apple', 'avocado', 'banana', 'blueberry', 'cherry', 'cranberry']
# words is already sorted by first letter
for letter, group in itertools.groupby(words, key=lambda w: w[0]):
    print(letter, list(group))
Expected Output
a ['apple', 'avocado']
b ['banana', 'blueberry']
c ['cherry', 'cranberry']
Hints
Hint 1: groupby yields (key, group_iterator) pairs.
Hint 2: IMPORTANT: groupby only groups consecutive equal keys — the input must be sorted by key first.
Hint 3: Materialise the group with list() before moving to the next iteration; advancing groupby invalidates the previous group, which then appears empty.
Hard
Implement cyclic_sequence(items, start_at) that returns an infinite iterator cycling through items starting from the element matching start_at.
Solution
import itertools

def cyclic_sequence(items, start_at):
    if start_at not in items:
        raise ValueError(f"{start_at!r} not in items")
    start_index = items.index(start_at)
    # Rotate so the sequence begins at start_index
    reordered = items[start_index:] + items[:start_index]
    return itertools.cycle(reordered)

seq = cyclic_sequence(['A', 'B', 'C', 'D'], start_at='C')
print(list(itertools.islice(seq, 7)))
# ['C', 'D', 'A', 'B', 'C', 'D', 'A']
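Explanation: Rotating the list so that start_at comes first, then passing the rotated copy to itertools.cycle, is the cleanest approach. Slicing builds a new list, so the caller's list is not modified, but items must be a sequence such as a list or tuple that supports index() and slicing. An alternative is itertools.dropwhile(lambda x: x != start_at, itertools.cycle(items)), which skips elements until the first occurrence of start_at and then yields everything that follows. Because dropwhile stops testing its predicate after the first match, the rest of the cycle passes through unchanged. Keep the membership check in that version too: if start_at is missing, dropwhile would scan the infinite cycle forever.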
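The dropwhile alternative can be sketched as follows. The function name cyclic_sequence_dropwhile is hypothetical, and copying items into a list up front is an assumption made so the membership check works for any finite iterable.

```python
import itertools


def cyclic_sequence_dropwhile(items, start_at):
    """Hypothetical variant of cyclic_sequence built on dropwhile."""
    items = list(items)
    # Without this check, dropwhile would loop forever on a missing value.
    if start_at not in items:
        raise ValueError(f"{start_at!r} not in items")
    # Skip elements of the infinite cycle until start_at is reached;
    # from then on dropwhile passes everything through unchanged.
    return itertools.dropwhile(lambda x: x != start_at, itertools.cycle(items))


seq = cyclic_sequence_dropwhile(['A', 'B', 'C', 'D'], start_at='C')
result = list(itertools.islice(seq, 7))
print(result)  # ['C', 'D', 'A', 'B', 'C', 'D', 'A']
```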
import itertools

def cyclic_sequence(items, start_at):
    # Return an infinite iterator that cycles through items,
    # starting from the item equal to start_at.
    # If start_at is not found, raise ValueError.
    pass

seq = cyclic_sequence(['A', 'B', 'C', 'D'], start_at='C')
print(list(itertools.islice(seq, 7)))
Expected Output
['C', 'D', 'A', 'B', 'C', 'D', 'A']
Hints
Hint 1: itertools.cycle(items) creates an infinite cycling iterator.
Hint 2: You need to advance past all items before start_at.
Hint 3: itertools.dropwhile(pred, it) skips items while pred(item) is True.
Hint 4: One approach: find the index of start_at, then islice(cycle(items), index, None) to start there.
Implement PeekIterator — a wrapper around any iterator that adds peek() and has_next() without consuming elements.
Solution
class PeekIterator:
    _EXHAUSTED = object()

    def __init__(self, iterable):
        self._it = iter(iterable)
        self._next_val = next(self._it, self._EXHAUSTED)

    def __iter__(self):
        return self

    def __next__(self):
        if self._next_val is self._EXHAUSTED:
            raise StopIteration
        value = self._next_val
        self._next_val = next(self._it, self._EXHAUSTED)
        return value

    def peek(self, default=None):
        if self._next_val is self._EXHAUSTED:
            return default
        return self._next_val

    def has_next(self):
        return self._next_val is not self._EXHAUSTED
Explanation: The "one-element lookahead buffer" pattern stores the next value eagerly. The private sentinel _EXHAUSTED = object() is a unique object compared by identity (is), so it cannot collide with a real data value, which makes it safer than None (None could be a legitimate element). next(self._it, self._EXHAUSTED) returns the sentinel instead of raising StopIteration. One consequence of the eager fetch is that constructing a PeekIterator already consumes one element from the underlying iterator. more_itertools.peekable, from the third-party more-itertools package, implements the same idea with richer features.
class PeekIterator:
    """
    Wraps any iterator to add a peek() method that returns the next
    value WITHOUT advancing the iterator. peek() returns a default if exhausted.
    """
    _EXHAUSTED = object()

    def __init__(self, iterable):
        pass

    def __iter__(self):
        pass

    def __next__(self):
        pass

    def peek(self, default=None):
        """Return the next value without consuming it. Return default if exhausted."""
        pass

    def has_next(self):
        """Return True if there are more elements."""
        pass

it = PeekIterator([10, 20, 30])
print(it.peek()) # 10 (not consumed)
print(next(it)) # 10 (now consumed)
print(it.peek()) # 20
print(it.has_next()) # True
print(next(it)) # 20
print(next(it)) # 30
print(it.has_next()) # False
print(it.peek(-1)) # -1 (exhausted, returns default)
Expected Output
10
10
20
True
20
30
False
-1
Hints
Hint 1: Store the underlying iterator and a "buffered" value separately.
Hint 2: In __init__, eagerly fetch the first element into self._next_val.
Hint 3: peek() returns self._next_val (or default if _EXHAUSTED).
Hint 4: __next__ returns the buffered value and fetches the next one to buffer.
Implement merge_sorted(*iterables) that lazily merges multiple pre-sorted iterators into one sorted sequence. Use heapq.merge or implement a min-heap manually.
Solution (with manual heap implementation)
import heapq

# Simple version using heapq.merge
def merge_sorted(*iterables):
    yield from heapq.merge(*iterables)

# Manual version (educational)
def merge_sorted_manual(*iterables):
    done = object()  # unique end marker, so a real value is never mistaken for "exhausted"
    heap = []
    iters = [iter(it) for it in iterables]
    # Seed the heap with the first value of each non-empty iterator
    for idx, it in enumerate(iters):
        val = next(it, done)
        if val is not done:
            heapq.heappush(heap, (val, idx))
    while heap:
        # Pop the smallest value, then refill from the iterator it came from
        val, idx = heapq.heappop(heap)
        yield val
        nxt = next(iters[idx], done)
        if nxt is not done:
            heapq.heappush(heap, (nxt, idx))

a = [1, 4, 7, 10]
b = [2, 3, 8, 11]
c = [5, 6, 9, 12]
print(list(merge_sorted_manual(a, b, c)))
# [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]
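Explanation: heapq.merge is a lazy merge implemented as a pure-Python generator in the standard library's heapq module. At each step it yields the smallest remaining element across all inputs, using a min-heap whose size equals the number of iterables. Time complexity is O(n log k), where n is the total number of elements and k is the number of iterables. The manual implementation stores heap entries as (value, iterator_index) tuples: the index identifies which iterator to advance next and also breaks ties between equal values. If an entry carried the iterator object itself, as in (value, index, iterator), the index would also guarantee that heapq never falls through to comparing two iterators, which would raise TypeError.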
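One way to confirm that heapq.merge is lazy is to feed it infinite sorted inputs and take only a prefix. The sketch below uses itertools.count for the inputs; the variable names are illustrative.

```python
import heapq
import itertools

evens = itertools.count(0, 2)  # 0, 2, 4, ... never ends
odds = itertools.count(1, 2)   # 1, 3, 5, ... never ends

# An eager merge would never return; the lazy merge only pulls
# as many elements as the consumer asks for.
merged = heapq.merge(evens, odds)
first_ten = list(itertools.islice(merged, 10))

print(first_ten)  # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
```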
import heapq

def merge_sorted(*iterables):
    # Lazily merge multiple sorted iterables into one sorted sequence.
    # Do NOT convert any iterable to a list first.
    # Hint: use heapq.merge or implement with a min-heap.
    pass

a = iter([1, 4, 7, 10])
b = iter([2, 3, 8, 11])
c = iter([5, 6, 9, 12])
print(list(merge_sorted(a, b, c)))
Expected Output
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]
Hints
Hint 1: heapq.merge(*iterables) does exactly this — use it directly.
Hint 2: To implement manually: initialise a heap with (first_value, index, iterator) tuples.
Hint 3: Pop the smallest, yield it, then push the next value from the same iterator.
Hint 4: The index in the tuple breaks ties so equal values from different iterators do not crash heapq.
