Higher-Order Functions - Functions as First-Class Citizens

Reading time: ~20 minutes | Level: Foundation → Engineering

Here is behavior that surprises most Python developers:

numbers = [1, 2, 3, 4, 5]

result = map(lambda x: x * 2, numbers)

print(result)        # <map object at 0x10f3a2b50>
print(type(result))  # <class 'map'>
print(list(result))  # [2, 4, 6, 8, 10]
print(list(result))  # []  -- empty! The iterator is exhausted.

map() returns a lazy iterator, not a list. Once you consume it, it is gone. Call list() on it twice, and the second call returns nothing.

This is not a bug - it is a deliberate memory-efficiency design. Understanding why requires understanding higher-order functions from the ground up.

What You Will Learn

What a higher-order function is and why the concept is foundational to Python
How map(), filter(), and functools.reduce() work internally, including why they are lazy
When to choose map/filter versus list comprehensions and what the tradeoffs are
How sorted(), min(), and max() are higher-order functions in disguise
How functools.partial creates specialized functions from general ones
How functools.lru_cache implements memoization as a higher-order decorator
How to write your own higher-order functions: compose, pipeline, apply_all
How these patterns appear in ML feature pipelines, Flask middleware, and retry wrappers

Prerequisites

Python 3.8+ and comfort with the REPL
Understanding of function objects and first-class functions (topic 01)
Understanding of closures (topic 09)
Understanding of decorators (topic 10)
Basic familiarity with iterators and generators (topic 11)

Mental Model: What Is a Higher-Order Function?

A higher-order function is a function that does at least one of:

Takes another function as an argument
Returns a function as its result
Both

Category	Pattern	Examples
A: Takes a function argument	`caller` passes `func + data` → `higher-order fn` → output	`map()`, `filter()`, `sorted(key=)`, `min(key=)`, `max(key=)`
B: Returns a function	`caller` passes `config` → `higher-order fn` → new callable	`functools.partial()`, `lru_cache()`, `make_multiplier()`
C: Both (most powerful)	Takes and returns functions	`compose()`, `pipeline()`, decorator factories

Python's built-in map, filter, sorted, min, and max are all higher-order functions. You use them every day without necessarily thinking of them that way.

Part 1 - `map()`: Apply a Function to Every Element

Basic Usage

def square(x):
    return x * x

numbers = [1, 2, 3, 4, 5]
squared = map(square, numbers)

print(list(squared))   # [1, 4, 9, 16, 25]

map(func, iterable) applies func to every element and returns a map object (lazy iterator). The function is not called until you consume the iterator.

Why `map` Returns an Iterator, Not a List

	Eager evaluation (if `map` returned a list)	Lazy evaluation (how `map` actually works)
Input	`[1, 2, 3, 4, 5, ...10_000_000 elements...]`	`[1, 2, 3, 4, 5, ...10_000_000 elements...]`
Behavior	Processes ALL 10M items immediately; allocates a 10M-element list in memory	Returns a map object immediately (O(1) memory); processes one item only when requested via `next()`
Memory	O(n) - proportional to input size	O(1) - constant regardless of input size

This is critical for large datasets - a common scenario in data engineering and ML pipelines.

# Demonstrating lazy evaluation
import time

def slow_transform(x):
    # Simulates expensive computation
    time.sleep(0.001)
    return x * 2

big_range = range(1_000_000)

# map() returns instantly - no computation yet
t0 = time.time()
lazy = map(slow_transform, big_range)
t1 = time.time()
print(f"map() call: {(t1 - t0) * 1000:.2f}ms")   # ~0.00ms

# Computation happens when you consume
t0 = time.time()
first_five = [next(lazy) for _ in range(5)]
t1 = time.time()
print(f"First 5:    {(t1 - t0) * 1000:.2f}ms")   # ~5ms
print(first_five)   # [0, 2, 4, 6, 8]

`map` with Multiple Iterables

map can accept multiple iterables and a function with matching arity:

xs = [1, 2, 3, 4]
ys = [10, 20, 30, 40]

sums = list(map(lambda x, y: x + y, xs, ys))
print(sums)   # [11, 22, 33, 44]

# Equivalent to zip + comprehension
sums2 = [x + y for x, y in zip(xs, ys)]
print(sums2)  # [11, 22, 33, 44]

map stops when the shortest iterable is exhausted - same behavior as zip.

`map` vs List Comprehension

numbers = [1, 2, 3, 4, 5]

# Using map
doubled_map = list(map(lambda x: x * 2, numbers))

# Using list comprehension
doubled_comp = [x * 2 for x in numbers]

# Both produce [2, 4, 6, 8, 10]

:::tip When to Prefer Each Use list comprehensions when the transformation is simple and inline - they are more readable to most Python developers.

Use map when you already have a named function (avoids a redundant lambda), when you want lazy evaluation without materializing the list, or when chaining with other lazy operations.

# map shines when the function already exists
import math
numbers = [1.0, 4.0, 9.0, 16.0]

# Avoid: redundant lambda wrapping an existing function
list(map(lambda x: math.sqrt(x), numbers))

# Better: pass the function directly
list(map(math.sqrt, numbers))    # [1.0, 2.0, 3.0, 4.0]

:::

Part 2 - `filter()`: Keep Elements Where the Predicate Is True

Basic Usage

def is_even(n):
    return n % 2 == 0

numbers = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
evens = filter(is_even, numbers)

print(list(evens))   # [2, 4, 6, 8, 10]

filter(func, iterable) keeps only those elements for which func returns a truthy value. Like map, it returns a lazy filter object.

The `None` Shorthand

Passing None as the function filters out all falsy values:

mixed = [0, 1, "", "hello", None, [1, 2], [], False, True, 42]
truthy_only = list(filter(None, mixed))
print(truthy_only)   # [1, 'hello', [1, 2], True, 42]

This is a concise way to remove empty strings, zeros, None, empty lists, and False from a sequence.

`filter` vs List Comprehension with Condition

numbers = range(20)

# filter
odds_filter = list(filter(lambda x: x % 2 != 0, numbers))

# comprehension
odds_comp = [x for x in numbers if x % 2 != 0]

# Both: [1, 3, 5, 7, 9, 11, 13, 15, 17, 19]

Chaining `map` and `filter` for Lazy Pipelines

The real power emerges when you chain lazy iterators - no intermediate lists are created:

import math

raw_data = range(-10, 11)   # -10 to 10

# Chain: filter negatives, then apply sqrt, all lazily
non_negative = filter(lambda x: x >= 0, raw_data)
square_roots = map(math.sqrt, non_negative)

# Nothing has been computed yet - only when we consume
result = list(square_roots)
print(result)
# [0.0, 1.0, 1.4142..., 1.7320..., 2.0, 2.2360..., ..., 3.1622...]

Part 3 - `functools.reduce()`: Fold an Iterable to a Single Value

What `reduce` Does

reduce applies a two-argument function cumulatively to the elements of an iterable, reducing it to a single value. It is a left fold.

from functools import reduce

numbers = [1, 2, 3, 4, 5]

total = reduce(lambda acc, x: acc + x, numbers)
print(total)   # 15

The `initial` Argument

Always provide an initial value when the iterable might be empty:

from functools import reduce

# Without initial - crashes on empty list
try:
    reduce(lambda acc, x: acc + x, [])
except TypeError as e:
    print(e)   # reduce() of empty iterable with no initial value

# With initial - safe
result = reduce(lambda acc, x: acc + x, [], 0)
print(result)   # 0

# Initial value also sets the starting accumulator
product = reduce(lambda acc, x: acc * x, [1, 2, 3, 4], 1)
print(product)   # 24

Practical `reduce` Examples

from functools import reduce

# Flatten a list of lists
nested = [[1, 2], [3, 4], [5, 6]]
flat = reduce(lambda acc, x: acc + x, nested, [])
print(flat)   # [1, 2, 3, 4, 5, 6]

# Find the maximum without max()
numbers = [3, 1, 4, 1, 5, 9, 2, 6]
maximum = reduce(lambda a, b: a if a > b else b, numbers)
print(maximum)   # 9

# Count word frequencies
words = ["apple", "banana", "apple", "cherry", "banana", "apple"]
freq = reduce(
    lambda acc, w: {**acc, w: acc.get(w, 0) + 1},
    words,
    {}
)
print(freq)   # {'apple': 3, 'banana': 2, 'cherry': 1}

:::warning When NOT to Use reduce reduce is powerful but can become unreadable fast. Python provides clearer alternatives for most common reductions:

from functools import reduce

numbers = [1, 2, 3, 4, 5]

# Avoid this:
total = reduce(lambda a, b: a + b, numbers)

# Prefer these:
total   = sum(numbers)          # built-in sum
maximum = max(numbers)          # built-in max
flat    = [x for sub in nested for x in sub]   # comprehension for flattening

Reserve reduce for genuinely custom fold operations that have no built-in equivalent. :::

Part 4 - `sorted()`, `min()`, `max()` as Higher-Order Functions

These three built-ins accept a key= argument - a function applied to each element before comparison. This makes them higher-order functions.

products = [
    {"name": "Laptop",   "price": 999,  "stock": 3},
    {"name": "Monitor",  "price": 299,  "stock": 12},
    {"name": "Keyboard", "price": 79,   "stock": 0},
    {"name": "Mouse",    "price": 49,   "stock": 25},
]

# Sort by price ascending
by_price = sorted(products, key=lambda p: p["price"])
print([p["name"] for p in by_price])
# ['Mouse', 'Keyboard', 'Monitor', 'Laptop']

# Sort by stock descending
by_stock = sorted(products, key=lambda p: p["stock"], reverse=True)
print([p["name"] for p in by_stock])
# ['Mouse', 'Monitor', 'Laptop', 'Keyboard']

# Most expensive item
priciest = max(products, key=lambda p: p["price"])
print(priciest["name"])   # Laptop

# Cheapest in-stock item
in_stock = [p for p in products if p["stock"] > 0]
cheapest = min(in_stock, key=lambda p: p["price"])
print(cheapest["name"])   # Mouse

`operator` Module: Pre-built Key Functions

The operator module provides efficient, named alternatives to common lambdas:

from operator import itemgetter, attrgetter

products = [
    {"name": "Laptop",   "price": 999},
    {"name": "Monitor",  "price": 299},
    {"name": "Keyboard", "price": 79},
]

# itemgetter is faster than lambda for dict access
by_price = sorted(products, key=itemgetter("price"))
print([p["name"] for p in by_price])
# ['Keyboard', 'Monitor', 'Laptop']

# attrgetter for objects with attributes
from dataclasses import dataclass

@dataclass
class Product:
    name: str
    price: float

items = [Product("Laptop", 999), Product("Monitor", 299)]
by_price = sorted(items, key=attrgetter("price"))
print([i.name for i in by_price])   # ['Monitor', 'Laptop']

:::note Performance Note operator.itemgetter("key") is implemented in C and is measurably faster than lambda x: x["key"] in tight loops over large datasets. In ML feature engineering where you sort millions of records, this difference compounds. :::

Part 5 - `functools.partial`: Pre-Filling Arguments

partial creates a new function with some arguments already bound. It is like calling a function with a template.

from functools import partial

def power(base, exponent):
    return base ** exponent

# Create specialized functions
square = partial(power, exponent=2)
cube   = partial(power, exponent=3)

print(square(4))   # 16
print(cube(3))     # 27

# partial works with positional args too
triple_base = partial(power, 3)   # base=3 fixed

print(triple_base(4))  # 81  (3 ** 4)

`partial` in Practice: Configuring Functions for Pipelines

from functools import partial
import json

def serialize(data, indent=None, sort_keys=False, ensure_ascii=True):
    return json.dumps(data, indent=indent, sort_keys=sort_keys,
                      ensure_ascii=ensure_ascii)

# Create application-specific serializers
pretty_json  = partial(serialize, indent=2, sort_keys=True)
compact_json = partial(serialize, ensure_ascii=False)

data = {"name": "Alice", "age": 30, "city": "Berlin"}

print(pretty_json(data))
# {
#   "age": 30,
#   "city": "Berlin",
#   "name": "Alice"
# }

print(compact_json(data))
# {"name": "Alice", "age": 30, "city": "Berlin"}

# Real-world: partial to configure retry behavior
from functools import partial

def api_call(endpoint, retries=3, timeout=5.0, auth_token=None):
    print(f"Calling {endpoint} (retries={retries}, timeout={timeout})")

# Create a version pre-configured for production settings
prod_api = partial(api_call, retries=5, timeout=10.0, auth_token="secret-token")

# Call sites only need to specify the endpoint
prod_api("/users")    # Calling /users (retries=5, timeout=10.0)
prod_api("/courses")  # Calling /courses (retries=5, timeout=10.0)

How `partial` Works Internally

Part 6 - `functools.lru_cache`: Memoization as a Higher-Order Decorator

lru_cache (Least Recently Used cache) wraps a function so that results of previous calls are stored and returned on repeated calls with the same arguments.

from functools import lru_cache
import time

@lru_cache(maxsize=128)
def expensive_compute(n):
    """Fibonacci - exponential without caching, linear with it."""
    if n <= 1:
        return n
    return expensive_compute(n - 1) + expensive_compute(n - 2)

t0 = time.perf_counter()
print(expensive_compute(40))   # 102334155
t1 = time.perf_counter()
print(f"{(t1 - t0) * 1000:.2f}ms")   # ~0.1ms

# Check cache stats
print(expensive_compute.cache_info())
# CacheInfo(hits=38, misses=41, maxsize=128, currsize=41)

Without lru_cache, computing fib(40) would require over 300 million recursive calls. With it, 41 unique subproblems are solved once and reused.

`lru_cache` Is a Higher-Order Function

lru_cache(maxsize=128) takes maxsize and returns a decorator. That decorator takes a function and returns a new function (the memoized wrapper):

# These two are identical:

@lru_cache(maxsize=128)
def fib(n):
    if n <= 1:
        return n
    return fib(n - 1) + fib(n - 2)

# Equivalent to:
def fib_plain(n):
    if n <= 1:
        return n
    return fib_plain(n - 1) + fib_plain(n - 2)

fib_plain = lru_cache(maxsize=128)(fib_plain)
# lru_cache(maxsize=128) → returns decorator
# decorator(fib_plain)   → returns memoized wrapper

Cache Size and `maxsize=None`

from functools import lru_cache

# maxsize=None means unlimited cache (equivalent to @cache in Python 3.9+)
@lru_cache(maxsize=None)
def fib(n):
    if n <= 1:
        return n
    return fib(n - 1) + fib(n - 2)

# Python 3.9+ shorthand
from functools import cache

@cache   # same as lru_cache(maxsize=None)
def fib_v2(n):
    if n <= 1:
        return n
    return fib_v2(n - 1) + fib_v2(n - 2)

:::warning lru_cache Requires Hashable Arguments The cache key is a tuple of the function's arguments. Arguments must be hashable. Lists, dicts, and sets cannot be cache keys.

@lru_cache(maxsize=128)
def process(data):
    return sum(data)

process([1, 2, 3])   # TypeError: unhashable type: 'list'

# Fix: use tuple
@lru_cache(maxsize=128)
def process(data: tuple):
    return sum(data)

process((1, 2, 3))   # Works: 6

:::

Part 7 - Writing Your Own Higher-Order Functions

`apply_all`: Apply Multiple Functions to One Value

def apply_all(value, *funcs):
    """Apply every function in funcs to value. Return list of results."""
    return [f(value) for f in funcs]

import math

results = apply_all(
    16,
    math.sqrt,
    lambda x: x ** 2,
    lambda x: x // 3,
    str,
)
print(results)   # [4.0, 256, 5, '16']

`compose`: Build a Right-to-Left Function Chain

Function composition means compose(f, g)(x) == f(g(x)). Mathematically: (f o g)(x).

from functools import reduce

def compose(*funcs):
    """
    Return a function that applies funcs right-to-left.
    compose(f, g, h)(x)  ==  f(g(h(x)))
    """
    def composed(value):
        return reduce(lambda acc, f: f(acc), reversed(funcs), value)
    return composed

# Individual transforms
def strip_whitespace(s: str) -> str:
    return s.strip()

def to_lowercase(s: str) -> str:
    return s.lower()

def replace_spaces(s: str) -> str:
    return s.replace(" ", "_")

def add_prefix(s: str) -> str:
    return f"user_{s}"

# Compose right-to-left: strip → lowercase → replace → prefix
normalize = compose(add_prefix, replace_spaces, to_lowercase, strip_whitespace)

print(normalize("  Alice Smith  "))   # user_alice_smith
print(normalize("  BOB JONES   "))   # user_bob_jones

compose(f, g, h)(x) is equivalent to f(g(h(x))) - data flows right-to-left through the composition: x → h(x) → g(h(x)) → f(g(h(x))).

`pipeline`: Left-to-Right Composition (More Readable)

Many engineers prefer left-to-right data flow - the order you read it is the order it executes:

def pipeline(*funcs):
    """
    Return a function that applies funcs left-to-right.
    pipeline(f, g, h)(x)  ==  h(g(f(x)))
    """
    def run(value):
        for f in funcs:
            value = f(value)
        return value
    return run

# Read top-to-bottom: each step flows into the next
clean_name = pipeline(
    str.strip,
    str.lower,
    lambda s: s.replace(" ", "_"),
    lambda s: f"user_{s}",
)

print(clean_name("  Alice Smith  "))   # user_alice_smith

Part 8 - Practical Patterns in Real Engineering

ML Feature Engineering Pipeline

import math
from functools import partial

def log_transform(x: float) -> float:
    return math.log1p(max(x, 0))

def normalize(x: float, mean: float = 5.0, std: float = 2.0) -> float:
    return (x - mean) / std

def clip(x: float, low: float = -3.0, high: float = 3.0) -> float:
    return max(low, min(high, x))

def pipeline(*funcs):
    def run(value):
        for f in funcs:
            value = f(value)
        return value
    return run

# Build a reusable feature transform for an income column
income_transform = pipeline(
    log_transform,
    partial(normalize, mean=8.5, std=1.2),
    partial(clip, low=-3.0, high=3.0),
)

raw_incomes = [0, 25000, 75000, 150000, 500000]
transformed = list(map(income_transform, raw_incomes))
print([round(v, 3) for v in transformed])
# [-3.0, -2.765, 0.047, 1.388, 3.0]

Validation Pipeline

from typing import Callable, Any

def make_validator(*checks: Callable[[Any], bool]):
    """
    Return a function that checks all conditions.
    Returns (is_valid, list_of_failed_check_names).
    """
    def validate(value):
        failures = [
            check.__name__
            for check in checks
            if not check(value)
        ]
        return len(failures) == 0, failures
    return validate

def is_nonempty(s: str) -> bool:
    return bool(s.strip())

def is_email(s: str) -> bool:
    return "@" in s and "." in s.split("@")[-1]

def is_short_enough(s: str) -> bool:
    return len(s) <= 254

validate_email = make_validator(is_nonempty, is_email, is_short_enough)

print(validate_email("[email protected]"))   # (True, [])
print(validate_email("not-an-email"))        # (False, ['is_email'])
print(validate_email(""))                    # (False, ['is_nonempty', 'is_email'])

Retry Wrapper (Middleware Pattern)

import time
from functools import wraps

def retry(max_attempts: int = 3, delay: float = 1.0, exceptions=(Exception,)):
    """
    Higher-order function: wraps a function with retry logic.
    Returns a decorator.
    """
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            last_error = None
            for attempt in range(1, max_attempts + 1):
                try:
                    return func(*args, **kwargs)
                except exceptions as e:
                    last_error = e
                    if attempt < max_attempts:
                        print(f"Attempt {attempt} failed: {e}. Retrying...")
                        time.sleep(delay)
            raise last_error
        return wrapper
    return decorator

@retry(max_attempts=3, delay=0.5, exceptions=(ConnectionError, TimeoutError))
def fetch_data(url: str) -> dict:
    import random
    if random.random() < 0.6:
        raise ConnectionError("Network blip")
    return {"url": url, "data": "..."}

# This function will automatically retry up to 3 times
try:
    result = fetch_data("https://api.example.com/data")
    print(result)
except ConnectionError:
    print("All retries exhausted.")

Part 9 - `map`/`filter`/`reduce` vs Comprehensions: The Decision Guide

Prefer	When
`map()`	You already have a named function (no lambda needed)
`map()`	You want lazy evaluation (not materializing the full list)
`map()`	Chaining with other lazy operations (`filter` then `map` then ...)
List comprehension	The transform is simple and readable inline
List comprehension	You need a list immediately (not a lazy iterator)
List comprehension	The expression is more readable than a lambda
List comprehension	You want to combine filter+map in one expression
`reduce()`	No built-in exists (`sum`, `max`, `min`, `any`, `all` cover most cases)
`reduce()`	The accumulation logic is genuinely custom
Built-ins (`sum`, `max`, `min`, `any`, `all`)	They match your use case - they're faster and more readable than `reduce`

Operation	`map`/`filter`	Comprehension	Winner
Apply named function	`map(math.sqrt, nums)`	`[math.sqrt(x) for x in nums]`	`map` (cleaner)
Simple inline transform	`map(lambda x: x*2, nums)`	`[x*2 for x in nums]`	comprehension
Filter + transform	`map(f, filter(p, nums))`	`[f(x) for x in nums if p(x)]`	comprehension
Lazy pipeline	`map(f, filter(p, nums))`	N/A	`map`/`filter`
Memory efficiency	lazy, O(1)	eager, O(n)	`map`/`filter`

Interview Questions

Q1: What is a higher-order function? Give three examples from the Python standard library.

Answer: A higher-order function takes a function as an argument, returns a function, or both. Built-in examples from the standard library:

map(func, iterable) - takes a callable and an iterable, applies the callable to each element, returns a lazy iterator
filter(func, iterable) - takes a predicate function, keeps only elements for which it returns True, lazy iterator
sorted(iterable, key=func) - takes a key function that transforms each element before comparison
functools.lru_cache(maxsize=N) - takes maxsize config, returns a decorator (function that takes a function and returns a memoized wrapper)
functools.partial(func, *args, **kwargs) - takes a function and partial arguments, returns a new callable with those arguments pre-bound

Q2: Why does `map()` return an iterator instead of a list in Python 3? What was the change from Python 2?

Answer: In Python 2, map() returned a list - it eagerly computed all results and stored them in memory. Python 3 changed it to return a lazy iterator for two reasons: (1) memory efficiency - a lazy iterator processes one element at a time in O(1) space rather than allocating an O(n) list; (2) composability - lazy iterators can be chained (filter then map then map) without intermediate allocations. The cost is that the iterator is single-pass: once consumed, it is exhausted. Use list(map(...)) to get the old Python 2 behavior. The same philosophy changed filter(), zip(), range(), and dict.keys()/values()/items() in Python 3.

Q3: What is `functools.partial` and when would you use it instead of a lambda?

Answer: partial(func, *args, **kwargs) returns a new callable with some arguments of func pre-filled. When called, it merges the stored arguments with new ones and calls func. Use partial over lambda when: (1) the pre-filled function already has a good name (partial(json.dumps, indent=2) is clearer than lambda d: json.dumps(d, indent=2)); (2) you need the wrapper to be picklable - lambda functions cannot be pickled, but partial objects can, which matters for multiprocessing; (3) you want to inspect the original function via wrapper.func, wrapper.args, and wrapper.keywords. Lambda is fine for short, inline, single-use transformations where clarity is not sacrificed.

Q4: Implement `compose(f, g)` so that `compose(f, g)(x) == f(g(x))`. Then extend it to accept any number of functions.

Answer:

# Two-function version
def compose(f, g):
    def composed(x):
        return f(g(x))
    return composed

import math
sqrt_then_negate = compose(lambda x: -x, math.sqrt)
print(sqrt_then_negate(9))   # -3.0

# Variadic version using reduce (applies rightmost function first)
from functools import reduce

def compose(*funcs):
    """compose(f, g, h)(x) == f(g(h(x)))"""
    def composed(value):
        return reduce(lambda acc, f: f(acc), reversed(funcs), value)
    return composed

strip_and_lower = compose(str.lower, str.strip)
print(strip_and_lower("  HELLO  "))   # "hello"

The key insight: reversed(funcs) applies the rightmost function first, matching mathematical composition notation.

Q5: What does `functools.lru_cache` do internally? What are its limitations?

Answer: lru_cache(maxsize=N) returns a decorator that wraps a function with a dictionary-based cache. When the wrapped function is called, it constructs a cache key from the argument tuple. If the key is in the cache, the stored result is returned immediately (a cache hit). Otherwise the original function is called, the result is stored, and it is returned (a cache miss). When the cache is full, the least recently used entry is evicted. Limitations: (1) arguments must be hashable - lists, dicts, and sets cannot be cache keys; (2) the cache is not shared across processes or machines; (3) on instance methods, lru_cache can cause memory leaks because the cache holds a reference to self, preventing garbage collection - prefer module-level functions or manage cache size carefully; (4) it adds per-call overhead even on misses.

Q6: What is the difference between `map`/`filter` and a generator expression for lazy evaluation?

Answer: Both are lazy - they compute values on demand and are single-pass iterators. Differences: (1) syntax - map(func, it) requires a callable; a generator expression (func(x) for x in it) can inline arbitrary expressions; (2) map with a named function avoids a lambda wrapper and is slightly faster in CPython because no extra stack frame for a lambda is created; (3) generator expressions are more flexible - they support multiple for clauses and complex conditions that would need a lambda with map; (4) in practice, generator expressions are more Pythonic for inline transforms while map with a named function is idiomatic for applying an existing callable. Example: map(str.strip, lines) is cleaner than (line.strip() for line in lines) - but (x**2 for x in nums if x > 0) is cleaner than map(lambda x: x**2, filter(lambda x: x > 0, nums)).

Practice Challenges

Level 1 - Beginner: Transform a Dataset with `map` and `filter`

Problem: You have a list of raw temperature readings in Fahrenheit (some invalid - below -100 or above 200). Filter out invalid readings, convert valid ones to Celsius, and round to one decimal place.

raw_temps_f = [32, 98.6, -150, 212, 77, 250, -40, 0, 37.4]
# Expected: [0.0, 37.0, 100.0, 25.0, -40.0, -17.6]

Solution

raw_temps_f = [32, 98.6, -150, 212, 77, 250, -40, 0, 37.4]

def is_valid(f):
    return -100 <= f <= 200

def fahrenheit_to_celsius(f):
    return round((f - 32) * 5 / 9, 1)

# Using filter + map (lazy pipeline)
valid_f = filter(is_valid, raw_temps_f)
celsius = map(fahrenheit_to_celsius, valid_f)
result  = list(celsius)

print(result)   # [0.0, 37.0, 100.0, 25.0, -40.0, -17.6]

# Equivalent one-liner with comprehension:
result2 = [
    round((f - 32) * 5 / 9, 1)
    for f in raw_temps_f
    if -100 <= f <= 200
]
print(result2)  # [0.0, 37.0, 100.0, 25.0, -40.0, -17.6]

Both approaches produce the same result. The map/filter version is lazy - useful if raw_temps_f were a stream of millions of readings. The comprehension is more readable for a standalone computation.

Level 2 - Intermediate: Build a Composable Data Cleaning Pipeline

Problem: Build a text-cleaning pipeline for user-submitted tags. Each tag should be stripped of whitespace, lowercased, have internal whitespace replaced with hyphens, have non-alphanumeric characters (except hyphens) removed, and be no longer than 30 characters. Use pipeline() and partial to compose the transforms. Apply the pipeline to a list of raw tags with map, and remove any empty results with filter.

raw_tags = [
    "  Machine Learning  ",
    "Python!!!",
    "data   science",
    "A" * 40,       # too long
    "   ",          # becomes empty after cleaning
    "NLP / Tokenization",
]

Solution

import re
from functools import partial

def pipeline(*funcs):
    """Apply funcs left-to-right to a value."""
    def run(value):
        for f in funcs:
            value = f(value)
        return value
    return run

def replace_spaces(s: str) -> str:
    return re.sub(r'\s+', '-', s)

def remove_invalid_chars(s: str) -> str:
    return re.sub(r'[^a-z0-9\-]', '', s)

def truncate(s: str, max_len: int = 30) -> str:
    return s[:max_len]

# Compose the cleaning pipeline
clean_tag = pipeline(
    str.strip,
    str.lower,
    replace_spaces,
    remove_invalid_chars,
    partial(truncate, max_len=30),
)

raw_tags = [
    "  Machine Learning  ",
    "Python!!!",
    "data   science",
    "A" * 40,
    "   ",
    "NLP / Tokenization",
]

# Apply pipeline lazily, filter out empty results
cleaned = list(filter(None, map(clean_tag, raw_tags)))
print(cleaned)
# ['machine-learning', 'python', 'data---science',
#  'aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa', 'nlp--tokenization']

# The pipeline is reusable:
more_tags = ["  AI   ", "PyTorch!!", "  TensorFlow  "]
print(list(filter(None, map(clean_tag, more_tags))))
# ['ai', 'pytorch', 'tensorflow']

Level 3 - Advanced: Implement `lru_cache` from Scratch

Problem: Implement two things: (1) a memoize decorator that caches results using a dictionary; (2) a lru_memoize(maxsize) decorator that evicts the least recently used entry when the cache is full. Use your lru_memoize to cache a recursive function that computes the sum of values in a nested dictionary tree. Demonstrate cache hits on repeated subtree queries.

Solution

from functools import wraps
from collections import OrderedDict

# Part 1: Simple memoize
def memoize(func):
    """Cache results of func keyed by its arguments."""
    cache = {}

    @wraps(func)
    def wrapper(*args, **kwargs):
        key = (args, tuple(sorted(kwargs.items())))
        if key not in cache:
            cache[key] = func(*args, **kwargs)
        else:
            wrapper.hits += 1
        return cache[key]

    wrapper.cache = cache
    wrapper.hits  = 0
    return wrapper


# Part 2: LRU memoize using OrderedDict
def lru_memoize(maxsize: int = 128):
    """LRU cache decorator with configurable maximum size."""
    def decorator(func):
        cache = OrderedDict()   # insertion-ordered dict

        @wraps(func)
        def wrapper(*args, **kwargs):
            key = (args, tuple(sorted(kwargs.items())))

            if key in cache:
                cache.move_to_end(key)   # mark as most recently used
                wrapper.hits += 1
                return cache[key]

            result = func(*args, **kwargs)
            cache[key] = result
            cache.move_to_end(key)
            wrapper.misses += 1

            if len(cache) > maxsize:
                cache.popitem(last=False)   # remove oldest (LRU)

            return result

        wrapper.cache  = cache
        wrapper.hits   = 0
        wrapper.misses = 0

        def cache_info():
            return {
                "hits":    wrapper.hits,
                "misses":  wrapper.misses,
                "maxsize": maxsize,
                "currsize": len(cache),
            }
        wrapper.cache_info = cache_info
        return wrapper
    return decorator


# Part 3: Tree-traversal with LRU cache
@lru_memoize(maxsize=64)
def tree_sum(node_id: str, tree: tuple) -> int:
    """
    Compute sum of all numeric leaf values in a nested dict tree.
    tree is passed as a tuple of (key, value) pairs to make it hashable.
    """
    tree_dict = dict(tree)
    node = tree_dict[node_id]

    if isinstance(node, int):
        return node

    if isinstance(node, dict):
        return sum(
            tree_sum(child_id, tree)
            for child_id in node.get("children", [])
        )
    return 0


tree_data = {
    "root": {"children": ["a", "b", "c"]},
    "a":    {"children": ["a1", "a2"]},
    "b":    30,
    "c":    {"children": ["c1"]},
    "a1":   10,
    "a2":   20,
    "c1":   40,
}
tree_tuple = tuple(sorted(tree_data.items()))

# First query - all misses
print(tree_sum("root", tree_tuple))   # 100
print(tree_sum.cache_info())
# {'hits': 0, 'misses': 6, 'maxsize': 64, 'currsize': 6}

# Repeated queries - all hits
print(tree_sum("root", tree_tuple))   # 100
print(tree_sum("a", tree_tuple))      # 30  (a1=10 + a2=20)
print(tree_sum("c", tree_tuple))      # 40
print(tree_sum.cache_info())
# {'hits': 3, 'misses': 6, 'maxsize': 64, 'currsize': 6}

# Demonstrate LRU eviction with tiny cache
@lru_memoize(maxsize=2)
def slow_square(n):
    return n * n

slow_square(1)   # miss, cache: {(1,..): 1}
slow_square(2)   # miss, cache: {(1,..): 1, (2,..): 4}
slow_square(3)   # miss, full - evict key 1, cache: {(2,..): 4, (3,..): 9}
slow_square(2)   # hit
slow_square(1)   # miss again (was evicted)
print(slow_square.cache_info())
# {'hits': 1, 'misses': 4, 'maxsize': 2, 'currsize': 2}

Key engineering insights:

OrderedDict.move_to_end(key) is O(1) - LRU tracking without scanning the whole cache
popitem(last=False) removes the oldest entry - O(1) eviction
Passing mutable structures (dicts) requires converting to hashable types (tuples of pairs) before using as cache keys
functools.lru_cache is implemented in C and is far faster than this Python version - use the standard library version in production

Quick Reference

Function	Signature	Returns	Lazy?	Notes
`map`	`map(func, *iterables)`	`map` object	Yes	Stops at shortest iterable
`filter`	`filter(func, iterable)`	`filter` object	Yes	`None` as func filters falsy values
`reduce`	`reduce(func, iterable, initial)`	single value	No	Import from `functools`
`sorted`	`sorted(it, key=func, reverse=bool)`	`list`	No	Stable sort; `key=` is HOF
`min`	`min(it, key=func)`	element	No	`key=` applies before comparison
`max`	`max(it, key=func)`	element	No	`key=` applies before comparison
`partial`	`partial(func, args, *kw)`	`partial` object	-	Pre-fills arguments; picklable
`lru_cache`	`@lru_cache(maxsize=N)`	decorator	-	`maxsize=None` means unlimited
`cache`	`@cache` (3.9+)	decorator	-	Same as `lru_cache(maxsize=None)`
`compose(*fs)`	user-defined	function	-	`compose(f,g)(x) == f(g(x))`
`pipeline(*fs)`	user-defined	function	-	`pipeline(f,g)(x) == g(f(x))`

Key Takeaways

A higher-order function takes a function as an argument, returns a function, or both - map, filter, sorted, partial, and lru_cache are all higher-order functions in the standard library
map() and filter() return lazy iterators in Python 3, computing values on demand in O(1) memory regardless of input size - but the iterator is single-pass and exhausted after one consumption
functools.reduce() folds an iterable to a single value via left accumulation - prefer built-ins (sum, max, any) for common reductions, and reduce only for genuinely custom logic
functools.partial pre-fills arguments to create specialized functions - it is picklable (unlike lambda) and self-documenting through .func, .args, and .keywords
functools.lru_cache wraps a function with a dictionary cache and requires hashable arguments - use cache_info() to monitor hit rates and tune maxsize
Function composition (compose) and pipelines (pipeline) let you build complex transforms from simple, testable, pure functions - the backbone of data transformation and ML feature engineering
Choose map/filter for lazy pipelines and named-function application; choose list comprehensions for inline transforms and combined filter+map where readability wins

:::note What Comes Next This lesson covered higher-order functions - taking functions as arguments and returning functions. The final lesson in this module applies everything: how to design function APIs that are clean, predictable, and a joy to use. The next page shows how first-class functions, closures, decorators, and higher-order thinking combine into professional-grade interface design. :::

What You Will Learn
Prerequisites
Mental Model: What Is a Higher-Order Function?
Part 1 - map(): Apply a Function to Every Element
Part 2 - filter(): Keep Elements Where the Predicate Is True
Part 3 - functools.reduce(): Fold an Iterable to a Single Value
Part 4 - sorted(), min(), max() as Higher-Order Functions
- operator Module: Pre-built Key Functions
Part 5 - functools.partial: Pre-Filling Arguments
- partial in Practice: Configuring Functions for Pipelines
- How partial Works Internally
Part 6 - functools.lru_cache: Memoization as a Higher-Order Decorator
- lru_cache Is a Higher-Order Function
- Cache Size and maxsize=None
Part 7 - Writing Your Own Higher-Order Functions
Part 8 - Practical Patterns in Real Engineering
Part 9 - map/filter/reduce vs Comprehensions: The Decision Guide
Interview Questions
Practice Challenges
Quick Reference
Key Takeaways

What You Will Learn​

Prerequisites​

Mental Model: What Is a Higher-Order Function?​

Part 1 - map(): Apply a Function to Every Element​

Basic Usage​

Why map Returns an Iterator, Not a List​

map with Multiple Iterables​

map vs List Comprehension​

Part 2 - filter(): Keep Elements Where the Predicate Is True​

Basic Usage​

The None Shorthand​

filter vs List Comprehension with Condition​

Chaining map and filter for Lazy Pipelines​

Part 3 - functools.reduce(): Fold an Iterable to a Single Value​

What reduce Does​

The initial Argument​

Practical reduce Examples​

Part 4 - sorted(), min(), max() as Higher-Order Functions​

operator Module: Pre-built Key Functions​

Part 5 - functools.partial: Pre-Filling Arguments​

partial in Practice: Configuring Functions for Pipelines​

How partial Works Internally​

Part 6 - functools.lru_cache: Memoization as a Higher-Order Decorator​

lru_cache Is a Higher-Order Function​

Cache Size and maxsize=None​

Part 7 - Writing Your Own Higher-Order Functions​

apply_all: Apply Multiple Functions to One Value​

compose: Build a Right-to-Left Function Chain​

pipeline: Left-to-Right Composition (More Readable)​

Part 8 - Practical Patterns in Real Engineering​

ML Feature Engineering Pipeline​

Validation Pipeline​

Retry Wrapper (Middleware Pattern)​

Part 9 - map/filter/reduce vs Comprehensions: The Decision Guide​

Interview Questions​

Q1: What is a higher-order function? Give three examples from the Python standard library.​

Q2: Why does map() return an iterator instead of a list in Python 3? What was the change from Python 2?​

Q3: What is functools.partial and when would you use it instead of a lambda?​

Q4: Implement compose(f, g) so that compose(f, g)(x) == f(g(x)). Then extend it to accept any number of functions.​

Q5: What does functools.lru_cache do internally? What are its limitations?​

Q6: What is the difference between map/filter and a generator expression for lazy evaluation?​

Practice Challenges​

Level 1 - Beginner: Transform a Dataset with map and filter​

Level 2 - Intermediate: Build a Composable Data Cleaning Pipeline​

Level 3 - Advanced: Implement lru_cache from Scratch​

Quick Reference​

Key Takeaways​

What You Will Learn

Prerequisites

Mental Model: What Is a Higher-Order Function?

Part 1 - `map()`: Apply a Function to Every Element

Basic Usage

Why `map` Returns an Iterator, Not a List

`map` with Multiple Iterables

`map` vs List Comprehension

Part 2 - `filter()`: Keep Elements Where the Predicate Is True

Basic Usage

The `None` Shorthand

`filter` vs List Comprehension with Condition

Chaining `map` and `filter` for Lazy Pipelines

Part 3 - `functools.reduce()`: Fold an Iterable to a Single Value

What `reduce` Does

The `initial` Argument

Practical `reduce` Examples

Part 4 - `sorted()`, `min()`, `max()` as Higher-Order Functions

`operator` Module: Pre-built Key Functions

Part 5 - `functools.partial`: Pre-Filling Arguments

`partial` in Practice: Configuring Functions for Pipelines

How `partial` Works Internally

Part 6 - `functools.lru_cache`: Memoization as a Higher-Order Decorator

`lru_cache` Is a Higher-Order Function

Cache Size and `maxsize=None`

Part 7 - Writing Your Own Higher-Order Functions

`apply_all`: Apply Multiple Functions to One Value

`compose`: Build a Right-to-Left Function Chain

`pipeline`: Left-to-Right Composition (More Readable)

Part 8 - Practical Patterns in Real Engineering

ML Feature Engineering Pipeline

Validation Pipeline

Retry Wrapper (Middleware Pattern)

Part 9 - `map`/`filter`/`reduce` vs Comprehensions: The Decision Guide

Interview Questions

Q1: What is a higher-order function? Give three examples from the Python standard library.

Q2: Why does `map()` return an iterator instead of a list in Python 3? What was the change from Python 2?

Q3: What is `functools.partial` and when would you use it instead of a lambda?

Q4: Implement `compose(f, g)` so that `compose(f, g)(x) == f(g(x))`. Then extend it to accept any number of functions.

Q5: What does `functools.lru_cache` do internally? What are its limitations?

Q6: What is the difference between `map`/`filter` and a generator expression for lazy evaluation?

Practice Challenges

Level 1 - Beginner: Transform a Dataset with `map` and `filter`

Level 2 - Intermediate: Build a Composable Data Cleaning Pipeline

Level 3 - Advanced: Implement `lru_cache` from Scratch

Quick Reference

Key Takeaways