
Python List Comprehensions Deep Dive: Practice Problems & Exercises

Practice: List Comprehensions Deep Dive

12 problems · 4 Easy · 5 Medium · 3 Hard · 45–60 min

Easy

#1 Basic List Comprehension: Squares (Easy)
list-comprehension, transform, basics

Use a list comprehension to create a list of squares of numbers from 1 to 10.

Python
squares = [x * x for x in range(1, 11)]
print(squares)
Solution
squares = [x * x for x in range(1, 11)]
print(squares)

Output:

[1, 4, 9, 16, 25, 36, 49, 64, 81, 100]

How it works: The list comprehension [x * x for x in range(1, 11)] iterates over each integer from 1 to 10. For each value of x, it computes x * x and places the result into a new list. This is equivalent to:

squares = []
for x in range(1, 11):
    squares.append(x * x)

Key insight: The comprehension version is not just shorter. In CPython it compiles to bytecode that uses the LIST_APPEND opcode, avoiding the overhead of looking up and calling squares.append() on every iteration. This typically makes it faster than the equivalent for loop, roughly 25-35% in simple cases, though the exact figure depends on the Python version and on how much work is done per item.
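One way to check the bytecode claim on your own interpreter is to disassemble both forms. This is a minimal sketch that assumes CPython, where opcode names are an implementation detail and may change between versions; opnames is a hypothetical helper introduced only for this illustration.

```python
import dis
import types


def opnames(code):
    """Collect opcode names from a code object and any nested code objects."""
    names = {ins.opname for ins in dis.get_instructions(code)}
    for const in code.co_consts:
        # Before Python 3.12 a comprehension body lives in its own code object.
        if isinstance(const, types.CodeType):
            names |= opnames(const)
    return names


comp_code = compile("[x * x for x in range(1, 11)]", "<comp>", "eval")
loop_code = compile(
    "out = []\nfor x in range(1, 11):\n    out.append(x * x)\n",
    "<loop>",
    "exec",
)

comp_ops = opnames(comp_code)
loop_ops = opnames(loop_code)

print("comprehension uses LIST_APPEND:", "LIST_APPEND" in comp_ops)
print("loop uses LIST_APPEND:", "LIST_APPEND" in loop_ops)
```

To measure the speed difference rather than take it on faith, time both forms with timeit on the interpreter you actually deploy.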

Expected Output
[1, 4, 9, 16, 25, 36, 49, 64, 81, 100]
Hints

Hint 1: The syntax is [expression for variable in iterable].

Hint 2: range(1, 11) produces integers from 1 through 10 inclusive.

#2 Filtering with Trailing if (Easy)
list-comprehension, filter, conditional

Use a list comprehension with a filter to extract all numbers from 1 to 30 that are even but NOT divisible by 3.

Python
result = [x for x in range(1, 31) if x % 2 == 0 if x % 3 != 0]
print(result)
Solution
result = [x for x in range(1, 31) if x % 2 == 0 if x % 3 != 0]
print(result)

Output:

[2, 8, 14, 20, 26]

How it works: Multiple trailing if clauses act as AND conditions. The comprehension keeps only values where both x % 2 == 0 and x % 3 != 0 are True. This is equivalent to writing if x % 2 == 0 and x % 3 != 0.

Why these numbers? Even numbers from 1-30 are 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30. Removing those also divisible by 3 (6, 12, 18, 24, 30) leaves 2, 8, 14, 20, 26. These are numbers divisible by 2 but not by 6.

Key insight: The trailing if is a filter — it reduces the number of items in the output. Do not confuse it with the ternary if-else in the expression position, which transforms values but keeps all items.

Expected Output
[2, 8, 14, 20, 26]
Hints

Hint 1: A trailing if clause filters which items are included — no else allowed.

Hint 2: You need two conditions: divisible by 2 AND not divisible by 3.

#3 Ternary Expression in Comprehension (Easy)
list-comprehension, ternary, transform

Use a list comprehension with a ternary expression to label each number from 1 to 10 as "even" or "odd".

Python
labels = ["even" if x % 2 == 0 else "odd" for x in range(1, 11)]
print(labels)
Solution
labels = ["even" if x % 2 == 0 else "odd" for x in range(1, 11)]
print(labels)

Output:

['odd', 'even', 'odd', 'even', 'odd', 'even', 'odd', 'even', 'odd', 'even']

How it works: The ternary expression "even" if x % 2 == 0 else "odd" sits in the expression position of the comprehension — it determines what value each element has, not whether it is included. Every item from range(1, 11) produces exactly one output.

Comprehension anatomy:

[ "even" if x % 2 == 0 else "odd" for x in range(1, 11) ]
  ↑ conditional expression        ↑ iteration
  (transforms: every item kept)

Key insight: A ternary in the expression position transforms — the output list has the same length as the input. A trailing if filters — the output may be shorter. These are fundamentally different operations and cannot be combined in the same clause position.
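The two positions can still appear in the same comprehension, each doing its own job. A small sketch; the cutoff x > 5 is an arbitrary choice for illustration:

```python
numbers = range(1, 11)

# Ternary only: transforms, so every input produces exactly one output.
labels = ["even" if x % 2 == 0 else "odd" for x in numbers]

# Trailing if only: filters, so the output may be shorter than the input.
evens = [x for x in numbers if x % 2 == 0]

# Both together: the trailing if selects items, the ternary labels them.
labels_above_five = ["even" if x % 2 == 0 else "odd" for x in numbers if x > 5]

print(len(labels), len(evens))
print(labels_above_five)
```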

Expected Output
['odd', 'even', 'odd', 'even', 'odd', 'even', 'odd', 'even', 'odd', 'even']
Hints

Hint 1: The ternary form is: value_if_true if condition else value_if_false.

Hint 2: This goes in the expression position (before the for), not as a trailing filter.

#4 Set Comprehension: Unique Domains (Easy)
set-comprehension, string-methods, deduplication

Use a set comprehension to extract all unique email domains from a list of addresses.

Python
emails = [
    "[email protected]",
    "[email protected]",
    "[email protected]",
    "[email protected]",
    "[email protected]",
]

domains = {email.split("@")[1] for email in emails}
print(len(domains))
print("gmail.com" in domains)
print("yahoo.com" in domains)
print("hotmail.com" in domains)
Solution
emails = [
]

domains = {email.split("@")[1] for email in emails}
print(len(domains))
print("gmail.com" in domains)
print("yahoo.com" in domains)
print("hotmail.com" in domains)

Output:

3
True
True
False

How it works: The set comprehension {email.split("@")[1] for email in emails} splits each email at @ and takes the domain part (index 1). Since sets automatically deduplicate, gmail.com and yahoo.com each appear only once despite multiple emails using them.

Key insight: Set comprehensions look like dict comprehensions but without the colon. {expr for x in it} creates a set; {k: v for x in it} creates a dict. Be careful: {} alone creates an empty dict, not an empty set. Use set() for an empty set.
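A runnable sketch of the same idea follows. The addresses are hypothetical, chosen only so that gmail.com and yahoo.com each appear twice and a third domain appears once, which matches the expected output; the last two lines illustrate the {} versus set() point.

```python
# Hypothetical addresses, made up for illustration only.
emails = [
    "alice@gmail.com",
    "bob@yahoo.com",
    "carol@gmail.com",
    "dave@outlook.com",
    "erin@yahoo.com",
]

# Curly braces without a colon build a set, so duplicates collapse.
domains = {email.split("@")[1] for email in emails}

print(len(domains))
print("gmail.com" in domains)
print("yahoo.com" in domains)
print("hotmail.com" in domains)

# Empty braces are a dict; an empty set needs set().
print(type({}).__name__, type(set()).__name__)
```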

Expected Output
3
True
True
False
Hints

Hint 1: Set comprehensions use curly braces: {expression for var in iterable}.

Hint 2: str.split("@")[1] extracts the domain portion of an email address.


Medium

#5 Flatten a Matrix with Nested Comprehension (Medium)
nested-comprehension, flatten, matrix

Use a nested list comprehension to flatten a 3x3 matrix into a single list. Then flatten again but keep only even numbers.

Python
matrix = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

flat = [x for row in matrix for x in row]
print(flat)

flat_evens = [x for row in matrix for x in row if x % 2 == 0]
print(flat_evens)
Solution
matrix = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

flat = [x for row in matrix for x in row]
print(flat)

flat_evens = [x for row in matrix for x in row if x % 2 == 0]
print(flat_evens)

Output:

[1, 2, 3, 4, 5, 6, 7, 8, 9]
[2, 4, 6, 8]

Reading order for nested comprehensions: The for-clauses read left-to-right, matching the equivalent nested loop:

# This comprehension:
flat = [x for row in matrix for x in row]

# Is equivalent to:
flat = []
for row in matrix:          # first for in the comprehension
    for x in row:           # second for in the comprehension
        flat.append(x)      # expression at the start

The outer loop (for row in matrix) appears first in the comprehension. This is the part that trips up most engineers — they expect the inner loop first because the expression x comes from the innermost loop.

Adding a filter: The trailing if x % 2 == 0 applies to the innermost loop, filtering individual elements after they are extracted from each row.

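Key insight: For simple one-level flattening, consider itertools.chain.from_iterable(matrix) instead. It returns a lazy iterator (wrap it in list() when you need a list), is often more readable, and performs comparably. It flattens exactly one level, so deeper nesting or per-element filtering still calls for a comprehension or an explicit loop.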
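A brief comparison of the two flattening approaches on the exercise's matrix, as a sketch:

```python
from itertools import chain

matrix = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

# Comprehension with two for clauses.
flat_comp = [x for row in matrix for x in row]

# chain.from_iterable yields the same items lazily; list() materializes them.
flat_chain = list(chain.from_iterable(matrix))

# The two combine well: chain flattens, a trailing if filters.
flat_evens = [x for x in chain.from_iterable(matrix) if x % 2 == 0]

print(flat_comp == flat_chain)
print(flat_evens)
```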

Expected Output
[1, 2, 3, 4, 5, 6, 7, 8, 9]
[2, 4, 6, 8]
Hints

Hint 1: For flattening, the outer for comes first (left to right): [x for row in matrix for x in row].

Hint 2: You can add a trailing if to filter elements during the flatten.

#6 Dict Comprehension: Invert and Filter (Medium)
dict-comprehension, invert, filter

Use dict comprehensions to first filter a dictionary (keep only items where the value is greater than 1), then invert the filtered result (swap keys and values).

Python
data = {"a": 1, "b": 2, "c": 3, "d": 4}

filtered = {k: v for k, v in data.items() if v > 1}
print(filtered)

inverted = {v: k for k, v in filtered.items()}
print(inverted)
Solution
data = {"a": 1, "b": 2, "c": 3, "d": 4}

filtered = {k: v for k, v in data.items() if v > 1}
print(filtered)

inverted = {v: k for k, v in filtered.items()}
print(inverted)

Output:

{'b': 2, 'c': 3, 'd': 4}
{2: 'b', 3: 'c', 4: 'd'}

How it works:

  1. Filtering: {k: v for k, v in data.items() if v > 1} iterates over all key-value pairs and keeps only those where the value exceeds 1. The item "a": 1 is excluded.

  2. Inverting: {v: k for k, v in filtered.items()} swaps each key-value pair — the old value becomes the new key and vice versa.

Warning about inverting: Inversion only works correctly when values are unique and hashable. If two keys share the same value, only the last one survives (dict keys must be unique). For example, inverting {"a": 1, "b": 1} yields {1: "b"}, and "a" is silently lost.
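The collision can be reproduced in a few lines. The grouping step that follows is one possible workaround, not part of the exercise, and the sample dictionary is made up for illustration:

```python
from collections import defaultdict

data = {"a": 1, "b": 1, "c": 2}

# Plain inversion: the later key overwrites the earlier one for value 1.
inverted = {v: k for k, v in data.items()}
print(inverted)

# Workaround sketch: collect every key that maps to the same value.
grouped = defaultdict(list)
for key, value in data.items():
    grouped[value].append(key)
print(dict(grouped))
```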

Key insight: Dict comprehensions are the Pythonic way to transform, filter, and reshape dictionaries. They replace verbose loops like new_dict = {}; for k, v in data.items(): if v > 1: new_dict[k] = v with a single, readable expression.

Expected Output
{'b': 2, 'c': 3, 'd': 4}
{2: 'b', 3: 'c', 4: 'd'}
Hints

Hint 1: Dict comprehension syntax: {key_expr: value_expr for var in iterable if condition}.

Hint 2: To invert a dict, swap keys and values: {v: k for k, v in original.items()}.

#7 Generator Expression with sum, any, all (Medium)
generator-expression, sum, any, all, lazy

Use generator expressions (not list comprehensions) with sum, any, and all to answer questions about a dataset without building intermediate lists.

Python
data = [3, 7, 12, 5, 20, 8, 15, 2, 18, 10]

# Sum of squares of all elements
total = sum(x * x for x in data)
print(total)

# Is any element greater than 15?
has_large = any(x > 15 for x in data)
print(has_large)

# Are all elements greater than 5?
all_above_five = all(x > 5 for x in data)
print(all_above_five)

# Are all elements positive?
all_positive = all(x > 0 for x in data)
print(all_positive)
Solution
data = [3, 7, 12, 5, 20, 8, 15, 2, 18, 10]

total = sum(x * x for x in data)
print(total)

has_large = any(x > 15 for x in data)
print(has_large)

all_above_five = all(x > 5 for x in data)
print(all_above_five)

all_positive = all(x > 0 for x in data)
print(all_positive)

Output:

1344
True
False
True

Why generators here, not list comprehensions?

  • sum(), any(), and all() consume their input in a single pass. They do not need random access or len(). A generator produces one value at a time without allocating a list.
  • any() short-circuits — it stops as soon as it finds a True value. When checking any(x > 15 for x in data), it stops at 20 (index 4) without examining elements 5-9. A list comprehension would evaluate all 10 elements first.
  • all() short-circuits at the first False. For all(x > 5 for x in data), it stops at 3 (index 0).
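The short-circuit behavior described above can be observed by recording which elements are actually pulled from the source. In this sketch, watched is a hypothetical helper generator added only to log consumption:

```python
data = [3, 7, 12, 5, 20, 8, 15, 2, 18, 10]
seen = []


def watched(values):
    """Yield each value, recording it so we can see how far iteration got."""
    for value in values:
        seen.append(value)
        yield value


has_large = any(x > 15 for x in watched(data))
seen_by_any = list(seen)

seen.clear()
all_above_five = all(x > 5 for x in watched(data))
seen_by_all = list(seen)

print(has_large, seen_by_any)
print(all_above_five, seen_by_all)
```

any() stops after reaching 20, and all() stops after the very first element.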

Memory difference: For data of size N:

  • sum([x * x for x in data]) — allocates a list of N integers, then sums it
  • sum(x * x for x in data) — processes one integer at a time, O(1) memory

Key insight: When passing a generator directly as the only argument to a function, the outer parentheses of the function call serve double duty — you do not need an extra pair. sum(x*x for x in data) works; no need for sum((x*x for x in data)).

Expected Output
1344
True
False
True
Hints

Hint 1: Generator expressions use parentheses: (expr for x in iterable).

Hint 2: When passed directly to a function like sum(), you can omit the extra parentheses.

#8 Build a Matrix with Nested Comprehension (Medium)
nested-comprehension, matrix, construction

Use a nested list comprehension to build a 4x4 identity matrix (1 on the diagonal, 0 elsewhere). Print each row on its own line.

Python
n = 4
identity = [[1 if i == j else 0 for j in range(n)] for i in range(n)]

for row in identity:
    print(row)
Solution
n = 4
identity = [[1 if i == j else 0 for j in range(n)] for i in range(n)]

for row in identity:
    print(row)

Output:

[1, 0, 0, 0]
[0, 1, 0, 0]
[0, 0, 1, 0]
[0, 0, 0, 1]

How to read this nested comprehension:

[[1 if i == j else 0 for j in range(n)] for i in range(n)]
 ↑ inner comprehension (builds one row) ↑ outer (iterates over rows)

The outer comprehension iterates i from 0 to 3 (rows). For each row i, the inner comprehension iterates j from 0 to 3 (columns) and produces 1 when i == j (diagonal) or 0 otherwise.

Building vs Flattening — the critical difference:

  • Building (matrix construction): [[expr for j in cols] for i in rows] — the outer comprehension wraps inner ones, creating a list of lists.
  • Flattening (matrix destruction): [x for row in matrix for x in row] — a single comprehension with two for-clauses, producing a flat list.

The bracket placement determines which pattern you get. Building has [[ ]] (nested brackets). Flattening has [ ] with multiple for clauses inside a single bracket pair.
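Both patterns side by side, as a sketch: build the identity matrix with nested brackets, flatten it with two for clauses, then rebuild the rows by slicing. The slicing step is an extra illustration, not part of the exercise.

```python
n = 4

# Building: an inner comprehension per row, wrapped by the outer one.
identity = [[1 if i == j else 0 for j in range(n)] for i in range(n)]

# Flattening: one bracket pair, two for clauses read left to right.
flat = [x for row in identity for x in row]

# Building again, this time from the flat list by slicing n items per row.
rebuilt = [flat[i * n:(i + 1) * n] for i in range(n)]

print(len(flat), sum(flat))
print(rebuilt == identity)
```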

Expected Output
[1, 0, 0, 0]
[0, 1, 0, 0]
[0, 0, 1, 0]
[0, 0, 0, 1]
Hints

Hint 1: For building a matrix, the outer comprehension creates rows, the inner creates columns.

Hint 2: An identity matrix has 1 on the diagonal (where row index equals column index) and 0 elsewhere.

#9 ETL Pipeline with Dict Comprehension (Medium)
dict-comprehension, etl, data-cleaning, real-world

Use a dict comprehension to clean and transform raw API data: strip and lowercase names, cast scores to integers, and include only active users.

Python
raw_records = [
    {"name": "  Alice  ", "score": "92", "active": "true"},
    {"name": "  Bob  ",   "score": "67", "active": "false"},
    {"name": "  Carol  ", "score": "88", "active": "true"},
    {"name": "  Dave  ",  "score": "45", "active": "false"},
]

active_scores = {
    r["name"].strip().lower(): int(r["score"])
    for r in raw_records
    if r["active"] == "true"
}
print(active_scores)

top_user = max(active_scores, key=active_scores.get)
print("Highest:", top_user, "(" + str(active_scores[top_user]) + ")")
Solution
raw_records = [
    {"name": "  Alice  ", "score": "92", "active": "true"},
    {"name": "  Bob  ",   "score": "67", "active": "false"},
    {"name": "  Carol  ", "score": "88", "active": "true"},
    {"name": "  Dave  ",  "score": "45", "active": "false"},
]

active_scores = {
    r["name"].strip().lower(): int(r["score"])
    for r in raw_records
    if r["active"] == "true"
}
print(active_scores)

top_user = max(active_scores, key=active_scores.get)
print("Highest:", top_user, "(" + str(active_scores[top_user]) + ")")

Output:

{'alice': 92, 'carol': 88}
Highest: alice (92)

How it works: The dict comprehension performs three operations in a single pass:

  1. Filter: if r["active"] == "true" excludes inactive users (Bob, Dave)
  2. Transform keys: r["name"].strip().lower() removes whitespace and normalizes to lowercase
  3. Transform values: int(r["score"]) casts string scores to integers

This is a common Extract-Transform-Load (ETL) pattern. The equivalent loop version would be 6-8 lines. The comprehension version is a single expression that clearly communicates intent: "build a name-to-score mapping for active users."
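For comparison, here is a sketch of the loop version that the comprehension replaces, checked against the comprehension on the same records:

```python
raw_records = [
    {"name": "  Alice  ", "score": "92", "active": "true"},
    {"name": "  Bob  ", "score": "67", "active": "false"},
    {"name": "  Carol  ", "score": "88", "active": "true"},
    {"name": "  Dave  ", "score": "45", "active": "false"},
]

# Loop version: filter, clean the key, cast the value, then store.
loop_scores = {}
for record in raw_records:
    if record["active"] != "true":
        continue
    name = record["name"].strip().lower()
    loop_scores[name] = int(record["score"])

# Comprehension version from the exercise.
comp_scores = {
    r["name"].strip().lower(): int(r["score"])
    for r in raw_records
    if r["active"] == "true"
}

print(loop_scores == comp_scores)
```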

Key insight: Dict comprehensions excel at building lookup tables from raw data. When you see yourself writing result = {}; for item in data: result[key] = value, that is a signal to use a dict comprehension instead.

Expected Output
{'alice': 92, 'carol': 88}
Highest: alice (92)
Hints

Hint 1: Chain operations: strip whitespace, lowercase names, cast scores to int, filter by active status.

Hint 2: Use max() with a key argument to find the highest scorer.


Hard

#10 Walrus Operator in Comprehensions (Hard)
walrus-operator, list-comprehension, assignment-expression, optimization

Use the walrus operator (:=) to compute the cube of each number, filter cubes between 5 and 500, and include both the original number and its cube — without computing the cube twice.

Python
numbers = range(1, 10)

# Without walrus, you would compute x**3 twice:
# [(x, x**3) for x in numbers if 5 < x**3 < 500]

# With walrus, compute once and reuse:
results = [(x, cube) for x in numbers if 5 < (cube := x ** 3) < 500]
print(results)
print(len(results))
Solution
numbers = range(1, 10)

results = [(x, cube) for x in numbers if 5 < (cube := x ** 3) < 500]
print(results)
print(len(results))

Output:

[(2, 8), (3, 27), (4, 64), (5, 125), (6, 216), (7, 343)]
6

How the walrus operator works here:

The expression (cube := x ** 3) does two things simultaneously:

  1. Computes x ** 3 and assigns the result to the name cube
  2. Returns the computed value so it can be used in the comparison 5 < ... < 500

Then in the output expression (x, cube), cube already holds the computed value — no need to call x ** 3 again.

Without the walrus operator, you would either:

  • Compute x ** 3 twice: [(x, x**3) for x in numbers if 5 < x**3 < 500]
  • Use a nested comprehension trick: [(x, c) for x in numbers for c in [x**3] if 5 < c < 500]
  • Fall back to an explicit loop

When this matters: If the expression is an expensive function call (database query, API call, complex computation), avoiding the duplicate call is a real performance win, not just a style preference.

Key insight: The walrus operator (Python 3.8+) is most valuable inside comprehensions where you need the same computed value in both the filter (if clause) and the output expression. It keeps the comprehension form viable in cases that would otherwise require falling back to a loop.
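To make the saving concrete, the sketch below counts how often the computation runs with and without the walrus operator. slow_cube is a hypothetical stand-in for an expensive call, and call_log exists only to count invocations. Requires Python 3.8+.

```python
call_log = []


def slow_cube(x):
    """Hypothetical expensive function; records each call so we can count them."""
    call_log.append(x)
    return x ** 3


numbers = range(1, 10)

# Without walrus: called once in the filter, and again for every item kept.
twice = [(x, slow_cube(x)) for x in numbers if 5 < slow_cube(x) < 500]
calls_without_walrus = len(call_log)

call_log.clear()

# With walrus: called once per item, and the result is reused in the output.
once = [(x, cube) for x in numbers if 5 < (cube := slow_cube(x)) < 500]
calls_with_walrus = len(call_log)

print(twice == once)
print(calls_without_walrus, calls_with_walrus)
```

The filter runs for all nine inputs in both versions; the difference is the six extra calls for the items that pass.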

Expected Output
[(2, 8), (3, 27), (4, 64), (5, 125), (6, 216), (7, 343)]
6
Hints

Hint 1: The walrus operator := assigns a value and returns it in a single expression.

Hint 2: Use it to avoid computing the same expensive expression twice (once for the filter, once for the output).

#11 Generator Pipeline: Multi-Stage Data Processing (Hard)
generator-expression, pipeline, lazy-evaluation, memory-efficiency

Build a multi-stage generator pipeline that filters and transforms transaction data lazily. Each stage must be a generator expression — no intermediate lists should be materialized.

Python
def process_transactions(transactions):
    # Stage 1: Remove refunds (negative amounts)
    non_refunds = (t for t in transactions if t["amount_cents"] > 0)

    # Stage 2: Convert cents to dollars
    with_dollars = (
        {"merchant": t["merchant"], "dollars": t["amount_cents"] / 100}
        for t in non_refunds
    )

    # Stage 3: Keep only transactions over $10
    large_only = (
        (t["merchant"], t["dollars"])
        for t in with_dollars
        if t["dollars"] > 10
    )

    return large_only

# Test data
transactions = [
    {"merchant": "Amazon", "amount_cents": 4999},
    {"merchant": "Refund-Store", "amount_cents": -1500},
    {"merchant": "Coffee Shop", "amount_cents": 450},
    {"merchant": "Grocery", "amount_cents": 8732},
    {"merchant": "Gas Station", "amount_cents": 5100},
    {"merchant": "Refund-Online", "amount_cents": -2000},
    {"merchant": "Restaurant", "amount_cents": 3275},
]

for merchant, amount in process_transactions(transactions):
    print(merchant, amount)
Solution
def process_transactions(transactions):
    # Stage 1: Remove refunds (negative amounts)
    non_refunds = (t for t in transactions if t["amount_cents"] > 0)

    # Stage 2: Convert cents to dollars
    with_dollars = (
        {"merchant": t["merchant"], "dollars": t["amount_cents"] / 100}
        for t in non_refunds
    )

    # Stage 3: Keep only transactions over $10
    large_only = (
        (t["merchant"], t["dollars"])
        for t in with_dollars
        if t["dollars"] > 10
    )

    return large_only

transactions = [
    {"merchant": "Amazon", "amount_cents": 4999},
    {"merchant": "Refund-Store", "amount_cents": -1500},
    {"merchant": "Coffee Shop", "amount_cents": 450},
    {"merchant": "Grocery", "amount_cents": 8732},
    {"merchant": "Gas Station", "amount_cents": 5100},
    {"merchant": "Refund-Online", "amount_cents": -2000},
    {"merchant": "Restaurant", "amount_cents": 3275},
]

for merchant, amount in process_transactions(transactions):
    print(merchant, amount)

Output:

Amazon 49.99
Grocery 87.32
Gas Station 51.0
Restaurant 32.75

What happens when the for loop pulls the first item:

  1. large_only asks with_dollars for the next item
  2. with_dollars asks non_refunds for the next item
  3. non_refunds pulls from transactions, gets {"merchant": "Amazon", "amount_cents": 4999} — positive, so it yields it
  4. with_dollars transforms it to {"merchant": "Amazon", "dollars": 49.99} and yields
  5. large_only checks 49.99 > 10 — True, so it yields ("Amazon", 49.99)
  6. The for loop prints it

For "Refund-Store" (amount -1500), Stage 1 rejects it. Stages 2 and 3 never see it. For "Coffee Shop" ($4.50), Stages 1 and 2 process it, but Stage 3 rejects it since $4.50 is not greater than $10.

Memory analysis: At any moment, only one transaction dict flows through the pipeline. If transactions were a file reader yielding millions of rows, the pipeline would still use O(1) memory — each row enters, gets processed or rejected, and is discarded.

Key insight: Generator pipelines are Python's answer to Unix pipes. Each generator is like a filter in cat data | grep | sed | awk. Data flows through stages on demand, with zero intermediate storage. This is the correct pattern for processing large datasets that do not fit in memory.
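The on-demand flow can be observed by logging what the source hands out. In this sketch, source is a hypothetical stand-in for a streaming reader, pulled exists only for logging, and the transaction list is a shortened copy of the test data:

```python
pulled = []


def source(transactions):
    """Hypothetical streaming source; records each merchant as it is pulled."""
    for t in transactions:
        pulled.append(t["merchant"])
        yield t


transactions = [
    {"merchant": "Amazon", "amount_cents": 4999},
    {"merchant": "Refund-Store", "amount_cents": -1500},
    {"merchant": "Coffee Shop", "amount_cents": 450},
    {"merchant": "Grocery", "amount_cents": 8732},
]

# The same three stages as in the exercise, fed from the logging source.
non_refunds = (t for t in source(transactions) if t["amount_cents"] > 0)
with_dollars = (
    {"merchant": t["merchant"], "dollars": t["amount_cents"] / 100}
    for t in non_refunds
)
large_only = (
    (t["merchant"], t["dollars"]) for t in with_dollars if t["dollars"] > 10
)

pulled_before = list(pulled)        # nothing has been consumed yet
first = next(large_only)
pulled_after_first = list(pulled)   # only what was needed for one result
second = next(large_only)
pulled_after_second = list(pulled)  # rejected items were pulled on the way

print(pulled_before)
print(first, pulled_after_first)
print(second, pulled_after_second)
```

Building the pipeline pulls nothing; each next() pulls just enough source items to produce one result.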

def process_transactions(transactions):
    """Build a lazy pipeline that:
    1. Filters out refunds (amount < 0)
    2. Converts amounts from cents to dollars
    3. Filters transactions over $10
    4. Returns (merchant, dollar_amount) tuples
    
    Each stage must be a generator — no intermediate lists.
    """
    # TODO: Implement 3-stage generator pipeline
    pass

# Test data
transactions = [
    {"merchant": "Amazon", "amount_cents": 4999},
    {"merchant": "Refund-Store", "amount_cents": -1500},
    {"merchant": "Coffee Shop", "amount_cents": 450},
    {"merchant": "Grocery", "amount_cents": 8732},
    {"merchant": "Gas Station", "amount_cents": 5100},
    {"merchant": "Refund-Online", "amount_cents": -2000},
    {"merchant": "Restaurant", "amount_cents": 3275},
]

for merchant, amount in process_transactions(transactions):
    print(merchant, amount)
Expected Output
Amazon 49.99
Grocery 87.32
Gas Station 51.0
Restaurant 32.75
Hints

Hint 1: Each stage should be a generator expression that feeds into the next.

Hint 2: Stage 1 filters negatives, Stage 2 transforms cents to dollars, Stage 3 filters by threshold.

Hint 3: No data flows until the final consumer (the for loop) pulls from the pipeline.

#12 Comprehension vs Loop: Refactoring for Performance (Hard)
performance, refactoring, list-comprehension, set-comprehension, dict-comprehension

Refactor three verbose loops into Pythonic comprehensions. The function processes server log entries and extracts error messages, unique sources, and per-source error counts.

Python
from collections import Counter

def analyze_logs(logs):
    error_messages = [
        log["message"].upper()
        for log in logs
        if log["level"] == "ERROR"
    ]

    unique_sources = {log["source"] for log in logs}

    error_counts = dict(Counter(
        log["source"]
        for log in logs
        if log["level"] == "ERROR"
    ))

    return {
        "error_messages": error_messages,
        "unique_sources": unique_sources,
        "error_counts": error_counts,
    }

# Test data
logs = [
    {"level": "INFO",  "source": "web",     "message": "Request received"},
    {"level": "ERROR", "source": "storage", "message": "Disk full"},
    {"level": "WARN",  "source": "network", "message": "High latency"},
    {"level": "ERROR", "source": "network", "message": "Connection timeout"},
    {"level": "INFO",  "source": "web",     "message": "Response sent"},
    {"level": "ERROR", "source": "storage", "message": "Disk failure"},
]

result = analyze_logs(logs)
print("Errors:", result["error_messages"])
print("Sources:", len(result["unique_sources"]))
print("Counts:", result["error_counts"])
Solution
from collections import Counter

def analyze_logs(logs):
    error_messages = [
        log["message"].upper()
        for log in logs
        if log["level"] == "ERROR"
    ]

    unique_sources = {log["source"] for log in logs}

    error_counts = dict(Counter(
        log["source"]
        for log in logs
        if log["level"] == "ERROR"
    ))

    return {
        "error_messages": error_messages,
        "unique_sources": unique_sources,
        "error_counts": error_counts,
    }

logs = [
    {"level": "INFO",  "source": "web",     "message": "Request received"},
    {"level": "ERROR", "source": "storage", "message": "Disk full"},
    {"level": "WARN",  "source": "network", "message": "High latency"},
    {"level": "ERROR", "source": "network", "message": "Connection timeout"},
    {"level": "INFO",  "source": "web",     "message": "Response sent"},
    {"level": "ERROR", "source": "storage", "message": "Disk failure"},
]

result = analyze_logs(logs)
print("Errors:", result["error_messages"])
print("Sources:", len(result["unique_sources"]))
print("Counts:", result["error_counts"])

Output:

Errors: ['DISK FULL', 'CONNECTION TIMEOUT', 'DISK FAILURE']
Sources: 3
Counts: {'storage': 2, 'network': 1}

Refactoring breakdown:

1. List comprehension (filter + transform):

# Before: 4 lines with manual append
error_messages = []
for log in logs:
    if log["level"] == "ERROR":
        error_messages.append(log["message"].upper())

# After: 1 expression — filter with trailing if, transform with .upper()
error_messages = [log["message"].upper() for log in logs if log["level"] == "ERROR"]

2. Set comprehension (deduplication):

# Before: 3 lines with manual add
unique_sources = set()
for log in logs:
    unique_sources.add(log["source"])

# After: 1 expression — set automatically deduplicates
unique_sources = {log["source"] for log in logs}

3. Counter with generator expression (aggregation):

# Before: 7 lines with manual counting
error_counts = {}
for log in logs:
    if log["level"] == "ERROR":
        src = log["source"]
        if src not in error_counts:
            error_counts[src] = 0
        error_counts[src] += 1

# After: Counter + generator expression — one pass, automatic counting
error_counts = dict(Counter(log["source"] for log in logs if log["level"] == "ERROR"))

Performance notes: In CPython, the list and set comprehensions use the LIST_APPEND and SET_ADD bytecodes instead of an attribute lookup plus a method call on every iteration, which typically gives a speedup of roughly 25-35% on the iteration itself, depending on the Python version and workload. Counter accepts a generator expression, so no intermediate list is created for the counting step.

Key insight: The three patterns used here (a list comprehension, a set comprehension, and a generator expression fed to Counter to build a dict) cover the vast majority of data transformation patterns. When you see a loop that builds a collection with append, add, or key assignment, it is almost always a refactoring candidate for a comprehension.
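Counting can also be written as a true dict comprehension, at the cost of an extra pass per source. The sketch below compares it with the Counter version on the exercise's log data; treat it as an illustration of the trade-off rather than a recommended replacement.

```python
from collections import Counter

logs = [
    {"level": "INFO", "source": "web", "message": "Request received"},
    {"level": "ERROR", "source": "storage", "message": "Disk full"},
    {"level": "WARN", "source": "network", "message": "High latency"},
    {"level": "ERROR", "source": "network", "message": "Connection timeout"},
    {"level": "INFO", "source": "web", "message": "Response sent"},
    {"level": "ERROR", "source": "storage", "message": "Disk failure"},
]

# Counter version from the solution: a single pass over the logs.
via_counter = dict(
    Counter(log["source"] for log in logs if log["level"] == "ERROR")
)

# Dict comprehension version: list.count() rescans the list for each source,
# so this is simple to read but slower when there are many distinct sources.
error_sources = [log["source"] for log in logs if log["level"] == "ERROR"]
via_dict_comp = {src: error_sources.count(src) for src in set(error_sources)}

print(via_counter == via_dict_comp)
```

The two dicts compare equal even though key order may differ, since dict equality ignores insertion order.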

def analyze_logs_slow(logs):
    """Slow version using loops. Refactor to comprehensions.
    
    Given log entries, return a dict with:
    - 'error_messages': list of messages from ERROR logs (uppercase)
    - 'unique_sources': set of all unique source names
    - 'error_counts': dict mapping each source to its error count
    """
    # TODO: Refactor these three loops into comprehensions
    error_messages = []
    for log in logs:
        if log["level"] == "ERROR":
            error_messages.append(log["message"].upper())

    unique_sources = set()
    for log in logs:
        unique_sources.add(log["source"])

    error_counts = {}
    for log in logs:
        if log["level"] == "ERROR":
            src = log["source"]
            if src not in error_counts:
                error_counts[src] = 0
            error_counts[src] += 1

    return {
        "error_messages": error_messages,
        "unique_sources": unique_sources,
        "error_counts": error_counts,
    }
Expected Output
Errors: ['DISK FULL', 'CONNECTION TIMEOUT', 'DISK FAILURE']
Sources: 3
Counts: {'storage': 2, 'network': 1}
Hints

Hint 1: error_messages can be a list comprehension with a filter.

Hint 2: unique_sources can be a set comprehension.

Hint 3: error_counts needs collections.Counter or a dict comprehension over grouped data.

© 2026 EngineersOfAI. All rights reserved.