12.2. Generators

Learning goals — By the end of this section you will be able to:

  • Write generator functions using yield to produce sequences lazily

  • Explain why generators are memory-efficient compared to lists

  • Create generator expressions as a one-line alternative to generator functions

  • Use generators for infinite sequences (e.g., Fibonacci) and large-file processing

  • Chain generators into efficient data pipelines

12.2.1. Generator Functions

A generator function looks like a regular function but uses yield instead of return. Each call to next() runs the function body until the next yield, then pauses — preserving all local variables.

def my_gen():
    yield 1   # pauses here on first next()
    yield 2   # pauses here on second next()
    yield 3   # pauses here on third next()
    # function returns → StopIteration is raised automatically

Key differences from a regular function:

  • Calling a generator function returns a generator object — it does not execute the body

  • Each next() resumes from where it left off

  • Values are produced lazily — only when requested

def count_up_to(max_count):
    """Yields integers 1, 2, ..., max_count."""
    n = 1
    while n <= max_count:
        yield n
        n += 1

# Calling the function returns a generator object — nothing executes yet
gen = count_up_to(5)
print(type(gen))          # <class 'generator'>

# Values are produced one at a time
print(next(gen))          # 1
print(next(gen))          # 2

# for loops consume the remaining values
for val in gen:
    print(val, end=' ')   # 3 4 5
print()

# A fresh generator starts from the beginning
print(list(count_up_to(5)))   # [1, 2, 3, 4, 5]
<class 'generator'>
1
2
3 4 5 
[1, 2, 3, 4, 5]
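
The pause-and-resume behaviour described above can be made visible by recording when the body actually runs. The following is a minimal sketch; `traced` and its `log` parameter are hypothetical names introduced only for this illustration.

```python
def traced(log):
    """Hypothetical generator that records each step of its own execution."""
    log.append("body started")
    yield "first"
    log.append("resumed after first yield")
    yield "second"
    log.append("body finished")

log = []
gen = traced(log)
log_before_next = list(log)   # still empty: calling traced() ran none of the body

value_1 = next(gen)           # runs up to the first yield
value_2 = next(gen)           # resumes and runs up to the second yield

try:
    next(gen)                 # body runs to the end, so StopIteration is raised
    exhausted = False
except StopIteration:
    exhausted = True

leftover = list(gen)          # an exhausted generator produces nothing more

print(log_before_next)        # []
print(value_1, value_2)       # first second
print(exhausted, leftover)    # True []
print(log)
```

A for loop catches StopIteration for you, which is why the loop in the example above simply ends after the last value.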

12.2.2. Generator Expressions

A generator expression has the same syntax as a list comprehension, but uses () instead of []. The result is a generator, not a list — values are computed on demand.

# List comprehension — builds the entire list in memory immediately
squares_list = [x ** 2 for x in range(1_000_000)]   # uses ~8 MB

# Generator expression — produces values one at a time
squares_gen  = (x ** 2 for x in range(1_000_000))   # uses ~200 bytes

Use a generator expression when you iterate through the result only once and do not need len(), indexing, or a second pass over the values.

import sys

# Compare memory: list vs generator
# (sys.getsizeof counts only the container itself, not the int objects a list refers to)
n = 100_000
squares_list = [x ** 2 for x in range(n)]
squares_gen  = (x ** 2 for x in range(n))

print(f"List size:      {sys.getsizeof(squares_list):,} bytes")
print(f"Generator size: {sys.getsizeof(squares_gen):,} bytes")

# Generator expressions compose well with sum(), max(), any(), etc.
total       = sum(x ** 2 for x in range(1001))              # no intermediate list
first_mult7 = next(x for x in range(1, 100) if x % 7 == 0)  # 7
print(f"Sum of squares 0-1000: {total}")
print(f"First multiple of 7:   {first_mult7}")
List size:      800,984 bytes
Generator size: 200 bytes
Sum of squares 0-1000: 333833500
First multiple of 7:   7
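
The advice to use a generator expression only for single-pass iteration follows from the fact that a generator is consumed as it is read. A short sketch (the variable names are illustrative):

```python
squares = (x ** 2 for x in range(10))

first_pass = sum(squares)    # consumes every value
second_pass = sum(squares)   # nothing left, so the sum of no values is 0

print(first_pass)    # 285
print(second_pass)   # 0

# If the values are needed more than once, materialize them (or rebuild the generator).
squares_list = [x ** 2 for x in range(10)]
print(sum(squares_list), sum(squares_list))   # 285 285
```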

12.2.3. Practical Use Cases

12.2.3.1. Infinite sequences

Generators can represent sequences that have no end — you just take as many values as you need.

import itertools

def fibonacci():
    """Infinite Fibonacci sequence generator."""
    a, b = 0, 1
    while True:
        yield a
        a, b = b, a + b

# Take only the values you need — never computes the rest
fib = fibonacci()
first_10 = list(itertools.islice(fib, 10))
print("First 10 Fibonacci:", first_10)  # [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]

# Find the first Fibonacci number > 1000
fib = fibonacci()
big = next(f for f in fib if f > 1000)
print("First Fibonacci > 1000:", big)   # 1597
First 10 Fibonacci: [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
First Fibonacci > 1000: 1597
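
One caution with infinite generators: anything that tries to consume them completely, such as list(fibonacci()), never finishes. Besides islice, itertools.takewhile is one way to stop on a condition rather than a count. The sketch below redefines fibonacci() so it runs on its own.

```python
import itertools

def fibonacci():
    """Infinite Fibonacci sequence generator (same as above)."""
    a, b = 0, 1
    while True:
        yield a
        a, b = b, a + b

# takewhile stops at the first value that fails the test, so the loop terminates
below_100 = list(itertools.takewhile(lambda f: f < 100, fibonacci()))
print(below_100)   # [0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89]
```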
### Exercise: Running Total Generator
#   1. Write a generator function `running_total(numbers)` that yields the
#      cumulative sum after each element.
#      Example: running_total([1, 4, 2, 8]) yields 1, 5, 7, 15.
#   2. Write a generator expression `even_squares_gen` that yields the squares
#      of even numbers from 0 to 19 (i.e., 0, 4, 16, 36, 64, 100, ..., 324).
### Your code starts here.




### Your code ends here.
### Solution
def running_total(numbers):
    total = 0
    for n in numbers:
        total += n
        yield total

print(list(running_total([1, 4, 2, 8])))   # [1, 5, 7, 15]
print(list(running_total([10, -3, 7])))    # [10, 7, 14]

even_squares_gen = (x ** 2 for x in range(20) if x % 2 == 0)
print(list(even_squares_gen))   # [0, 4, 16, 36, 64, 100, 144, 196, 256, 324]
[1, 5, 7, 15]
[10, 7, 14]
[0, 4, 16, 36, 64, 100, 144, 196, 256, 324]
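
The learning goals also mention large-file processing and chaining generators into pipelines. The following is a minimal sketch of that idea, not a definitive implementation: the stage names (`read_lines`, `only_errors`, `message_of`) and the log format are assumptions made up for this example, and an in-memory io.StringIO stands in for a real file so the code is self-contained.

```python
import io

def read_lines(handle):
    """Stage 1: yield one line at a time, without loading the whole file."""
    for line in handle:
        yield line.rstrip("\n")

def only_errors(lines):
    """Stage 2: keep only lines that contain the word ERROR."""
    return (line for line in lines if "ERROR" in line)

def message_of(lines):
    """Stage 3: drop the leading level word and keep the message."""
    for line in lines:
        yield line.split(" ", 1)[1]

# Stand-in for open("some_large_file.log"); the contents are invented
fake_log = io.StringIO("INFO start\nERROR disk full\nINFO retry\nERROR timeout\n")

# Each stage pulls one item at a time from the stage before it
pipeline = message_of(only_errors(read_lines(fake_log)))
errors = list(pipeline)
print(errors)   # ['disk full', 'timeout']
```

Because every stage is lazy, only one line is held in memory at a time until the final list() call collects the results.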

12.2.4. yield from

yield from delegates yielding to another iterable (including another generator). It is essentially a shorthand for for item in sub: yield item, but also properly forwards send() and throw() calls to the sub-generator.

def flatten(nested):
    """Recursively flatten a nested list structure."""
    for item in nested:
        if isinstance(item, list):
            yield from flatten(item)  # delegate to recursive call
        else:
            yield item

data = [1, [2, 3], [4, [5, 6]], 7]
print(list(flatten(data)))   # [1, 2, 3, 4, 5, 6, 7]
[1, 2, 3, 4, 5, 6, 7]
# yield from works with any iterable, not only generators
def combined(*iterables):
    for it in iterables:
        yield from it

result = list(combined([1, 2], 'ab', range(3)))
print(result)   # [1, 2, 'a', 'b', 0, 1, 2]
[1, 2, 'a', 'b', 0, 1, 2]
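
The claim that yield from forwards send() can be checked directly. In this sketch, `running_mean` and `delegator` are hypothetical names; the sub-generator receives numbers through send(), and its return value becomes the value of the yield from expression, which a plain for loop would not give you.

```python
def running_mean():
    """Sub-generator: receives numbers via send() and yields the mean so far."""
    total, count, mean = 0.0, 0, None
    while True:
        value = yield mean
        if value is None:        # None is used here as a stop signal
            return mean          # becomes the value of `yield from`
        total += value
        count += 1
        mean = total / count

def delegator(results):
    """Outer generator: send() calls pass straight through to running_mean()."""
    final = yield from running_mean()
    results.append(final)

results = []
gen = delegator(results)
next(gen)                  # prime: run to the first yield inside running_mean()
first = gen.send(10)       # forwarded to the sub-generator
second = gen.send(20)

try:
    gen.send(None)         # sub-generator returns, then delegator finishes
except StopIteration:
    pass

print(first, second, results)   # 10.0 15.0 [15.0]
```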

12.2.5. More itertools

The itertools module provides iterator building-blocks. Beyond islice and chain (already covered), four more are especially useful in data processing.

| Function            | Description                       | Example                           |
|---------------------|-----------------------------------|-----------------------------------|
| count(start, step)  | Infinite counter                  | count(1, 2) → 1, 3, 5, …          |
| cycle(iterable)     | Repeats iterable endlessly        | cycle('AB') → A, B, A, B, …       |
| combinations(it, r) | All r-length combos (no repeat)   | combinations('ABC', 2) → AB AC BC |
| permutations(it, r) | All ordered r-length arrangements | permutations('AB', 2) → AB BA     |

import itertools

# count: infinite arithmetic sequence
evens = itertools.islice(itertools.count(0, 2), 5)
print('First 5 even numbers:', list(evens))  # [0, 2, 4, 6, 8]

# cycle: round-robin scheduling
servers = ['A', 'B', 'C']
requests = list(itertools.islice(itertools.cycle(servers), 7))
print('Request routing:', requests)   # ['A', 'B', 'C', 'A', 'B', 'C', 'A']
First 5 even numbers: [0, 2, 4, 6, 8]
Request routing: ['A', 'B', 'C', 'A', 'B', 'C', 'A']
import itertools

# combinations: choose without caring about order
teams = list(itertools.combinations(['Alice', 'Bob', 'Carol'], 2))
print('Possible pairs:', teams)
# [('Alice', 'Bob'), ('Alice', 'Carol'), ('Bob', 'Carol')]

# permutations: order matters (e.g. race positions)
podium = list(itertools.permutations(['Gold', 'Silver', 'Bronze'], 2))
print(f'Top-2 orderings: {len(podium)} total')
print(podium[:3])   # first 3 arrangements
Possible pairs: [('Alice', 'Bob'), ('Alice', 'Carol'), ('Bob', 'Carol')]
Top-2 orderings: 6 total
[('Gold', 'Silver'), ('Gold', 'Bronze'), ('Silver', 'Gold')]
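
The counts printed above can be cross-checked against the standard formulas, available as math.comb and math.perm (Python 3.8 and later). The sketch also shows that combinations() is itself lazy, so taking a few results from a large space is cheap. The variable names are illustrative.

```python
import itertools
import math

names = ['Alice', 'Bob', 'Carol']
pairs = list(itertools.combinations(names, 2))
orderings = list(itertools.permutations(names, 2))

print(len(pairs), math.comb(3, 2))       # 3 3
print(len(orderings), math.perm(3, 2))   # 6 6

# combinations() returns an iterator: nothing is computed until values are requested
huge = itertools.combinations(range(50), 5)    # math.comb(50, 5) possible results
first_two = list(itertools.islice(huge, 2))
print(first_two)   # [(0, 1, 2, 3, 4), (0, 1, 2, 3, 5)]
```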
### Exercise: itertools
#   1. Use itertools.count() and islice() to generate the first 8
#      multiples of 7 (7, 14, 21, ..., 56).
#   2. Use itertools.cycle() and islice() to simulate dealing cards
#      to 4 players, 3 cards each (total 12 cards from an infinite deck).
#      Players are 'North', 'East', 'South', 'West'.
#   3. Use itertools.combinations() to find all 3-letter combinations
#      from 'ABCDE' and count them.
import itertools

### Your code starts here.



### Your code ends here.
### Solution
import itertools

# 1. First 8 multiples of 7
multiples = list(itertools.islice(itertools.count(7, 7), 8))
print(multiples)   # [7, 14, 21, 28, 35, 42, 49, 56]

# 2. Deal 3 cards each to 4 players
players = itertools.cycle(['North', 'East', 'South', 'West'])
cards   = range(1, 13)  # card values 1-12
deal    = list(zip(itertools.islice(players, 12), cards))
print(deal)
# [('North', 1), ('East', 2), ('South', 3), ('West', 4),
#  ('North', 5), ('East', 6), ...]

# 3. Count 3-letter combinations from 'ABCDE'
combos = list(itertools.combinations('ABCDE', 3))
print(f'{len(combos)} combinations')   # 10
print(combos)
[7, 14, 21, 28, 35, 42, 49, 56]
[('North', 1), ('East', 2), ('South', 3), ('West', 4), ('North', 5), ('East', 6), ('South', 7), ('West', 8), ('North', 9), ('East', 10), ('South', 11), ('West', 12)]
10 combinations
[('A', 'B', 'C'), ('A', 'B', 'D'), ('A', 'B', 'E'), ('A', 'C', 'D'), ('A', 'C', 'E'), ('A', 'D', 'E'), ('B', 'C', 'D'), ('B', 'C', 'E'), ('B', 'D', 'E'), ('C', 'D', 'E')]

12.2.6. Bridge: Async Iteration

Synchronous generators pause at yield and resume when next() is called — all within a single thread. Async generators do the same thing, but they can also pause to wait for I/O (a network response, a database query, a file read) without blocking other work.

The pattern mirrors what you’ve seen, but uses async def with yield, and requires async for to consume:

import asyncio

async def async_count(n):
    """Yields 0..n-1, simulating a pause between each."""
    for i in range(n):
        await asyncio.sleep(0)   # yields control to the event loop
        yield i

async def main():
    async for value in async_count(5):
        print(value)

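# In a notebook that already runs an event loop, use `await main()` instead.
asyncio.run(main())   # prints 0, 1, 2, 3, 4, one per line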
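
To see what "without blocking other work" means in practice, the sketch below runs two consumers of an async generator concurrently with asyncio.gather. The names `ticker` and `consume` are hypothetical, and asyncio.sleep(0) stands in for real I/O. As a plain script this runs as written; in a notebook with a running event loop, replace the asyncio.run(...) call with `await main()`.

```python
import asyncio

async def ticker(name, n):
    """Async generator: yields name0, name1, ... with a pause before each."""
    for i in range(n):
        await asyncio.sleep(0)      # hands control back to the event loop
        yield f"{name}{i}"

async def consume(name, n, log):
    """Drain one ticker, appending each item to a shared log."""
    async for item in ticker(name, n):
        log.append(item)

async def main():
    log = []
    # Both consumers make progress in turn instead of one finishing first
    await asyncio.gather(consume("a", 3, log), consume("b", 3, log))
    return log

log = asyncio.run(main())
print(log)   # items from "a" and "b" are typically interleaved
```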

Key vocabulary for future study:

| Term       | Meaning                                                         |
|------------|-----------------------------------------------------------------|
| async def  | Declares a coroutine (or async generator if it uses yield)      |
| await      | Pauses the coroutine until the awaited task is ready            |
| async for  | Iterates over an async iterable; releases control between items |
| Event loop | The scheduler that runs coroutines concurrently                 |

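You don’t need to master asyncio now. The takeaway: the ideas you learned here (the iterator protocol with __iter__/__next__, generators, and yield) carry over directly. Async iteration uses the parallel methods __aiter__/__anext__, and an async generator pauses and resumes at yield just as a synchronous one does.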

12.2.7. Summary

| Concept              | Key idea                                                                               |
|----------------------|----------------------------------------------------------------------------------------|
| Generator function   | Uses yield; returns a generator object; body runs lazily                               |
| yield                | Pauses function, returns a value; resumes on next next() call                          |
| Generator expression | (expr for item in seq if cond): lazy, memory-efficient                                 |
| Infinite generator   | Uses while True: yield ...; take values with itertools.islice or next()                |
| Memory efficiency    | Generators produce one value at a time; constant memory regardless of sequence length  |
| itertools.islice     | Takes the first N values from any iterator                                             |