9.3. Advanced OOP Topics

import sys
from pathlib import Path

current = Path.cwd()
for parent in [current, *current.parents]:
    if (parent / '_config.yml').exists():
        project_root = parent
        break
else:
    project_root = Path.cwd().parent.parent

sys.path.insert(0, str(project_root))

from shared import thinkpython, diagram, jupyturtle, download

sys.modules['thinkpython'] = thinkpython
sys.modules['diagram'] = diagram
sys.modules['jupyturtle'] = jupyturtle
sys.modules['download'] = download

This notebook covers six topics that extend the core OOP material:

| Topic | What it adds |
|---|---|
| Comparison dunder methods | Value equality, ordering, and hash behavior for custom objects |
| Operator overloading | Making custom objects work with operators like + and * |
| @dataclass in depth | Auto-generated __init__, __repr__, __eq__, ordering, frozen instances |
| namedtuple | Lightweight immutable records with named fields |
| Class vs. instance variables | A common source of bugs, explained clearly |
| Static and class methods | Utility behavior and alternative constructors |

9.3.1. Comparison Dunder Methods

By default, == on a custom object tests identity (same as is), not value equality. Defining __eq__ changes that behavior to value comparison.

class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __repr__(self):
        return f'Point({self.x}, {self.y})'

p1 = Point(1, 2)
p2 = Point(1, 2)

print(p1 == p2)   # False — identity check by default
print(p1 is p2)   # False
False
False

Adding __eq__ makes == compare by value instead.

class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __repr__(self):
        return f'Point({self.x}, {self.y})'

    def __eq__(self, other):
        if not isinstance(other, Point):
            return NotImplemented
        return self.x == other.x and self.y == other.y

p1 = Point(1, 2)
p2 = Point(1, 2)
p3 = Point(3, 4)

print(p1 == p2)   # now True
print(p1 is p2)   # still False
print(p1 == p3)   # False

# print(p1)
True
False
False

9.3.1.1. Ordering with __lt__

To support sorting (sorted(), min(), max()), define __lt__ (less than). You don’t need to write all six comparison methods by hand. The @functools.total_ordering decorator derives the missing ordering methods from two definitions you supply:

  1. __eq__: the decorator never derives it (equality has different semantics from ordering and is deliberately left to you). If you omit it, the class silently falls back to the inherited identity-based ==, so always define it.

  2. Any one of __lt__, __le__, __gt__, or __ge__: the decorator fills in the remaining three.

| You define | Decorator derives |
|---|---|
| __eq__ | (nothing; you must always supply this yourself) |
| one of __lt__ / __le__ / __gt__ / __ge__ | the other three ordering methods |

Using __lt__ is conventional: < reads naturally as “is a less than b?”, which maps cleanly onto sort order, but any of the four works.

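For example, if you define __lt__, the decorator computes a >= b as not (a < b), a <= b as (a < b) or (a == b), and a > b as not (a < b) and (a != b). The last two rely on your __eq__, which is why equality and ordering should agree with each other.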

Performance note: because the derived methods add an extra function call layer, @total_ordering is marginally slower than writing all six methods explicitly. The difference is negligible in typical code; only consider hand-writing them if profiling shows comparison is a bottleneck in a tight inner loop.

from functools import total_ordering
import math

@total_ordering
class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __repr__(self):
        return f'Point({self.x}, {self.y})'

    def __eq__(self, other):
        if not isinstance(other, Point):
            return NotImplemented
        return self.x == other.x and self.y == other.y

    def __lt__(self, other):
        """Sort by distance from the origin."""
        if not isinstance(other, Point):
            return NotImplemented
        return math.hypot(self.x, self.y) < math.hypot(other.x, other.y)

points = [Point(3, 4), Point(1, 1), Point(0, 2)]
print(sorted(points))   # sorted by distance from origin
print(min(points))

# __gt__ is derived automatically — no extra code needed
p_near = Point(1, 1)    # distance ≈ 1.41
p_far  = Point(3, 4)    # distance = 5.0

print(p_far > p_near)   # True  — total_ordering derives __gt__ from __lt__
print(p_near > p_far)   # False
[Point(1, 1), Point(0, 2), Point(3, 4)]
Point(1, 1)
True
False
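As a quick check of what the decorator supplies, the sketch below uses a hypothetical Version class (not one of this chapter's examples) that defines only __eq__ and __lt__ and then exercises all six comparison operators. Both methods compare the same key tuple, which is an assumption made here to keep equality and ordering consistent.

```python
from functools import total_ordering

@total_ordering
class Version:
    """Hypothetical example class: a (major, minor) version number."""

    def __init__(self, major, minor):
        self.major = major
        self.minor = minor

    def _key(self):
        # Single source of truth for both equality and ordering.
        return (self.major, self.minor)

    def __eq__(self, other):
        if not isinstance(other, Version):
            return NotImplemented
        return self._key() == other._key()

    def __lt__(self, other):
        if not isinstance(other, Version):
            return NotImplemented
        return self._key() < other._key()

v1 = Version(1, 2)
v2 = Version(1, 10)

# Only __eq__ and __lt__ are hand-written; <=, >, >= come from the decorator,
# and != is the default inversion of __eq__.
results = {
    '==': v1 == v2,
    '!=': v1 != v2,
    '<':  v1 < v2,
    '<=': v1 <= v2,
    '>':  v1 > v2,
    '>=': v1 >= v2,
}
print(results)
```

In the Point example, __eq__ compares coordinates while __lt__ compares distances, so two different points at the same distance (such as Point(3, 4) and Point(4, 3)) are neither equal nor less than one another. Sharing one key, as in this sketch, avoids that mismatch.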

9.3.1.2. Hashability and __hash__

When a class defines __eq__ without also defining __hash__, Python sets its __hash__ to None, making instances unhashable (they cannot be stored in sets or used as dict keys). Define __hash__ explicitly to restore that ability.

# Rule of thumb: objects that compare equal must have the same hash.
def __hash__(self):
    return hash((self.x, self.y))
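To see the default behavior described at the start of this subsection, the following sketch defines a hypothetical PlainPoint class with __eq__ but no __hash__ and then tries to put an instance in a set. The class name and the `raised` flag are illustrative only.

```python
class PlainPoint:
    """Hypothetical class: defines __eq__ but not __hash__."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __eq__(self, other):
        if not isinstance(other, PlainPoint):
            return NotImplemented
        return self.x == other.x and self.y == other.y

# Defining __eq__ alone replaced the inherited __hash__ with None.
print(PlainPoint.__hash__)

try:
    {PlainPoint(1, 2)}          # building a set needs a hash
    raised = False
except TypeError as e:
    raised = True
    print(type(e).__name__, e)
```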

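If your object is mutable, do not define __hash__: a hash-based container files each object under the hash it had at insertion time, so changing a field that feeds __eq__ and __hash__ afterwards can make the object impossible to find. The Point class below defines __hash__ on the understanding that its coordinates are not reassigned after creation.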

from functools import total_ordering
import math

@total_ordering
class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __repr__(self):
        return f'Point({self.x}, {self.y})'

    def __eq__(self, other):
        if not isinstance(other, Point):
            return NotImplemented
        return self.x == other.x and self.y == other.y

    def __lt__(self, other):
        if not isinstance(other, Point):
            return NotImplemented
        return math.hypot(self.x, self.y) < math.hypot(other.x, other.y)

    def __hash__(self):
        return hash((self.x, self.y))

p1 = Point(1, 2)
p2 = Point(1, 2)
p3 = Point(3, 4)

point_set = {p1, p2, p3}
print(point_set)          # p1 and p2 are equal — only one appears

lookup = {p1: 'origin-ish', p3: 'far'}
print(lookup[p2])         # works because p2 == p1 and hash(p2) == hash(p1)
{Point(1, 2), Point(3, 4)}
origin-ish
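The warning about hashing mutable objects can be made concrete. The sketch below uses a hypothetical MutablePoint class that is both hashable and mutable, inserts an instance into a set, and then changes one coordinate. It describes CPython's behavior, where the set remembers the hash computed at insertion time.

```python
class MutablePoint:
    """Hypothetical class: hashable *and* mutable, the risky combination."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __eq__(self, other):
        if not isinstance(other, MutablePoint):
            return NotImplemented
        return (self.x, self.y) == (other.x, other.y)

    def __hash__(self):
        return hash((self.x, self.y))

p = MutablePoint(1, 2)
seen = {p}
found_before = p in seen    # looked up under hash((1, 2))

p.x = 100                   # the object's hash is now hash((100, 2))
found_after = p in seen     # the set still files it under the old hash

print(found_before, found_after)
print(len(seen))            # the object is still stored, just unreachable by lookup
```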

9.3.2. Operator Overloading

By defining special methods, you can control how Python operators behave on your own types. Most operators, and a few built-in functions such as len(), have a corresponding dunder method:

| Operator or built-in | Method | Example |
|---|---|---|
| + | __add__ | a + b |
| == | __eq__ | a == b |
| < | __lt__ | a < b |
| len() | __len__ | len(a) |

This section focuses on arithmetic operators; the comparison methods (__eq__, __lt__) and __hash__ are covered above.

Here is __add__ defined on BankAccount. When two accounts are added together, a new merged account is returned:

class BankAccount:
    """BankAccount with operator overloading."""

    def __init__(self, owner="Unknown", balance=0.0):
        self.owner = owner
        self._balance = balance

    @property
    def balance(self):
        return self._balance

    def __str__(self):
        return f"BankAccount(owner={self.owner}, balance=${self._balance:.2f})"

    def __repr__(self):
        return f"BankAccount('{self.owner}', {self._balance})"

    def __add__(self, other):
        """Merge two accounts into one."""
        if not isinstance(other, BankAccount):
            return NotImplemented
        merged_owner = f"{self.owner}&{other.owner}"
        merged_balance = self._balance + other._balance
        return BankAccount(merged_owner, merged_balance)

a = BankAccount("Ava", 12.00)
b = BankAccount("Ben", 8.00)
print(a + b)   # BankAccount(owner=Ava&Ben, balance=$20.00)
BankAccount(owner=Ava&Ben, balance=$20.00)

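When Python evaluates a + b, it calls a.__add__(b) automatically. If that call returns NotImplemented, Python next tries the reflected method b.__radd__(a), and raises TypeError only if that fails too. Changing the behavior of an operator so that it works with programmer-defined types is called operator overloading.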
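The same pattern extends to other arithmetic operators. As a minimal sketch, the hypothetical Vec2 class below (not part of the BankAccount example) overloads + and *, and adds __rmul__ so that a number can appear on the left-hand side as well.

```python
class Vec2:
    """Hypothetical 2-D vector used only to illustrate + and *."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __repr__(self):
        return f'Vec2({self.x}, {self.y})'

    def __eq__(self, other):
        if not isinstance(other, Vec2):
            return NotImplemented
        return (self.x, self.y) == (other.x, other.y)

    def __add__(self, other):
        # Vec2 + Vec2: component-wise sum
        if not isinstance(other, Vec2):
            return NotImplemented
        return Vec2(self.x + other.x, self.y + other.y)

    def __mul__(self, k):
        # Vec2 * number: scale both components
        if not isinstance(k, (int, float)):
            return NotImplemented
        return Vec2(self.x * k, self.y * k)

    def __rmul__(self, k):
        # number * Vec2: reached after the number's own __mul__
        # returns NotImplemented for a Vec2 operand
        return self.__mul__(k)

v = Vec2(1, 2)
print(v + Vec2(3, 4))   # Vec2(4, 6)
print(v * 3)            # Vec2(3, 6)
print(3 * v)            # Vec2(3, 6)
```

Returning NotImplemented for unsupported operand types, rather than raising, is what gives the other operand a chance to handle the operation.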

9.3.3. @dataclass

@dataclass is a decorator from the standard library that auto-generates common dunder methods (__init__, __repr__, __eq__) for a class based on its annotated fields, eliminating boilerplate.

Dataclasses give type annotations a second job. In a regular class, name: str is mostly a hint for readers and tools. In a dataclass, an annotated class variable also becomes a field that @dataclass uses to generate __init__, __repr__, and comparison behavior.
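To make the boilerplate saving concrete, the sketch below puts a hand-written class next to a dataclass with the same two fields. Both class names (PointManual, PointDC) are hypothetical, and the hand-written methods are only an approximation of what @dataclass generates.

```python
from dataclasses import dataclass, fields

class PointManual:
    """Hand-written version (hypothetical, for comparison only)."""

    def __init__(self, x: int, y: int):
        self.x = x
        self.y = y

    def __repr__(self):
        return f'PointManual(x={self.x!r}, y={self.y!r})'

    def __eq__(self, other):
        if other.__class__ is not self.__class__:
            return NotImplemented
        return (self.x, self.y) == (other.x, other.y)

@dataclass
class PointDC:
    # The annotations below are what turn x and y into dataclass fields.
    x: int
    y: int

print(PointManual(1, 2), PointManual(1, 2) == PointManual(1, 2))
print(PointDC(1, 2), PointDC(1, 2) == PointDC(1, 2))

# fields() lists what the decorator collected from the annotations
print([f.name for f in fields(PointDC)])   # ['x', 'y']
```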

Here we look at its most useful options.

| Option | Effect |
|---|---|
| eq=True (default) | Auto-generates __eq__ based on fields |
| order=True | Also generates __lt__, __le__, __gt__, __ge__ |
| frozen=True | Blocks attribute assignment on instances; together with the default eq=True, also generates __hash__ |
| field(default_factory=...) | Safe default for mutable fields like lists |

When order=True is set, instances are compared as tuples of their fields in declaration order. In the example below that means name first, then gpa, then courses, so sorted([s1, s2]) places Alice before Bob because 'Alice' < 'Bob' alphabetically.

from dataclasses import dataclass, field

@dataclass(order=True)
class Student:
    name: str
    gpa: float
    courses: list[str] = field(default_factory=list)   # safe mutable default

s1 = Student('Alice', 3.8)
s2 = Student('Bob', 3.5)
s3 = Student('Alice', 3.8)

print(s1 == s3)          # True — same field values
print(sorted([s1, s2]))  # sorted lexicographically by (name, gpa, courses)
s1.courses.append('CS101')
print(s1)
True
[Student(name='Alice', gpa=3.8, courses=[]), Student(name='Bob', gpa=3.5, courses=[])]
Student(name='Alice', gpa=3.8, courses=['CS101'])
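If ordering by every field is not what you want, one option is to exclude fields from comparison with field(compare=False). The sketch below uses a hypothetical RankedStudent variant that orders by gpa alone. Note that compare=False removes a field from == as well as from the ordering methods.

```python
from dataclasses import dataclass, field

@dataclass(order=True)
class RankedStudent:
    """Hypothetical variant of Student that compares by GPA only."""
    gpa: float
    name: str = field(compare=False)
    courses: list[str] = field(default_factory=list, compare=False)

a = RankedStudent(3.8, 'Alice')
b = RankedStudent(3.5, 'Bob')
c = RankedStudent(3.8, 'Carol')

# sorted() is stable, so Alice stays ahead of Carol on the tied GPA
print([s.name for s in sorted([a, b, c])])   # ['Bob', 'Alice', 'Carol']

# Side effect: equality also ignores name and courses
print(a == c)                                # True
```

When the class should keep full value equality, passing a key function to sorted() (for example key=lambda s: s.gpa) is an alternative that leaves __eq__ untouched.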

9.3.3.1. Frozen Dataclasses

frozen=True blocks attribute assignment after creation and, combined with the default eq=True, generates a __hash__ computed from the fields, making instances usable in sets and as dict keys.

from dataclasses import dataclass

@dataclass(frozen=True)
class Color:
    r: int
    g: int
    b: int

red = Color(255, 0, 0)
print(red)
print(hash(red))   # hashable

palette = {red, Color(0, 255, 0), Color(0, 0, 255)}
print(palette)

try:
    red.r = 128    # raises FrozenInstanceError
except Exception as e:
    print(type(e).__name__, e)
Color(r=255, g=0, b=0)
4091835460043580556
{Color(r=0, g=255, b=0), Color(r=255, g=0, b=0), Color(r=0, g=0, b=255)}
FrozenInstanceError cannot assign to field 'r'
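Because a frozen instance cannot be changed in place, the usual way to get a variation is to build a new object. A minimal sketch using dataclasses.replace from the standard library, reusing the Color class from above (dark_red is just an illustrative name):

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class Color:
    r: int
    g: int
    b: int

red = Color(255, 0, 0)
dark_red = replace(red, r=128)   # builds a new Color; red is left untouched

print(red)        # Color(r=255, g=0, b=0)
print(dark_red)   # Color(r=128, g=0, b=0)
```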

9.3.4. namedtuple

namedtuple from collections creates a lightweight, immutable record type in one line. It gives you named fields (like a regular class) plus the efficiency and unpacking of a plain tuple. Use it for simple, read-only data containers that don’t need methods.

from collections import namedtuple

# namedtuple('ClassName', ['field1', 'field2', ...]) returns a new class
Point = namedtuple('Point', ['x', 'y'])

p = Point(3, 4)
print(p)              # Point(x=3, y=4)
print(p == Point(3, 4))   # True — value equality built in
Point(x=3, y=4)
True

from collections import namedtuple

# Access by attribute name, by index, or by tuple unpacking — all work
p = Point(3, 4)
print(p.x, p.y)     # by name
print(p[0], p[1])   # by index

x, y = p            # tuple unpacking
print(x, y)
3 4
3 4
3 4

# namedtuple instances are immutable: fields cannot be changed after creation
try:
    p.x = 10
except AttributeError as e:
    print(e)

# But you can create a modified copy with _replace()
p2 = p._replace(x=10)
print(p2)   # Point(x=10, y=4)
can't set attribute
Point(x=10, y=4)

# namedtuple vs @dataclass: when to use which
from dataclasses import dataclass

# namedtuple: one line, immutable, no boilerplate — good for simple data records
Player = namedtuple('Player', ['name', 'score'])
p1 = Player('Alice', 95)
print(p1)   # Player(name='Alice', score=95)

# @dataclass: mutable by default, supports methods, type annotations
@dataclass
class PlayerDC:
    name: str
    score: float

p2 = PlayerDC('Alice', 95)
p2.score = 100   # mutation allowed
print(p2)        # PlayerDC(name='Alice', score=100)
Player(name='Alice', score=95)
PlayerDC(name='Alice', score=100)
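A middle ground between the two is typing.NamedTuple from the standard library, which keeps tuple behavior but uses the class syntax with type annotations and default values. The sketch below redefines Player this way; the default of 0 for score is an assumption added for illustration.

```python
from typing import NamedTuple

class Player(NamedTuple):
    name: str
    score: int = 0            # default value (illustrative assumption)

p = Player('Alice', 95)
name, score = p               # still unpacks like a tuple

print(p)                      # Player(name='Alice', score=95)
print(Player('Bob'))          # Player(name='Bob', score=0)
print(p._asdict())            # {'name': 'Alice', 'score': 95}
```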

### Exercise: namedtuple
# 1. Define a namedtuple called Employee with fields: name, department, salary.
# 2. Create two Employee instances.
# 3. Access the salary of the second employee by attribute name and by index.
# 4. Try to change a field and confirm it raises an AttributeError.
### Your code starts here.



### Your code ends here.

### Solution
from collections import namedtuple

Employee = namedtuple('Employee', ['name', 'department', 'salary'])

e1 = Employee('Alice', 'Engineering', 95000)
e2 = Employee('Bob', 'Marketing', 78000)

print(e1)
print(e2.salary)    # by attribute name
print(e2[2])        # by index — same field

try:
    e1.salary = 100000
except AttributeError as e:
    print(e)
Employee(name='Alice', department='Engineering', salary=95000)
78000
78000
can't set attribute

9.3.5. Class vs. Instance Variables

A class variable is defined directly in the class body, outside any method. It is shared across all instances. An instance variable is set on self inside a method and belongs only to that one object.

Confusing the two is one of the most common OOP bugs in Python.

class Dog:
    species = 'Canis lupus familiaris'   # class variable — shared by all dogs

    def __init__(self, name):
        self.name = name                  # instance variable — unique per dog

d1 = Dog('Rex')
d2 = Dog('Fido')

print(d1.species)    # 'Canis lupus familiaris'
print(d2.species)    # same — shared
print(d1.name)       # 'Rex'
print(d2.name)       # 'Fido'
Canis lupus familiaris
Canis lupus familiaris
Rex
Fido

9.3.5.1. The Mutation Trap

Assigning to a class variable via an instance creates a new instance variable that shadows the class variable — it does not change the class variable for all instances.

But mutating a mutable class variable (like a list) does affect all instances, because no new variable is created.

class Counter:
    count = 0            # class variable
    history = []         # mutable class variable — danger zone

    def __init__(self, name):
        self.name = name

a = Counter('a')
b = Counter('b')

# Reassignment via instance — creates a new instance variable on `a` only
a.count = 99
print(a.count)           # 99  — instance variable on a
print(b.count)           # 0   — class variable unchanged
print(Counter.count)     # 0

# Mutation via instance — modifies the shared class-level list
a.history.append('event')
print(b.history)         # ['event'] — b sees the change!
print(Counter.history)   # ['event']
99
0
0
['event']
['event']
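One way to see why the two cases differ is to look at where each name is actually stored. The sketch below uses a simplified copy of Counter (without __init__) and inspects the instance and class __dict__ attributes.

```python
class Counter:
    count = 0
    history = []

a = Counter()
b = Counter()

a.count = 99                  # assignment: stored on the instance
a.history.append('event')     # mutation: no assignment, so nothing is stored on a

print(a.__dict__)             # {'count': 99}
print(b.__dict__)             # {}

# history was found on the class, so a and b share one list object
print('history' in Counter.__dict__)     # True
print(a.history is Counter.history)      # True
```

Attribute lookup checks the instance first and then the class, which is why a.count finds the instance value while a.history falls through to the shared class-level list.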

Rule of thumb:

  • Use class variables for constants or data that truly belongs to the class (e.g., species, MAX_SIZE).

  • Use instance variables (set in __init__) for data that belongs to individual objects.

  • Never use a mutable class variable as a default container — use field(default_factory=list) with @dataclass, or set the list in __init__.
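The last rule can be sketched as follows. SafeCounter is a hypothetical fixed version of the Counter class above, in which every instance receives its own list in __init__.

```python
class SafeCounter:
    """Hypothetical fix for Counter: each instance owns its history list."""
    count = 0                     # an immutable class-level default is fine

    def __init__(self, name):
        self.name = name
        self.history = []         # a new list is created for every instance

a = SafeCounter('a')
b = SafeCounter('b')
a.history.append('event')

print(a.history)   # ['event']
print(b.history)   # []
```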

9.3.6. Static and Class Methods

Not every method needs an instance. Python provides two decorators for methods that are attached to the class itself rather than an instance:

| Decorator | First parameter | Typical use |
|---|---|---|
| @staticmethod | (none) | Utility function logically grouped with the class |
| @classmethod | cls (the class itself) | Alternative constructors / factory methods |

Both decorators come up naturally alongside class variables, because all three belong to the class rather than any one instance.

9.3.6.1. Static Methods

A static method is a regular function that lives inside a class for organizational reasons. It receives neither self nor cls, so it cannot access instance or class state directly.

A common use-case is a validation or parsing helper that supports other methods without depending on object state:

class BankAccount:
    """Bank account used to demonstrate static and class methods."""

    def __init__(self, owner="Unknown", balance=0.0):
        self.owner = owner
        self._balance = balance

    @property
    def balance(self):
        return self._balance

    def deposit(self, amount):
        self._balance += amount
        return self

    def withdraw(self, amount):
        if amount > self._balance:
            raise ValueError("Insufficient funds")
        self._balance -= amount
        return self

    def __str__(self):
        return f"BankAccount(owner={self.owner}, balance=${self._balance:.2f})"

    def __repr__(self):
        return f"BankAccount('{self.owner}', {self._balance})"

    # -- static method ------------------------------------------------
    @staticmethod
    def parse_record(s):
        """Parse an 'owner:balance' string into raw values."""
        owner, balance = s.split(":")
        return owner, float(balance)

Because parse_record is a static method, it has no self or cls parameter. It is a utility helper that can be called on the class directly:

owner, balance = BankAccount.parse_record("Taylor:348.00")
acct = BankAccount(owner, balance)
print(acct)   # BankAccount(owner=Taylor, balance=$348.00)
BankAccount(owner=Taylor, balance=$348.00)
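Static methods also work well as validation helpers that other methods call. The sketch below uses a hypothetical Temperature class (not part of the BankAccount example) whose __init__ relies on a static is_valid check. It also shows that a static method can be called on an instance as well as on the class.

```python
class Temperature:
    """Hypothetical class illustrating a static validation helper."""

    def __init__(self, celsius):
        if not Temperature.is_valid(celsius):
            raise ValueError(f"below absolute zero: {celsius}")
        self.celsius = celsius

    @staticmethod
    def is_valid(celsius):
        """Return True if the value is at or above absolute zero."""
        return celsius >= -273.15

print(Temperature.is_valid(20))          # True  (called on the class)
print(Temperature(20).is_valid(-300))    # False (also callable on an instance)

try:
    Temperature(-300)
except ValueError as e:
    print(e)                             # below absolute zero: -300
```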

9.3.6.2. Class Methods

A class method receives the class as its first argument (cls). This makes it better than a static method when subclasses are involved: cls(...) creates an instance of the actual subclass, not the hardcoded parent class.

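Here is the parsing logic from parse_record rewritten as a class method named from_string, which returns a finished BankAccount instead of raw values: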

class BankAccount:
    """Bank account with from_string and zero_balance as class methods."""

    def __init__(self, owner="Unknown", balance=0.0):
        self.owner = owner
        self._balance = balance

    @property
    def balance(self):
        return self._balance

    def deposit(self, amount):
        self._balance += amount
        return self

    def withdraw(self, amount):
        if amount > self._balance:
            raise ValueError("Insufficient funds")
        self._balance -= amount
        return self

    def __str__(self):
        return f"BankAccount(owner={self.owner}, balance=${self._balance:.2f})"

    def __repr__(self):
        return f"BankAccount('{self.owner}', {self._balance})"

    # ── class method ────────────────────────────────────────────
    @classmethod
    def from_string(cls, s):
        """Create an instance from an 'owner:balance' string."""
        owner, balance = s.split(":")
        return cls(owner, float(balance))

    @classmethod
    def zero_balance(cls, owner):
        """Return an account with a zero balance."""
        return cls(owner, 0.0)

acct1 = BankAccount.from_string("Taylor:348.00")
print(acct1)  # BankAccount(owner=Taylor, balance=$348.00)
BankAccount(owner=Taylor, balance=$348.00)

# Alternative constructors: both work on subclasses automatically
acct1 = BankAccount.from_string("Casey:94.50")
acct2 = BankAccount.zero_balance("Rin")
print(acct1)   # BankAccount(owner=Casey, balance=$94.50)
print(acct2)   # BankAccount(owner=Rin, balance=$0.00)
BankAccount(owner=Casey, balance=$94.50)
BankAccount(owner=Rin, balance=$0.00)

When to use which:

| | @staticmethod | @classmethod |
|---|---|---|
| Receives class? | No | Yes (cls) |
| Subclass-safe? | No (hardcodes the class name) | Yes (cls(...) creates the right type) |
| Typical use | Pure utility / validation helper | Alternative constructors |

Prefer @classmethod for constructors; use @staticmethod only for helpers that truly need no access to the class.
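The "subclass-safe" row can be checked directly. In the sketch below, Account, SavingsAccount, make_static, and make_class are all hypothetical names chosen for illustration. The same factory is written both ways and called on a subclass.

```python
class Account:
    """Hypothetical minimal account class."""

    def __init__(self, owner, balance=0.0):
        self.owner = owner
        self.balance = balance

    @staticmethod
    def make_static(owner):
        return Account(owner)      # class name is hardcoded

    @classmethod
    def make_class(cls, owner):
        return cls(owner)          # cls is whichever class the call was made on

class SavingsAccount(Account):
    pass

s1 = SavingsAccount.make_static("Kai")
s2 = SavingsAccount.make_class("Kai")

print(type(s1).__name__)   # Account
print(type(s2).__name__)   # SavingsAccount
```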

### Exercise: Class Methods
#   Create a subclass ExtendedBankAccount(BankAccount) and add a class
#   method from_balance_str(cls, owner, s) that parses a dollar string
#   like "12.34" and returns a new account.
#   Test:
#       print(ExtendedBankAccount.from_balance_str("Kai", "12.34"))
### Your code starts here.


### Your code ends here.

### Solution
class ExtendedBankAccount(BankAccount):
    @classmethod
    def from_balance_str(cls, owner, s):
        """Create account from a balance string like '12.34'."""
        return cls(owner, float(s))

print(ExtendedBankAccount.from_balance_str("Kai", "12.34"))
# Prints BankAccount(owner=Kai, balance=$12.34): the inherited __str__ hardcodes the name, but the object is an ExtendedBankAccount.
BankAccount(owner=Kai, balance=$12.34)

9.3.7. Summary

| Topic | Key takeaway |
|---|---|
| __eq__ / __lt__ / __hash__ | Define these to make objects sortable and hashable; remember the mutability rule for __hash__ |
| @dataclass | Use order=True for sorting, frozen=True for hashable immutable objects, field(default_factory=...) for mutable defaults |
| namedtuple | One-line immutable records with named fields; switch to @dataclass when you need mutability or methods |
| Class vs. instance variables | Keep mutable state in instance variables; treat class variables as shared constants |
| Static / class methods | @staticmethod for pure helpers; @classmethod for alternative constructors (subclass-safe) |
| Operator overloading | Define __add__, __eq__, etc. to give your objects natural operator syntax |