8.2. Sets#
8.2.1. What is a set#
A set is an unordered collection of unique values. Unlike lists or tuples, sets do not allow duplicates, and the elements have no guaranteed order.
Here below is an example of a set. Note that it uses curly braces just like dictionaries.
s = {1, 2, 3, 3, 2}
print(s) ### {1, 2, 3}
{1, 2, 3}
Sets are useful when you care about membership: whether something is in a collection; rather than order or position.
The most common use cases are:
Removing duplicates from a list
Checking membership quickly
Comparing two collections (what’s shared, what’s different)
For sets, the four key properties that make sets different from other data structures:
Unordered: Elements have no guaranteed position. There is no first or last element.
Unique: Duplicates are automatically removed.
Mutable: You can add and remove elements.
Hashable elements only: Every element must be hashable. In practice this means immutable types:
int,float,str,tuple.
### 1. Sets are unordered, so the order of elements is not guaranteed.
{3, 1, 2} == {1, 2, 3} # True
True
### 2. Sets do not allow duplicate elements, so duplicates are automatically removed.
{1, 2, 2, 3, 3} # {1, 2, 3}
{1, 2, 3}
### 3. Sets are mutable, so you can add or remove elements after the set is created.
s = {1, 2, 3}
s.add(4) # works
s.remove(2) # works
s
{1, 3, 4}
%%expect TypeError
### 4. Sets can contain only hashable (immutable) elements, so you cannot include lists or other sets as elements.
{1, "hello", (1, 2)} # valid
{[1, 2], [3, 4]} # TypeError
TypeError: unhashable type: 'list'
8.2.2. Choosing the Right Structure#
Need |
Use |
|---|---|
Ordered collection, duplicates allowed |
|
Fixed data, multiple return values |
|
Key-value lookup |
|
Unique items, membership testing |
|
Hashable set (dict key, nested set) |
|
8.2.3. Hashable vs mutable#
We have seen that some data types are mutable and some are hashable. It’s now time to learn a little more about the.
Hashable means an object has a hash value that stays stable during its lifetime.
Mutable objects usually are not hashable (for example:
list,dict,set).Immutable objects are often hashable (for example:
int,float,str,tuple), but for tuples, all nested elements must also be hashable.For hash-table types:
dictrequires hashable keys, andsetrequires hashable elements.
8.2.3.1. Mutability#
For the property of mutability, let’s have a look over the general data structures. For in-place change methods for the common data structures, here is a comparison table. Tuple and strings are immutable, so there’s no methods available for the purpose.
Method |
List |
Dict |
Set |
Tuple |
String |
|---|---|---|---|---|---|
|
✓ |
||||
|
✓ |
||||
|
✓ |
||||
|
✓ |
✓ |
|||
|
✓ |
✓ |
|||
|
✓ |
✓ |
✓ |
||
|
✓ |
||||
|
✓ |
||||
|
✓ |
||||
|
✓ |
||||
|
✓ |
||||
|
✓ |
||||
|
✓ |
✓ |
✓ |
||
|
✓ |
||||
|
✓ |
||||
|
✓ |
8.2.4. In-place set operations#
In Python, mutable methods that modify in-place typically return None. The general rule in Python is:
Mutates in-place → returns
NoneReturns a new object → original unchanged
These methods modify the original set instead of creating a new one:
x = {1, 2, 3}
y = {3, 4, 5}
x.update(y) # union in-place
print("update:", x)
x = {1, 2, 3}
x.intersection_update(y)
print("intersection_update:", x)
x = {1, 2, 3}
x.difference_update(y)
print("difference_update:", x)
x = {1, 2, 3}
x.symmetric_difference_update(y)
print("symmetric_difference_update:", x)
update: {1, 2, 3, 4, 5}
intersection_update: {3}
difference_update: {1, 2}
symmetric_difference_update: {1, 2, 4, 5}
8.2.4.1. Hashability#
Let us compare set, list, and dict and this time focus on hashability.
Feature |
Set |
List |
Dict |
|---|---|---|---|
Ordered |
No |
Yes |
Yes (insertion order) |
Duplicates |
No |
Yes |
Keys: No, Values: Yes |
Mutable |
Yes |
Yes |
Yes |
Allows only hashable elements |
Yes |
No |
Keys only |
Lookup speed |
Avg O(1) |
O(n) |
Avg O(1) |
Use case |
Unique items, membership |
Ordered items |
Key-value pairs |
In computer science, hashable means that an object can be converted into a fixed-size integer (called a hash value or hash code) via a hash function. This hash value is used to quickly locate the object in data structures like hash tables, dictionaries, and sets. For something to be hashable, it generally needs to satisfy two properties:
Consistency: the hash value must not change during the object’s lifetime. This is why mutable objects (like Python lists) are typically not hashable because if their contents change, their hash would need to change too, which breaks lookups.
Equality consistency: if two objects are considered equal (
a == b), they must have the same hash value. The reverse isn’t required; two different objects can share a hash (called a collision), but equal objects must hash identically.
Hash-based data structures like hash maps and hash sets rely on hash values to place and find items in O(1) average time. If you want to use an object as a dictionary key or store it in a set, it needs to be hashable.
In summary, hashable means you can be consistently fingerprinted with a number, which is the prerequisite for being usable as a key in fast lookup structures.
key → hash function → index → bucket in array
"cat" → hash("cat") → 42 → array[42]
print(hash("cat")) ### -9223372036854775808
print(hash("dog")) ### -867955804616931492
print(hash("cat")) ### -9223372036854775808
6620930103361482115
7788098004283485152
6620930103361482115