Python 101: `defaultdict(set)`

defaultdict(set) is a special data structure from Python’s collections module that creates a dictionary with default values as sets.

1. Introduction to defaultdict

  • defaultdict is a variant of a regular dictionary in Python, provided by the collections module.
  • When you try to access a non-existent key, it won’t raise a KeyError; instead, it will automatically create a default value for that key.
  • The default value is provided by the default_factory parameter, such as list, int, set, etc.

2. Using defaultdict(set)

Definition

defaultdict(set) creates a dictionary where the default value is an empty set.

How it works

  • When you try to access a key that doesn’t exist, defaultdict will automatically create an empty set for that key and set it as the default value.
  • This is different from a regular dictionary, where accessing a non-existent key would raise a KeyError.

3. Example Code

from collections import defaultdict

# Create a defaultdict with default values as empty sets
d = defaultdict(set)

# Add elements
d['a'].add(1)  # Automatically creates an empty set for 'a' and adds 1
d['a'].add(2)
d['b'].add(3)  # Automatically creates an empty set for 'b' and adds 3

print(d)  # Output: defaultdict(<class 'set'>, {'a': {1, 2}, 'b': {3}})

# Access a non-existent key
print(d['c'])  # Output: set()

4. When to Use defaultdict(set)

defaultdict(set) is useful in the following scenarios:

1. Adjacency List Representation for Graphs

In graph algorithms, adjacency lists are commonly used to represent graph edges. Using defaultdict(set) makes it easy to build directed or undirected graphs.

from collections import defaultdict

graph = defaultdict(set)
edges = [(1, 2), (2, 3), (3, 4), (4, 1)]

for u, v in edges:
    graph[u].add(v)
    graph[v].add(u)  # For an undirected graph

print(graph)
# Output: defaultdict(<class 'set'>, {1: {2, 4}, 2: {1, 3}, 3: {2, 4}, 4: {1, 3}})

2. Grouping or Associating Elements

For example, if you need to group elements under a key, you can use defaultdict(set) to automatically create an empty set and add related elements.

from collections import defaultdict

d = defaultdict(set)
data = [('apple', 'fruit'), ('carrot', 'vegetable'), ('banana', 'fruit')]

for item, category in data:
    d[category].add(item)

print(d)
# Output: defaultdict(<class 'set'>, {'fruit': {'apple', 'banana'}, 'vegetable': {'carrot'}})

5. defaultdict(set) vs. Regular dict

Feature defaultdict(set) Regular dict
Default value Automatically creates an empty set Raises KeyError when accessing a non-existent key
Use case Ideal when you need to initialize sets automatically Requires manual initialization of key values
Common scenarios Adjacency lists, grouping elements, tracking associations Storing and retrieving key-value pairs

Summary: defaultdict(set) is a concise and convenient way to automatically create sets when handling dictionaries, avoiding manual checks for key existence, making the code more clean and readable.

Comments

Leave a Reply

Your email address will not be published. Required fields are marked *