Lazy Loading a Dict

Published 2019-11-03

Lazy loading a dictionary is pretty easy in theory, but can be harder to understand in practice.

The idea behind lazy loading is that an expensive or resource-intensive value is calculated when it is first needed, not when the dictionary is created.

Misconception: `collections.defaultdict`

My first misconception was that you could do this with `collections.defaultdict`. After all, you pass it a function. However, it calls that function without any arguments, so the function never sees the key that was missing. That makes it unfit for our purpose.
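
To see why, here is a small sketch that records what `defaultdict` actually passes to its factory. The names `factory` and `received_args` are made up for this illustration.

```python
from collections import defaultdict

received_args = []

def factory(*args):
    # Record whatever arguments defaultdict hands us (it hands us none).
    received_args.append(args)
    return "placeholder"

d = defaultdict(factory)
value = d["some_key"]  # missing key, so defaultdict calls factory()

print(received_args)  # [()]
print(value)          # placeholder
```

The recorded argument tuple is empty, so there is no way for the factory to compute a value that depends on `"some_key"`.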

Solution: subclass `collections.UserDict`

The solution, then, is to subclass a dictionary. However, subclassing `dict` directly is not a good solution: in CPython, many of the built-in `dict` methods are implemented in C and do not go through the methods you override, so your function can go uncalled.
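
As a rough illustration of that pitfall (assuming CPython, and using a throwaway class name `ShoutingDict`), an overridden `__getitem__` is used by bracket access but skipped by `get`:

```python
class ShoutingDict(dict):
    def __getitem__(self, key):
        # Only reached through the d[key] syntax.
        return "from override"

d = ShoutingDict(a=1)

via_brackets = d["a"]  # goes through the override
via_get = d.get("a")   # dict.get reads the stored value directly

print(via_brackets)  # from override
print(via_get)       # 1
```

A lazy-loading `__getitem__` on a plain `dict` subclass would be bypassed in the same way.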

Luckily, the standard library gives us a class built for just this: `collections.UserDict`. Here's a basic implementation of a lazy-loading dict type:

from collections import UserDict


class LazyLoadDict(UserDict):
    def __init__(self, func, initialdata=None):
        # UserDict copies initialdata into self.data; None means "start empty".
        super().__init__(initialdata)
        self.func = func

    def __getitem__(self, key):
        # Compute and cache the value the first time the key is requested.
        if key not in self.data:
            self.data[key] = self.func(key)
        return self.data[key]
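
A quick way to check that the caching works is to count how often the loader runs. This is only a sketch: the class is repeated so the snippet runs on its own, and `slow_square` and `calls` are hypothetical names for the demonstration.

```python
from collections import UserDict

class LazyLoadDict(UserDict):
    def __init__(self, func, initialdata=None):
        super().__init__(initialdata)
        self.func = func

    def __getitem__(self, key):
        if key not in self.data:
            self.data[key] = self.func(key)
        return self.data[key]

calls = []

def slow_square(key):
    # Stand-in for an expensive computation; records each call.
    calls.append(key)
    return key ** 2

squares = LazyLoadDict(slow_square, {2: 4})

loaded_before = 3 in squares  # False: nothing computed for 3 yet
first = squares[3]            # computed by slow_square(3)
second = squares[3]           # served from the cache
preloaded = squares[2]        # came from initialdata, never computed

print(calls)  # [3]
```

Note that membership tests and `len()` only see keys that are already loaded, since `UserDict` answers those from `self.data`.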

However, this solution has a few drawbacks. For example, all values have to be calculated based on the key alone. Here's a slightly better version:

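from collections import UserDict


class LazyLoadDict(UserDict):
    def __init__(self, func, initialdata=None, **kwargs):
        super().__init__(initialdata)
        self.func = func
        # Store extra keyword arguments as attributes on the instance.
        # Avoid the names "data" and "func", which would overwrite the above.
        self.__dict__.update(kwargs)

    def __getitem__(self, key):
        # Pass the dict itself so func can read the extra attributes.
        if key not in self.data:
            self.data[key] = self.func(self, key)
        return self.data[key]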

Extra keyword arguments are stored as attributes on the dictionary, and the function receives the dictionary itself as its first argument, so it can use more than just the key to compute a value. For example:

def grab(d, key):
    scratch = []
    for i in range(d.num):  # waste time to simulate an expensive computation
        scratch.append(i)
        scratch.append(i**2)
        scratch.append(i**3)
    return key

test = LazyLoadDict(grab, {}, num=1000)
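
Here is a sketch of what using the extra attributes looks like in practice. The class is repeated so the snippet is self-contained, and `power` and `exponent` are hypothetical names chosen for this example.

```python
from collections import UserDict

class LazyLoadDict(UserDict):
    def __init__(self, func, initialdata=None, **kwargs):
        super().__init__(initialdata)
        self.func = func
        self.__dict__.update(kwargs)

    def __getitem__(self, key):
        if key not in self.data:
            self.data[key] = self.func(self, key)
        return self.data[key]

def power(d, key):
    # Hypothetical loader: reads the `exponent` attribute stored on the dict.
    return key ** d.exponent

powers = LazyLoadDict(power, exponent=2)

a = powers[3]        # computed with exponent 2
powers.exponent = 3  # change the setting after the fact
b = powers[3]        # still the cached value from before the change
c = powers[2]        # not cached yet, so computed with exponent 3

print(a, b, c)  # 9 9 8
```

One thing to keep in mind with this design: changing an attribute does not invalidate values that were already loaded, so cached entries can reflect older settings.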