Build Your Own Enum!

After programming in Java for altogether too long, I've developed a healthy appetite for the humble Enum.

There's a problem, though: Python's enum is ugly. Ugly as hell:

from enum import Enum

class Something(Enum):
    x = 1
    y = 2
    z = 3

See the assignments? x = 1, y = 2 etc.? Yeah, those aren't optional. Even enum.auto() (Python 3.6+) only picks the value for you; you still have to write x = auto().

And guess what? If you ever accidentally repeat a value, it creates something called an alias (otherwise known as a bug waiting to happen!). Add w = 1 above x = 1, and Something.w and Something.x not only compare equal, but are the very same object in memory:

>>> class Something(Enum):
...     w = 1
...     x = 1
...     y = 2
...     z = 3
... 
>>> Something.w
<Something.w: 1>  
>>> Something.x
<Something.w: 1>  
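
If you'd rather check the aliasing in a script than squint at reprs, here's a minimal sketch using the standard library's enum module. Something is the same throwaway class as above; the variable names are just mine.

```python
from enum import Enum

class Something(Enum):
    w = 1
    x = 1  # same value as w, so x silently becomes an alias for w
    y = 2

# Both names resolve to the very same member object.
same_object = Something.w is Something.x

# Iterating over the enum skips aliases, so x never shows up here.
member_names = [member.name for member in Something]

print(same_object)   # True
print(member_names)  # ['w', 'y']
```

So x hasn't just been given an equal value: as far as iteration is concerned, it doesn't exist.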

So what the hell: let's write a better one. I've got a much more full-featured version of this on GitHub, but we'll walk through a significantly shorter version.


First things first: let's set our sights on an end goal. How about this for starters?

>>> class Colour(Enum):
...     red, blue, green

Simple enough, right? Not quite Java's utilitarian enum X {}, but as close as we'll get without actually resorting to adding new syntax to Python, or some horrific function-based approach.

(Note to self: write the horrific function-based approach. Then blog about it.)

But hang on, isn't this invalid syntax in Python? We're not assigning those values to anything, and we can assume they don't exist already... so where are they coming from? Are they keywords, or something else magical?

Well, get ready for some hand-waving. To understand this, we have to grok that Python is based almost entirely around dictionaries. Objects are relatively thin wrappers around dicts, under the hood. So are namespaces. So are classes. So are modules... and, well, pretty much anything else you can think of.

But let's focus on classes for now. When Python's interpreter reads a class statement:

class X:  
    <class body>

It actually evaluates -- i.e. executes -- the class body, inside a new namespace. (Try it: put a call to print("Hello, world") inside a class body!) That new namespace, like pretty much everything else, is a dictionary. So the assignment statement here:

class Numbers:  
    x = 1

is roughly equivalent to Numbers["x"] = 1. (I did say there'd be some hand-waving!)
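
Here's a small sketch of both claims: the class body really does run at definition time, and the names it assigns end up in a dictionary-like namespace you can poke at with vars(). The log list is just something I made up to record that the body ran.

```python
log = []

class Numbers:
    # An ordinary statement: it runs once, while the class is being defined
    log.append("class body is running")
    x = 1

# Nobody has instantiated Numbers, yet the body has already executed.
print(log)                 # ['class body is running']

# The assignment landed in the class's namespace under the key "x".
print(vars(Numbers)["x"])  # 1
```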

But how does this help us? Well, we've just demonstrated how you can set variables in the namespace of a class definition. So it follows that to get variables:

class Numbers:  
    x = 1
    y = x

must do something similar: Numbers["x"] = 1, and the important bit... Numbers["y"] = Numbers["x"].

To get a variable's value, we have to get it from the namespace -- which, remember, is a dictionary.
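
To make the hand-waving a bit less wavy, here's a rough imitation of what a class statement does, under the assumption that "run the body in a dict, then hand the dict to type" is close enough to the truth. The class_body string and the namespace variable are mine, not anything Python exposes.

```python
class_body = """
x = 1
y = x
"""

namespace = {}
# Run the body with `namespace` as its local scope, much as a class statement does.
# The lookup of x on the second line is served from that same dictionary.
exec(class_body, {}, namespace)

# Then build the class from a name, the bases and the filled-in namespace.
Numbers = type("Numbers", (), namespace)

print(namespace)             # {'x': 1, 'y': 1}
print(Numbers.x, Numbers.y)  # 1 1
```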

Now take a look at this code:

class Colours:  
    Red

This code fails, not because it's invalid syntax, but because we try and fail to locate the name Red anywhere. Python tries to get it from the Colours namespace first, but can't find it, so it moves on to the global and built-in scopes. When it comes up empty there too, it raises a NameError.
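
You can see for yourself that this is a lookup failure rather than a syntax error: the file compiles fine, and the exception only appears once the class body actually runs. A quick sketch (the message variable is mine):

```python
try:
    class Colours:
        Red  # a bare name: perfectly valid syntax, but nothing called Red exists
except NameError as error:
    message = str(error)

# The complaint is about the missing name, not about the syntax.
print(message)
```

So the bare name parses happily; it's the lookup that blows up.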

The ideal solution for making code like this work would be to replace the dictionary namespace with something that always returns a value, whether or not the name has been defined yet. The crucial point is to not cause an error.

Enter collections.defaultdict. It's a Python standard library data type that acts like a dict, but when you ask for a missing key it calls a factory function you supply, then stores and returns the result.

Let's use itertools.count, an infinite counter, to generate a sequence of discrete values, and we can build a namespace like so:

import collections
import itertools

counter = itertools.count()

def get_next_number():
    # Each call hands out the next unused integer: 0, 1, 2, ...
    return next(counter)

namespace = collections.defaultdict(get_next_number)

Of course, we could use a lambda for this too:

namespace = collections.defaultdict(lambda: next(counter))
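
A quick sanity check of that namespace, as a self-contained sketch. The key names "red" and "blue" are only there for illustration.

```python
import collections
import itertools

counter = itertools.count()
namespace = collections.defaultdict(lambda: next(counter))

# Looking up a missing key creates it, stores it and returns it.
first = namespace["red"]
second = namespace["blue"]

# A repeated lookup returns the stored value instead of a fresh number.
again = namespace["red"]

print(first, second, again)  # 0 1 0
print(dict(namespace))       # {'red': 0, 'blue': 1}
```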


Cool. So we've got a namespace that stores and returns values for everything. So how do we plug that into our Enum classes?

We're in luck: Python provides us with a way to alter how classes are created. Just like you can change a class to change the way an object behaves, you can change a metaclass to change the way a class behaves. We can specify that our class should be created by (and be an instance of) a metaclass like so:

class Enum(metaclass=MetaEnum):  
    ...

So let's define our MetaEnum class:

class MetaEnum(type):
    ...

Here's why we're inheriting from type: classes are by default instances of type. We want to change the way the class is created, so we're specialising type and writing our own method.
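
If "classes are instances of type" sounds suspicious, it's easy to verify. A small sketch; Plain, Dynamic, Meta and Fancy are throwaway names of my own.

```python
class Plain:
    pass

# An ordinary class is an instance of type.
print(type(Plain) is type)  # True

# type can also be called directly with a name, bases and a namespace dict.
Dynamic = type("Dynamic", (), {"x": 1})
print(Dynamic.x)  # 1

class Meta(type):
    pass

class Fancy(metaclass=Meta):
    pass

# With a metaclass, the class is an instance of that metaclass instead.
print(type(Fancy) is Meta)  # True
```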

Python gives us a "magic method" for just what we need, and it's called __prepare__. In return for some information about the class we're creating (name, parent classes etc.), we're obliged to return the class's namespace. Let's use our code from earlier, and return our custom default dictionary:

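import collections
import itertools

class MetaEnum(type):
    @classmethod
    def __prepare__(mcs, name, bases, **kwargs):
        # A fresh counter per class, so every enum numbers its own members
        counter = itertools.count()

        def get_next_number():
            return next(counter)

        # Python will run the class body with this as its namespace
        return collections.defaultdict(get_next_number)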
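
To see when __prepare__ gets called and what it's handed, here's a sketch with a do-nothing metaclass that records its arguments and returns a plain dict. TracingMeta, Base, Child and calls are all names I've invented for the demonstration.

```python
calls = []

class TracingMeta(type):
    @classmethod
    def __prepare__(mcs, name, bases, **kwargs):
        # Record the class name and the names of its parent classes
        calls.append((name, tuple(base.__name__ for base in bases)))
        return {}  # an ordinary namespace, so nothing else changes

class Base(metaclass=TracingMeta):
    pass

class Child(Base):  # the metaclass is inherited, so __prepare__ runs again
    pass

print(calls)  # [('Base', ()), ('Child', ('Base',))]
```

Note that Child never mentions a metaclass, which is exactly what lets us hide the machinery behind a friendly base class.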

We're nearly done! Let's just write a helpful Enum class so our end users never have to worry about metaclasses:

class Enum(metaclass=MetaEnum):  
    pass

And there we have it: a simple, integer-based Enum with much nicer syntax than the default Python version!

>>> class Colour(Enum):
...     red, blue, yellow
... 
>>> r = Colour.red
>>> b = Colour.blue
>>> y = Colour.yellow
>>> 
>>> r == Colour.red
True  
>>> b == Colour.yellow
False  
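
One wrinkle worth knowing about: a plain defaultdict hands out a number for every missing name, including __name__, which (in CPython, as far as I can tell) gets looked up at the start of every class body. That eats the first number and leaves the class's __module__ set to an integer. Here's a sketch of one possible way around it, assuming a hypothetical AutoNamespace helper of my own; it isn't necessarily how magic-enum handles it.

```python
import itertools

class AutoNamespace(dict):
    """A namespace that numbers unknown names, but leaves dunder names alone."""

    def __init__(self):
        super().__init__()
        self._counter = itertools.count()

    def __missing__(self, key):
        # Raising KeyError lets names like __name__ fall through to the
        # normal global and built-in lookup instead of getting a number.
        if key.startswith("__") and key.endswith("__"):
            raise KeyError(key)
        value = self[key] = next(self._counter)
        return value

class MetaEnum(type):
    @classmethod
    def __prepare__(mcs, name, bases, **kwargs):
        return AutoNamespace()

class Enum(metaclass=MetaEnum):
    pass

class Colour(Enum):
    red, blue, yellow

print(Colour.red, Colour.blue, Colour.yellow)  # 0 1 2
print(isinstance(Colour.__module__, str))      # True
```

The same trick still shadows ordinary names like print inside the class body, so this remains a toy; it just misbehaves a little less.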

I've taken this premise and added many, many more cool tweaks into a project called magic-enum. Check it out if you have a minute!