How to achieve Partial Immutability with Python? dataclasses or attrs?

Should you use Python’s dataclasses or attrs? This article will give you an overview with examples.
python
dataclasses
attrs
Author

noklam

Published

April 22, 2022

TL;DR

This post walks through examples of using dataclasses and attrs, and explains why and when you should consider each. It assumes you already understand why dataclasses and their variants are useful, so I am not trying to convince you to use them, but to help you decide WHICH library to choose.

If you are looking for a quick summary:

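| Item | dataclasses | attrs |
| --- | --- | --- |
| Immutable Instance | @dataclass(frozen=True) | @define(frozen=True) |
| Immutable Field | not supported | field(on_setattr=setters.frozen) |
| Derived Attributes | __post_init__ | __attrs_post_init__ |
| Derived Attributes + Immutability | only via the object.__setattr__ trick | @c.default decorator or a factory method |
| Dependencies | ✅ standard library | ✅ almost zero dependency |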

Immutable Instance

With dataclasses, you can set frozen=True to ensure immutability. It raises a FrozenInstanceError when someone tries to set an attribute on a frozen instance.

from dataclasses import dataclass

@dataclass(frozen=True)
class FrozenDataClass:
    a: int
    b: int

frozen = FrozenDataClass(1,2)
frozen.c = 3
FrozenInstanceError: cannot assign to field 'c'
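If you do need a changed copy of a frozen instance, the standard library offers dataclasses.replace. The snippet below is a minimal sketch that reuses the FrozenDataClass from above; the variable names error_message and updated are only for illustration.

```python
from dataclasses import FrozenInstanceError, dataclass, replace


@dataclass(frozen=True)
class FrozenDataClass:
    a: int
    b: int


frozen = FrozenDataClass(1, 2)

# Assigning to an existing field fails the same way as assigning a new attribute.
error_message = ""
try:
    frozen.a = 10
except FrozenInstanceError as exc:
    error_message = str(exc)

# replace() builds a new instance and leaves the original untouched.
updated = replace(frozen, a=10)
print(frozen.a, updated.a)  # 1 10
```

The original object stays as it was, so code holding a reference to it is never surprised by a change.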

With attrs, it’s mostly identical except that you use @define(frozen=True).

from attrs import define

@define(frozen=True)
class FrozenAttrs:
    b: int

frozen = FrozenAttrs(1)
frozen.c = 3
FrozenInstanceError: 

post_init assignment and Derived Attributes

Derived Attributes

Sometimes an attribute is not passed in during initialisation, but is derived from other attributes.

@dataclass
class DataClass:
    a: int
    b: int

    def __post_init__(self):
        self.c = self.a + self.b

dc = DataClass(1, 2)
print(dc.c)
3
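One detail worth knowing: an attribute that is only assigned in __post_init__ is a plain instance attribute, not a dataclass field, so it is left out of the generated __repr__ and __eq__. Declaring it with field(init=False) makes it a real field. The class names Undeclared and Declared below are made up for this comparison.

```python
from dataclasses import dataclass, field, fields


@dataclass
class Undeclared:
    a: int
    b: int

    def __post_init__(self):
        # Plain instance attribute: dataclasses does not know about it.
        self.c = self.a + self.b


@dataclass
class Declared:
    a: int
    b: int
    c: int = field(init=False)  # a real field, but excluded from __init__

    def __post_init__(self):
        self.c = self.a + self.b


print([f.name for f in fields(Undeclared)])  # ['a', 'b']
print([f.name for f in fields(Declared)])    # ['a', 'b', 'c']
```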

Similarly, with attrs:

from attrs import define, field

@define
class AttrsDataClass:
    a: int
    b: int
    c: int = field(init=False) # derived

    def __attrs_post_init__(self):
        self.c = self.a + self.b

attrs_dc = AttrsDataClass(1,2)
print(attrs_dc.c)
3

Partial Immutable Data Class at field level

dataclasses has no built-in way to freeze individual fields. Here is an example with attrs, using on_setattr=setters.frozen:

from attrs import define, field
from attrs import setters

@define
class AttrsDataClass:
    a: int
    b: int = field(on_setattr=setters.frozen)
    c: int = field(init=False) # derived

    def __attrs_post_init__(self):
        self.c = self.a + self.b

attrs_dc = AttrsDataClass(1,2)
attrs_dc.a = 1 # OK
attrs_dc.c = 2 # OK
attrs_dc.b = 3 # Not OK
FrozenAttributeError: 

Assigning to b now raises a FrozenAttributeError, a different exception from the FrozenInstanceError above. What if you want to set attributes on a frozen class?
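Before moving on to that question, here is a rough sketch of what emulating a frozen field in a plain dataclass might involve. This is not a dataclasses feature: the _FROZEN_FIELDS attribute and the custom __setattr__ are hypothetical, hand-written additions, and the sketch assumes a regular (non-slots, non-frozen) dataclass.

```python
from dataclasses import FrozenInstanceError, dataclass


@dataclass
class PartiallyFrozen:
    a: int
    b: int  # meant to be read-only once __init__ has set it

    # Hypothetical helper; no annotation, so dataclasses ignores it as a field.
    _FROZEN_FIELDS = frozenset({"b"})

    def __setattr__(self, name, value):
        # Allow the first assignment (made by __init__), block later ones.
        if name in self._FROZEN_FIELDS and name in vars(self):
            raise FrozenInstanceError(f"cannot assign to field {name!r}")
        super().__setattr__(name, value)


obj = PartiallyFrozen(1, 2)
obj.a = 10  # OK

blocked = False
try:
    obj.b = 3  # Not OK
except FrozenInstanceError:
    blocked = True

print(obj.a, obj.b, blocked)  # 10 2 True
```

It works, but you end up maintaining the machinery that attrs gives you with a single field argument.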

dataclass post_init assignment in a frozen dataclass

If you are thinking about setting a derived attribute in the __post_init__ of a frozen dataclass, it doesn’t work out of the box.

@dataclass(frozen=True)
class FrozenDataClass:
    a: int
    b: int

    def __post_init__(self):
        self.c = self.a + self.b

frozen = FrozenDataClass(1, 2)
FrozenInstanceError: cannot assign to field 'c'

It doesn’t work because the frozen flag blocks every assignment made through the instance, including those inside the __post_init__ method.

The object.__setattr__ trick

Frozen instances are still regular Python objects, so they aren’t truly “immutable”. Most of the time, libraries achieve immutability by overriding the __setattr__ method (and usually __delattr__) so that it raises an error.

@define(frozen=True)
class FrozenAttrs:
    a: int

frozen_class = FrozenAttrs(1)
frozen_class.a = 3
FrozenInstanceError: 

It may seem like it is indeed immutable, but if you try hard enough you can always crack it.

object.__setattr__(frozen_class, "a", 100)
frozen_class.a
100
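To see why this bypass works, it helps to write a frozen class by hand. The class below is a simplified, hypothetical illustration of the general idea, not the code that attrs or dataclasses actually generate.

```python
class HandRolledFrozen:
    """Hypothetical hand-written frozen class, for illustration only."""

    def __init__(self, a):
        # The class's own __setattr__ always raises, so even __init__
        # has to go through object.__setattr__ to store the value.
        object.__setattr__(self, "a", a)

    def __setattr__(self, name, value):
        raise AttributeError(f"cannot assign to {name!r}")

    def __delattr__(self, name):
        raise AttributeError(f"cannot delete {name!r}")


obj = HandRolledFrozen(1)

blocked = False
try:
    obj.a = 3  # goes through HandRolledFrozen.__setattr__ and fails
except AttributeError:
    blocked = True

# Calling the base implementation directly skips the override.
object.__setattr__(obj, "a", 100)
print(blocked, obj.a)  # True 100
```

The protection lives in the class's overridden method, not in the object itself, which is why calling the base implementation sidesteps it.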

The object class is the base class of every class in Python. So even though frozen_class.__setattr__ blocks assignment, you can bypass it by calling object.__setattr__ directly. In theory, you could also use this trick to achieve partial immutability with dataclasses.

@dataclass(frozen=True)
class FrozenDataClass:
    a: int
    b: int

    def __post_init__(self):
        object.__setattr__(self, 'c', self.a + self.b)

frozen = FrozenDataClass(1,2)
frozen.a, frozen.b, frozen.c
(1, 2, 3)
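A small variation, sketched below, declares c with field(init=False) so that it becomes a real field and takes part in the generated __repr__, __eq__ and __hash__. The class name FrozenWithDerived is made up for this example.

```python
from dataclasses import FrozenInstanceError, dataclass, field


@dataclass(frozen=True)
class FrozenWithDerived:
    a: int
    b: int
    c: int = field(init=False)  # derived, so not accepted by __init__

    def __post_init__(self):
        # Bypass the frozen __setattr__ once, during initialisation.
        object.__setattr__(self, "c", self.a + self.b)


obj = FrozenWithDerived(1, 2)

blocked = False
try:
    obj.c = 10  # still blocked after initialisation
except FrozenInstanceError:
    blocked = True

print(obj.c, blocked)  # 3 True
```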

Derived Attribute + Immutability

We learnt that a frozen dataclass doesn’t work well with derived attributes. The need is common, and it is probably easier to meet with the good old @property. Does that mean dataclasses are not useful? This is something that I found unclear when reading through the docs. Luckily attrs has a solution to this too:

from attrs import define, field

@define(frozen=True)
class FrozenDerivedAttrs:
    a: int
    b: int
    c: int = field(init=False)

    @c.default
    def _default_value(self):
        return self.a + self.b

obj = FrozenDerivedAttrs(1,2)
obj.c
3

The decorator-based default above is the more natural way of writing a Python class, but there is another approach that is usually easier to test. Essentially, you use a factory method to produce an immutable instance.

@define(frozen=True)
class FrozenDerivedAttrs:
    a: int
    b: int
    derived: int

    @classmethod
    def from_args(cls, a, b):
        return cls(a, b, a + b)

obj = FrozenDerivedAttrs.from_args(1,2)
obj.derived
3
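For comparison, the @property route mentioned at the start of this section could look like the sketch below with a frozen dataclass. The class name FrozenWithProperty is made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FrozenWithProperty:
    a: int
    b: int

    @property
    def c(self) -> int:
        # Recomputed on every access; nothing is stored on the instance.
        return self.a + self.b


obj = FrozenWithProperty(1, 2)

blocked = False
try:
    obj.c = 5  # rejected by the frozen __setattr__
except AttributeError:
    blocked = True

print(obj.c, blocked)  # 3 True
```

The trade-off is that c is not a field, so it does not appear in the generated __repr__ or __eq__, and the value is recalculated each time it is read.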

Conclusion

attrs offers a lot more flexibility compared to dataclasses: frozen classes, frozen fields, derived attributes and combinations of them (there is a lot more; you should check out attrs by Example). You may be able to achieve similar things by using the object.__setattr__ trick, but I’d argue that if you are trying so hard to fight the library, you probably shouldn’t use it. Writing classes with attrs felt slightly different in the beginning, but it also teaches you how you should write your data classes in the long run.