Combine multiple dict factories in asdict

Python data classes were introduced in [PEP 557](https://peps.python.org/pep-0557/). They let us model data more cleanly than the commonly used dicts. Unlike instances of classic Python classes, data class instances are easy to convert to a dictionary: you can use the asdict function from the dataclasses module.

Asdict

from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class DataclassExample:
    field1: str
    field2: int

my_instance = DataclassExample("str", 42)

print(asdict(my_instance))

# => {'field1': 'str', 'field2': 42}

If we add a datetime field, asdict keeps the datetime object as it is:

from dataclasses import dataclass, asdict
from datetime import datetime
import json


@dataclass(frozen=True)
class DataclassExample:
    field1: str
    field2: int
    field3: datetime

my_instance = DataclassExample("str", 42, datetime(2022, 2, 14))

print(asdict(my_instance))

# => {'field1': 'str', 'field2': 42, 'field3': datetime.datetime(2022, 2, 14, 0, 0)}

The problem starts when we want to serialize our data class to JSON:

json.dumps(asdict(my_instance))

This raises TypeError: Object of type datetime is not JSON serializable. We could fix it by hand, either by patching the resulting dictionary or by writing our own function that converts instances to dictionaries, but that is a lot of work for a single field. There must be a better way!

Fortunately, asdict accepts a dict_factory keyword argument. It is a callable that receives a list of (field name, value) tuples and should return a dictionary. For example, if we want to convert all datetime fields to timestamps, we can use a function like this:

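from typing import Any, Dict, List, Tuple


def datetimes_as_timestamps_factory(data: List[Tuple[str, Any]]) -> Dict[str, Any]:
    # asdict passes (field name, value) pairs; convert datetimes, keep the rest
    return {
        field: value.timestamp() if isinstance(value, datetime) else value
        for field, value in data
    }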
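To see what a dict_factory actually receives, here is a minimal sketch with a factory that records its input and otherwise behaves like dict. The Inner and Outer classes and the recording_factory name are made up for this illustration. The example suggests that asdict calls the factory once per data class instance, starting with the nested ones.

```python
from dataclasses import asdict, dataclass
from typing import Any, Dict, List, Tuple


# Hypothetical data classes, used only for this illustration.
@dataclass(frozen=True)
class Inner:
    x: int


@dataclass(frozen=True)
class Outer:
    name: str
    inner: Inner


calls: List[List[Tuple[str, Any]]] = []


def recording_factory(data: List[Tuple[str, Any]]) -> Dict[str, Any]:
    # Remember the raw input, then behave like the default factory (dict).
    calls.append(list(data))
    return dict(data)


result = asdict(Outer("a", Inner(1)), dict_factory=recording_factory)

for call in calls:
    print(call)
# [('x', 1)]
# [('name', 'a'), ('inner', {'x': 1})]
```

Because the nested instance is converted first, a factory that transforms values is applied at every level of a nested data class.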

We can pass datetimes_as_timestamps_factory to asdict like this:

my_instance = DataclassExample("str", 42, datetime(2022, 2, 14))

def datetimes_as_timestamps_factory(data: List[Tuple[str, Any]]):
    return {
        field: value.timestamp() if isinstance(value, datetime) else value
        for field, value in data
        }

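print(json.dumps(asdict(my_instance, dict_factory=datetimes_as_timestamps_factory)))

# => {"field1": "str", "field2": 42, "field3": 1644793200.0}
# (the datetime is naive, so the timestamp depends on the local time zone; this value corresponds to UTC+1)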

Yay! No more TypeError!
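One caveat: the datetime above is naive, so timestamp() interprets it in the machine's local time zone and the number can differ between machines. If that matters, a time-zone-aware datetime gives a stable result. The sketch below assumes UTC is the intended zone; AwareExample is a made-up class name.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone
from typing import Any, Dict, List, Tuple


@dataclass(frozen=True)
class AwareExample:  # hypothetical class, for illustration only
    field3: datetime


def datetimes_as_timestamps_factory(data: List[Tuple[str, Any]]) -> Dict[str, Any]:
    return {
        field: value.timestamp() if isinstance(value, datetime) else value
        for field, value in data
    }


# tzinfo=timezone.utc makes the datetime aware, so the timestamp
# no longer depends on the local time zone.
aware = AwareExample(datetime(2022, 2, 14, tzinfo=timezone.utc))
serialized = json.dumps(asdict(aware, dict_factory=datetimes_as_timestamps_factory))
print(serialized)
# => {"field3": 1644796800.0}
```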

Now, if we add a new UUID field to our data class

from uuid import UUID


@dataclass(frozen=True)
class DataclassExample2:
    field1: str
    field2: int
    field3: datetime
    field4: UUID

We end up in a situation very similar to the one with datetime:

my_instance2 = DataclassExample2("str", 42, datetime(2022, 2, 14), UUID("46d7a9c2-e46c-4219-a148-c98339fa2808"))

# raises TypeError: Object of type UUID is not JSON serializable
print(json.dumps(asdict(my_instance2, dict_factory=datetimes_as_timestamps_factory)))

Creating a UUID factory is straightforward.

def uuid_as_strings_factory(data: List[Tuple[str, Any]]):
    return {
        field: str(value) if isinstance(value, UUID) else value
        for field, value in data
    }

But how can we pass a factory that converts both datetimes and UUIDs at once?

Combining asdict factories

The simplest way to combine factories is function composition. Unfortunately, our factories accept a list of tuples but return dictionaries, so the output of one cannot be fed directly into the next. A small trick makes them fit: calling .items() on the returned dictionary turns it back into (field, value) pairs.

my_instance2 = DataclassExample2("str", 42, datetime(2022, 2, 14), UUID("46d7a9c2-e46c-4219-a148-c98339fa2808"))
print(json.dumps(asdict(
    my_instance2,
    dict_factory=lambda x: datetimes_as_timestamps_factory(uuid_as_strings_factory(x).items()),
)))
# => {"field1": "str", "field2": 42, "field3": 1644793200.0, "field4": "46d7a9c2-e46c-4219-a148-c98339fa2808"}
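To make the reason for .items() concrete, here is a small sketch of what happens when a factory's dictionary output is passed straight into another factory. Iterating over a dict yields only its keys, so unpacking each item into (field, value) fails. The variable names are made up for this illustration.

```python
from typing import Any, Dict, Iterable, Tuple
from uuid import UUID


def uuid_as_strings_factory(data: Iterable[Tuple[str, Any]]) -> Dict[str, Any]:
    return {
        field: str(value) if isinstance(value, UUID) else value
        for field, value in data
    }


pairs = [("field1", "str"), ("field4", UUID("46d7a9c2-e46c-4219-a148-c98339fa2808"))]
as_dict = uuid_as_strings_factory(pairs)

# Passing the dict itself: iteration yields keys such as "field1",
# which cannot be unpacked into exactly two names.
direct_call_failed = False
try:
    uuid_as_strings_factory(as_dict)
except ValueError as error:
    direct_call_failed = True
    print("passing the dict directly fails:", error)

# .items() yields (key, value) pairs again, so the factory accepts it.
again = uuid_as_strings_factory(as_dict.items())
print(again == as_dict)  # True
```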

But applying the functions by hand, one by one, is cumbersome. Fortunately, there is a solution: we can use the reduce function from functools to combine our factories.

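import functools
from typing import Any, Callable, Dict, List, Tuple

def compose_factories(functions: List[Callable[[List[Tuple[str, Any]]], Dict[str, Any]]]):
    # the last factory in the list runs first; .items() turns its dict back into pairs
    return functools.reduce(lambda f, g: lambda x: f(g(x).items()), functions)

print(json.dumps(asdict(my_instance2, dict_factory=compose_factories([datetimes_as_timestamps_factory, uuid_as_strings_factory]))))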
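Without an initial value, functools.reduce raises a TypeError on an empty list. One possible variation, sketched below, starts the reduction from dict, which is the default factory of asdict, so composing zero factories still works. The name compose_factories_safe and the Pairs and Factory aliases are made up for this sketch, and the datetime is given a UTC time zone so that the printed timestamp does not depend on the machine.

```python
import functools
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone
from typing import Any, Callable, Dict, Iterable, Tuple
from uuid import UUID

Pairs = Iterable[Tuple[str, Any]]
Factory = Callable[[Pairs], Dict[str, Any]]


def datetimes_as_timestamps_factory(data: Pairs) -> Dict[str, Any]:
    return {f: v.timestamp() if isinstance(v, datetime) else v for f, v in data}


def uuid_as_strings_factory(data: Pairs) -> Dict[str, Any]:
    return {f: str(v) if isinstance(v, UUID) else v for f, v in data}


def compose_factories_safe(*factories: Factory) -> Factory:  # hypothetical name
    # dict is the initial value, so an empty call returns the default factory.
    return functools.reduce(lambda f, g: lambda x: f(g(x).items()), factories, dict)


@dataclass(frozen=True)
class DataclassExample2:
    field1: str
    field2: int
    field3: datetime
    field4: UUID


instance = DataclassExample2(
    "str",
    42,
    datetime(2022, 2, 14, tzinfo=timezone.utc),
    UUID("46d7a9c2-e46c-4219-a148-c98339fa2808"),
)

combined = compose_factories_safe(datetimes_as_timestamps_factory, uuid_as_strings_factory)
serialized = json.dumps(asdict(instance, dict_factory=combined))
print(serialized)
# => {"field1": "str", "field2": 42, "field3": 1644796800.0, "field4": "46d7a9c2-e46c-4219-a148-c98339fa2808"}

# With no factories at all, the result matches plain asdict.
unchanged = asdict(instance, dict_factory=compose_factories_safe())
print(unchanged == asdict(instance))  # True
```

The factories here convert disjoint types, so their order does not change the result. If two factories touched the same values, the rightmost one would run first.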

Conclusion

When converting your data classes to dictionaries, you can pass a custom dict_factory, which lets you transform the data according to your needs. With functools.reduce, you can combine a list of such factories to handle different data types. I hope this will come in handy!