Combine multiple dict factories in asdict
Python data classes were introduced in [PEP 557](https://peps.python.org/pep-0557/). They let us model data more cleanly than the commonly used dicts.
Unlike classic Python classes, data classes are easy to transform into a dictionary: you can use the asdict function from the dataclasses module.
Asdict
from dataclasses import dataclass, asdict
@dataclass(frozen=True)
class DataclassExample:
    field1: str
    field2: int
my_instance = DataclassExample("str", 42)
print(asdict(my_instance))
# => {'field1': 'str', 'field2': 42}
If we add a datetime field, asdict still works:
from dataclasses import dataclass, asdict
from datetime import datetime
import json
@dataclass(frozen=True)
class DataclassExample:
    field1: str
    field2: int
    field3: datetime
my_instance = DataclassExample("str", 42, datetime(2022, 2, 14))
print(asdict(my_instance))
# => {'field1': 'str', 'field2': 42, 'field3': datetime.datetime(2022, 2, 14, 0, 0)}
The problem starts when we want to serialize our data class to JSON:
json.dumps(asdict(my_instance))
This raises TypeError: Object of type datetime is not JSON serializable. We could patch the resulting dictionary by hand, or write our own method that converts an instance to a dictionary, but that is a lot of work for a single field. There must be a better way!
Fortunately, you can pass dict_factory to the asdict function as a keyword argument. It is a callable that accepts a list of (field name, value) tuples and should return a dictionary. For example, if we want to convert all datetime fields to timestamps, we can use a function like this:
from typing import Any, List, Tuple
def datetimes_as_timestamps_factory(data: List[Tuple[str, Any]]):
    return {
        field: value.timestamp() if isinstance(value, datetime) else value
        for field, value in data
    }
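To make the shape of the factory's input concrete, here is a small sketch that records what asdict hands to dict_factory. The names recording_factory, calls, Inner and Outer are made up for this illustration. It also shows that the factory is called once per data class instance, nested ones first.

```python
from dataclasses import dataclass, asdict
from typing import Any, Dict, List, Tuple

# Collects every argument the factory is called with.
calls: List[List[Tuple[str, Any]]] = []

def recording_factory(data: List[Tuple[str, Any]]) -> Dict[str, Any]:
    # Hypothetical helper: remember the raw input, then behave like dict().
    calls.append(list(data))
    return dict(data)

@dataclass(frozen=True)
class Inner:
    value: int

@dataclass(frozen=True)
class Outer:
    name: str
    inner: Inner

result = asdict(Outer("outer", Inner(1)), dict_factory=recording_factory)
print(calls)
# => [[('value', 1)], [('name', 'outer'), ('inner', {'value': 1})]]
print(result)
# => {'name': 'outer', 'inner': {'value': 1}}
```

Because the nested instance is converted first, any transformation in the factory applies at every nesting level.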
We can pass datetimes_as_timestamps_factory to asdict like this:
my_instance = DataclassExample("str", 42, datetime(2022, 2, 14))
def datetimes_as_timestamps_factory(data: List[Tuple[str, Any]]):
    return {
        field: value.timestamp() if isinstance(value, datetime) else value
        for field, value in data
    }
json.dumps(asdict(my_instance, dict_factory=datetimes_as_timestamps_factory))
# => {"field1": "str", "field2": 42, "field3": 1644793200.0}
# (the exact timestamp depends on your local time zone, because the datetime is naive)
Yay! No more TypeError!
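One caveat: timestamp() interprets a naive datetime in the machine's local time zone, so the number above can differ from one machine to another. A minimal sketch of a reproducible variant, assuming UTC is what you want; the class name AwareExample is invented for this example:

```python
from dataclasses import dataclass, asdict
from datetime import datetime, timezone
import json
from typing import Any, Dict, List, Tuple

def datetimes_as_timestamps_factory(data: List[Tuple[str, Any]]) -> Dict[str, Any]:
    return {
        field: value.timestamp() if isinstance(value, datetime) else value
        for field, value in data
    }

@dataclass(frozen=True)
class AwareExample:  # hypothetical class, used only in this sketch
    field1: str
    field3: datetime

# A time-zone-aware datetime gives the same timestamp on every machine.
aware_instance = AwareExample("str", datetime(2022, 2, 14, tzinfo=timezone.utc))
serialized = json.dumps(asdict(aware_instance, dict_factory=datetimes_as_timestamps_factory))
print(serialized)
# => {"field1": "str", "field3": 1644796800.0}
```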
Now, if we add a new UUID field to our data class:
from uuid import UUID
@dataclass(frozen=True)
class DataclassExample2:
    field1: str
    field2: int
    field3: datetime
    field4: UUID
We land in a situation very similar to the one we had with datetime:
my_instance2 = DataclassExample2("str", 42, datetime(2022, 2, 14), UUID("46d7a9c2-e46c-4219-a148-c98339fa2808"))
print(json.dumps(asdict(my_instance2, dict_factory=datetimes_as_timestamps_factory))) # raises TypeError: Object of type UUID is not JSON serializable
Creating a UUID factory is straightforward.
def uuid_as_strings_factory(data: List[Tuple[str, Any]]):
    return {
        field: str(value) if isinstance(value, UUID) else value
        for field, value in data
    }
But how can we pass a factory that converts both datetimes and UUIDs at the same time?
Combining asdict factories
The simplest way to combine factories is plain function composition. Unfortunately, our factories accept a list of tuples but return dictionaries, so the output of one cannot be fed directly into the next. A small trick fixes this: calling .items() on the returned dictionary turns it back into (field, value) pairs:
my_instance2 = DataclassExample2("str", 42, datetime(2022, 2, 14), UUID("46d7a9c2-e46c-4219-a148-c98339fa2808"))
json.dumps(asdict(my_instance2, dict_factory=lambda x: datetimes_as_timestamps_factory(uuid_as_strings_factory(x).items())))
# => {"field1": "str", "field2": 42, "field3": 1644793200.0, "field4": "46d7a9c2-e46c-4219-a148-c98339fa2808"}
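To see why the .items() call is needed, here is a short sketch of what happens without it. The variables pairs, as_dict, error and round_trip are names chosen for this illustration only.

```python
from typing import Any, Dict, List, Tuple
from uuid import UUID

def uuid_as_strings_factory(data: List[Tuple[str, Any]]) -> Dict[str, Any]:
    return {
        field: str(value) if isinstance(value, UUID) else value
        for field, value in data
    }

pairs = [("field1", "str"), ("field4", UUID("46d7a9c2-e46c-4219-a148-c98339fa2808"))]
as_dict = uuid_as_strings_factory(pairs)

# Iterating over a dictionary yields only its keys, so unpacking the
# string "field1" into (field, value) fails.
try:
    uuid_as_strings_factory(as_dict)
    error = None
except ValueError as exc:
    error = exc
print(type(error).__name__)
# => ValueError

# .items() yields (field, value) pairs again, which is what a factory expects.
round_trip = uuid_as_strings_factory(as_dict.items())
print(round_trip == as_dict)
# => True
```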
But it's cumbersome to go function by function and chain them manually. Fortunately, there is a solution: we can use the reduce function from functools to combine our factories:
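import functools
from typing import Any, Callable, Dict, List, Tuple
def compose_factories(functions: List[Callable[[List[Tuple[str, Any]]], Dict[str, Any]]]):
    # The last factory in the list runs first; each result is passed on as (field, value) pairs
    return functools.reduce(lambda f, g: lambda x: f(g(x).items()), functions)
print(json.dumps(asdict(my_instance2, dict_factory=compose_factories([datetimes_as_timestamps_factory, uuid_as_strings_factory]))))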
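Note that reduce without an initial value raises TypeError on an empty list. A possible variant, sketched here under the assumption that "no factories" should mean "plain dict", starts the reduction from dict. The names compose_factories_variadic, Event, Pairs and Factory are hypothetical, and the datetime is time-zone-aware so that the output is reproducible.

```python
import functools
import json
from dataclasses import dataclass, asdict
from datetime import datetime, timezone
from typing import Any, Callable, Dict, Iterable, Tuple
from uuid import UUID

Pairs = Iterable[Tuple[str, Any]]
Factory = Callable[[Pairs], Dict[str, Any]]

def datetimes_as_timestamps_factory(data: Pairs) -> Dict[str, Any]:
    return {k: v.timestamp() if isinstance(v, datetime) else v for k, v in data}

def uuid_as_strings_factory(data: Pairs) -> Dict[str, Any]:
    return {k: str(v) if isinstance(v, UUID) else v for k, v in data}

def compose_factories_variadic(*factories: Factory) -> Factory:
    # Hypothetical variant: using dict as the initial value means that
    # calling this with no factories returns dict, the default of asdict.
    return functools.reduce(lambda f, g: lambda x: f(g(x).items()), factories, dict)

@dataclass(frozen=True)
class Event:  # hypothetical class, used only in this sketch
    created: datetime
    event_id: UUID

event = Event(
    datetime(2022, 2, 14, tzinfo=timezone.utc),
    UUID("46d7a9c2-e46c-4219-a148-c98339fa2808"),
)
combined = compose_factories_variadic(datetimes_as_timestamps_factory, uuid_as_strings_factory)
serialized = json.dumps(asdict(event, dict_factory=combined))
print(serialized)
# => {"created": 1644796800.0, "event_id": "46d7a9c2-e46c-4219-a148-c98339fa2808"}

# With no factories at all, the values are left untouched.
plain = asdict(event, dict_factory=compose_factories_variadic())
```

Since each factory here only touches its own type, the order of the arguments does not change the result; it would matter if one factory produced values that another one converts.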
Conclusion
When converting your data classes to dictionaries, you can pass a custom factory that transforms the data according to your needs. With functools.reduce, you can combine a list of such factories to handle several data types at once.
I hope this will come in handy!