Benchmarks#

Measure principles#

These benchmarks aim to provide a complete, fair, and reliable comparison of different libraries across different versions of Python.

If you find a mistake in the benchmarking methods or want to add another library to the comparison, create a new issue.

All benchmarks are run via pyperf, an advanced library used to measure the performance of Python interpreters. It takes care of calibration, warm-up, and measurement.

To handle the large number of benchmark variations and make the pyperf API more convenient, a new internal framework was created. It adds no overhead and is intended only to orchestrate pyperf runs.

All measurements exclude the time required to initialize and generate the conversion function.
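
As a rough illustration of what excluding initialization means, the sketch below uses the standard-library `timeit` module rather than pyperf. The `make_loader` factory and the `Point` model are hypothetical stand-ins for a library's step of generating a conversion function; they are not part of the benchmark suite.

```python
import timeit
from dataclasses import dataclass


@dataclass
class Point:
    x: int
    y: int


def make_loader(cls):
    """Hypothetical stand-in for the costly step of building a conversion function."""
    def loader(data):
        return cls(**data)
    return loader


# Built once, outside the timed region, so its cost is not measured.
load_point = make_loader(Point)
sample = {"x": 1, "y": 2}

# Only calls to the already-built function are timed.
elapsed = timeit.timeit(lambda: load_point(sample), number=10_000)
print(f"10000 loads took {elapsed:.4f} s")
```

The same idea applies to any library that compiles or caches a converter on first use: the converter is created before timing starts, and only repeated conversions are measured.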

Each library is tested with different options that may affect performance.

All benchmarks listed below were produced with the following library versions:

Library	Used version	Last version
adaptix	`3.0.0a6`
cattrs	`23.1.2`
dataclass_factory	`2.16`
marshmallow	`3.20.1`
mashumaro	`3.10`
msgspec	`0.18.4`
pydantic	`2.4.2`
schematics	`2.1.1`

Benchmarks analysis#

Important

Serialization and deserialization libraries have many options that customize the conversion process. These options may greatly affect performance, but there is no way to benchmark every combination of them. So performance in your specific case may differ.

Simple Structures (loading)#

This benchmark examines the loading of basic structures natively supported by all the libraries.

The library has to produce models from a dict:

from dataclasses import dataclass
from typing import List


@dataclass
class Review:
    id: int
    title: str
    rating: float
    content: str  # renamed to 'text'


@dataclass
class Book:
    id: int
    name: str
    reviews: List[Review]  # contains 100 items

Source Code Raw data
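
For reference, the task can be sketched as a hand-written loader over the models above. This is not how any of the benchmarked libraries works internally; the `load_review` and `load_book` helpers and the sample values are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass
class Review:
    id: int
    title: str
    rating: float
    content: str  # stored under the key 'text' in the input dict


@dataclass
class Book:
    id: int
    name: str
    reviews: List[Review]


def load_review(data: Dict[str, Any]) -> Review:
    return Review(
        id=data["id"],
        title=data["title"],
        rating=data["rating"],
        content=data["text"],  # apply the 'text' -> 'content' renaming
    )


def load_book(data: Dict[str, Any]) -> Book:
    return Book(
        id=data["id"],
        name=data["name"],
        reviews=[load_review(item) for item in data["reviews"]],
    )


# Made-up input with 100 reviews, matching the shape described above.
raw_book = {
    "id": 1,
    "name": "Example book",
    "reviews": [
        {"id": i, "title": f"Review {i}", "rating": 4.5, "text": "Good"}
        for i in range(100)
    ],
}
book = load_book(raw_book)
```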

Cases description

adaptix

dt_all, dt_first, and dt_disable indicate that the debug_trail parameter of Retort is set to DebugTrail.ALL, DebugTrail.FIRST, or DebugTrail.DISABLE, respectively (doc)

sc indicates that the strict_coercion option of Retort is enabled (doc)

msgspec

strict indicates that the strict parameter of convert is enabled (doc)

no_gc indicates that the models have the gc option disabled (doc)

cattrs

dv indicates that the detailed_validation option of Converter is enabled (doc)

dataclass_factory

dp indicates that the debug_path parameter of Factory is set to True (doc)

mashumaro

lc indicates that the lazy_compilation flag of the model Config is enabled (doc)

pydantic

strict indicates that the strict parameter in model_config is enabled (doc)

Notes about implementation:

marshmallow cannot create an instance of a dataclass or another model, so a @post_load hook was used (doc)
msgspec cannot be built for PyPy

Simple Structures (dumping)#

This benchmark studies the dumping of basic structures natively supported by all the libraries.

The library has to convert the model instance to the dict used in the loading benchmark:

from dataclasses import dataclass
from typing import List


@dataclass
class Review:
    id: int
    title: str
    rating: float
    content: str  # renamed to 'text'


@dataclass
class Book:
    id: int
    name: str
    reviews: List[Review]  # contains 100 items

Source Code Raw data

Cases description

adaptix

dt_all, dt_first, and dt_disable indicate that the debug_trail parameter of Retort is set to DebugTrail.ALL, DebugTrail.FIRST, or DebugTrail.DISABLE, respectively (doc)

msgspec

no_gc indicates that the models have the gc option disabled (doc)

cattrs

dv indicates that the detailed_validation option of Converter is enabled (doc)

mashumaro

lc indicates that the lazy_compilation flag of the model Config is enabled (doc)

pydantic

strict indicates that the strict parameter in model_config is enabled (doc)

asdict

The standard library function dataclasses.asdict was used

Notes about implementation:

asdict does not support renaming, so the produced dict contains the original field name
msgspec cannot be built for PyPy
pydantic requires using the json mode of the model_dump method to produce a JSON-serializable dict (doc)
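
The note about asdict and renaming can be illustrated with a short sketch. The sample values and the manual renaming step are assumptions for illustration only; the benchmark itself leaves the asdict output unchanged.

```python
from dataclasses import asdict, dataclass


@dataclass
class Review:
    id: int
    title: str
    rating: float
    content: str  # the target dict of the other libraries uses the key 'text'


review = Review(id=1, title="Nice", rating=4.5, content="Good")

# asdict keeps the attribute name as the dict key.
plain = asdict(review)

# A hypothetical post-processing step that applies the renaming by hand.
renamed = dict(plain)
renamed["text"] = renamed.pop("content")
```

So the asdict case produces a slightly different dict than the other libraries, which is worth keeping in mind when comparing its timings.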

GitHub Issues (loading)#

This benchmark examines libraries using real-world examples. It processes a slice of a snapshot of CPython repository issues fetched via the GitHub REST API.

The library has to produce models from a dict:

Processed models

The original endpoint returns an array of objects. Some libraries have no sane way to process a list of models, so the root-level list is wrapped in a GetRepoIssuesResponse model.

These models represent most of the fields returned by the endpoint, but some data is skipped. For example, milestone is omitted because the CPython repo does not use it.

from dataclasses import dataclass
from datetime import datetime
from enum import Enum
from typing import List, Optional


class IssueState(str, Enum):
    OPEN = "open"
    CLOSED = "closed"


class StateReason(str, Enum):
    COMPLETED = "completed"
    REOPENED = "reopened"
    NOT_PLANNED = "not_planned"


class AuthorAssociation(str, Enum):
    COLLABORATOR = "COLLABORATOR"
    CONTRIBUTOR = "CONTRIBUTOR"
    FIRST_TIMER = "FIRST_TIMER"
    FIRST_TIME_CONTRIBUTOR = "FIRST_TIME_CONTRIBUTOR"
    MANNEQUIN = "MANNEQUIN"
    MEMBER = "MEMBER"
    NONE = "NONE"
    OWNER = "OWNER"


@dataclass
class SimpleUser:
    login: str
    id: int
    node_id: str
    avatar_url: str
    gravatar_id: Optional[str]
    url: str
    html_url: str
    followers_url: str
    following_url: str
    gists_url: str
    starred_url: str
    subscriptions_url: str
    organizations_url: str
    repos_url: str
    events_url: str
    received_events_url: str
    type: str
    site_admin: bool
    name: Optional[str] = None
    email: Optional[str] = None
    starred_at: Optional[datetime] = None


@dataclass
class Label:
    id: int
    node_id: str
    url: str
    name: str
    description: Optional[str]
    color: str
    default: bool


@dataclass
class Reactions:
    url: str
    total_count: int
    plus_one: int  # renamed to '+1'
    minus_one: int  # renamed to '-1'
    laugh: int
    confused: int
    heart: int
    hooray: int
    eyes: int
    rocket: int


@dataclass
class PullRequest:
    diff_url: Optional[str]
    html_url: Optional[str]
    patch_url: Optional[str]
    url: Optional[str]
    merged_at: Optional[datetime] = None


@dataclass
class Issue:
    id: int
    node_id: str
    url: str
    repository_url: str
    labels_url: str
    comments_url: str
    events_url: str
    html_url: str
    number: int
    state: IssueState
    state_reason: Optional[StateReason]
    title: str
    user: Optional[SimpleUser]
    labels: List[Label]
    assignee: Optional[SimpleUser]
    assignees: Optional[List[SimpleUser]]
    locked: bool
    active_lock_reason: Optional[str]
    comments: int
    closed_at: Optional[datetime]
    created_at: Optional[datetime]
    updated_at: Optional[datetime]
    author_association: AuthorAssociation
    reactions: Optional[Reactions] = None
    pull_request: Optional[PullRequest] = None
    body_html: Optional[str] = None
    body_text: Optional[str] = None
    timeline_url: Optional[str] = None
    body: Optional[str] = None


@dataclass
class GetRepoIssuesResponse:
    data: List[Issue]

Source Code Raw data
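
The wrapping of the root-level list can be sketched as follows. The `Issue` model is trimmed to two fields, and the `load_response` helper and sample values are assumptions for illustration; the benchmarked libraries do this conversion with their own machinery.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Any, Dict, List


class IssueState(str, Enum):
    OPEN = "open"
    CLOSED = "closed"


@dataclass
class Issue:  # trimmed to two fields for illustration
    id: int
    state: IssueState


@dataclass
class GetRepoIssuesResponse:
    data: List[Issue]


def load_response(payload: Dict[str, Any]) -> GetRepoIssuesResponse:
    return GetRepoIssuesResponse(
        data=[
            Issue(id=item["id"], state=IssueState(item["state"]))
            for item in payload["data"]
        ]
    )


# The endpoint returns a bare JSON array (sample values are made up).
raw_issues = [{"id": 1, "state": "open"}, {"id": 2, "state": "closed"}]

# Wrap the root-level list so that the input is a single dict for one model.
response = load_response({"data": raw_issues})
```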

Cases description

adaptix

dt_all, dt_first, and dt_disable indicate that the debug_trail parameter of Retort is set to DebugTrail.ALL, DebugTrail.FIRST, or DebugTrail.DISABLE, respectively (doc)

sc indicates that the strict_coercion option of Retort is enabled (doc)

msgspec

strict indicates that the strict parameter of convert is enabled (doc)

no_gc indicates that the models have the gc option disabled (doc)

cattrs

dv indicates that the detailed_validation option of Converter is enabled (doc)

dataclass_factory

dp indicates that the debug_path parameter of Factory is set to True (doc)

mashumaro

lc indicates that the lazy_compilation flag of the model Config is enabled (doc)

Notes about implementation:

marshmallow cannot create an instance of a dataclass or another model, so a @post_load hook was used (doc)
msgspec cannot be built for PyPy
pydantic strict mode accepts only enum instances for enum fields, so it cannot be used in this benchmark (doc)
cattrs cannot process datetime out of the box. The custom structure hook lambda v, tp: datetime.fromisoformat(v) was used. This function does not generate a descriptive error, so a production implementation could be slower.
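
The remark about error reporting in the cattrs hook can be seen with the standard library alone. The sketch below defines a function with the same shape as the hook quoted above and calls it directly; registering it with a cattrs converter is not shown, and the input strings are made up.

```python
from datetime import datetime, timezone


def structure_datetime(v, tp):
    """Same shape as the hook above: (value, target type) -> datetime."""
    return datetime.fromisoformat(v)


parsed = structure_datetime("2023-10-01T12:30:00+00:00", datetime)

# Invalid input surfaces as a bare ValueError that carries no information
# about which model field was being processed.
try:
    structure_datetime("not a date", datetime)
except ValueError as exc:
    error_message = str(exc)
```

A hook meant for production would typically catch this exception and re-raise it with more context, which adds work on every call.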

GitHub Issues (dumping)#

This benchmark examines libraries using real-world examples. It processes a slice of a snapshot of CPython repository issues fetched via the GitHub REST API.

The library has to convert the model instance to the dict used in the loading benchmark:

Processed models

The original endpoint returns an array of objects. Some libraries have no sane way to process a list of models, so the root-level list is wrapped in a GetRepoIssuesResponse model.

These models represent most of the fields returned by the endpoint, but some data is skipped. For example, milestone is omitted because the CPython repo does not use it.

The GitHub API distinguishes between nullable fields and optional fields. So default values must be omitted when dumping, but fields of type Optional[T] without a default must always be present.

from dataclasses import dataclass
from datetime import datetime
from enum import Enum
from typing import List, Optional


class IssueState(str, Enum):
    OPEN = "open"
    CLOSED = "closed"


class StateReason(str, Enum):
    COMPLETED = "completed"
    REOPENED = "reopened"
    NOT_PLANNED = "not_planned"


class AuthorAssociation(str, Enum):
    COLLABORATOR = "COLLABORATOR"
    CONTRIBUTOR = "CONTRIBUTOR"
    FIRST_TIMER = "FIRST_TIMER"
    FIRST_TIME_CONTRIBUTOR = "FIRST_TIME_CONTRIBUTOR"
    MANNEQUIN = "MANNEQUIN"
    MEMBER = "MEMBER"
    NONE = "NONE"
    OWNER = "OWNER"


@dataclass
class SimpleUser:
    login: str
    id: int
    node_id: str
    avatar_url: str
    gravatar_id: Optional[str]
    url: str
    html_url: str
    followers_url: str
    following_url: str
    gists_url: str
    starred_url: str
    subscriptions_url: str
    organizations_url: str
    repos_url: str
    events_url: str
    received_events_url: str
    type: str
    site_admin: bool
    name: Optional[str] = None
    email: Optional[str] = None
    starred_at: Optional[datetime] = None


@dataclass
class Label:
    id: int
    node_id: str
    url: str
    name: str
    description: Optional[str]
    color: str
    default: bool


@dataclass
class Reactions:
    url: str
    total_count: int
    plus_one: int  # renamed to '+1'
    minus_one: int  # renamed to '-1'
    laugh: int
    confused: int
    heart: int
    hooray: int
    eyes: int
    rocket: int


@dataclass
class PullRequest:
    diff_url: Optional[str]
    html_url: Optional[str]
    patch_url: Optional[str]
    url: Optional[str]
    merged_at: Optional[datetime] = None


@dataclass
class Issue:
    id: int
    node_id: str
    url: str
    repository_url: str
    labels_url: str
    comments_url: str
    events_url: str
    html_url: str
    number: int
    state: IssueState
    state_reason: Optional[StateReason]
    title: str
    user: Optional[SimpleUser]
    labels: List[Label]
    assignee: Optional[SimpleUser]
    assignees: Optional[List[SimpleUser]]
    locked: bool
    active_lock_reason: Optional[str]
    comments: int
    closed_at: Optional[datetime]
    created_at: Optional[datetime]
    updated_at: Optional[datetime]
    author_association: AuthorAssociation
    reactions: Optional[Reactions] = None
    pull_request: Optional[PullRequest] = None
    body_html: Optional[str] = None
    body_text: Optional[str] = None
    timeline_url: Optional[str] = None
    body: Optional[str] = None


@dataclass
class GetRepoIssuesResponse:
    data: List[Issue]

Source Code Raw data
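
The distinction between nullable and optional fields described above can be sketched with a hand-written dumper over the PullRequest model. The `dump_omitting_defaults` helper and the sample values are assumptions for illustration, not the approach of any benchmarked library.

```python
from dataclasses import MISSING, dataclass, fields
from datetime import datetime
from typing import Any, Dict, Optional


@dataclass
class PullRequest:
    diff_url: Optional[str]
    html_url: Optional[str]
    patch_url: Optional[str]
    url: Optional[str]
    merged_at: Optional[datetime] = None


def dump_omitting_defaults(obj: Any) -> Dict[str, Any]:
    result = {}
    for field in fields(obj):
        value = getattr(obj, field.name)
        has_default = field.default is not MISSING
        if has_default and value == field.default:
            continue  # optional field left at its default: omit the key
        if isinstance(value, datetime):
            value = value.isoformat()
        # Nullable fields without a default are kept even when they are None.
        result[field.name] = value
    return result


pr = PullRequest(
    diff_url=None,
    html_url="https://example.com/pr/1",
    patch_url=None,
    url=None,
)
dumped = dump_omitting_defaults(pr)
```

Here `diff_url`, `patch_url`, and `url` appear in the output with the value None, while `merged_at` is absent because it was left at its default.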

Cases description

adaptix

dt_all, dt_first, and dt_disable indicate that the debug_trail parameter of Retort is set to DebugTrail.ALL, DebugTrail.FIRST, or DebugTrail.DISABLE, respectively (doc)

msgspec

no_gc indicates that the models have the gc option disabled (doc)

cattrs

dv indicates that the detailed_validation option of Converter is enabled (doc)

mashumaro

lc indicates that the lazy_compilation flag of the model Config is enabled (doc)

pydantic

strict indicates that the strict parameter in model_config is enabled (doc)

asdict

The standard library function dataclasses.asdict was used

Notes about implementation:

asdict does not support renaming, so the produced dict contains the original field name
msgspec cannot be built for PyPy
pydantic requires using the json mode of the model_dump method to produce a JSON-serializable dict (doc)
cattrs cannot process datetime out of the box. The custom unstructure hook datetime.isoformat was used.
marshmallow cannot skip None values for specific fields out of the box. A @post_dump hook is used to remove these fields.