An introduction to Pydbantic — a single model solution to Data Verification & Storage

Joshua Jamison
ITNEXT
Published in
5 min readNov 15, 2021

--

Pydbantic inherits its’ name from pydantic, a library for “Data parsing and validation using Python type hints”.

In a nutshell, pydantic provides a framework for validating input between interfaces to ensure the correct input data( type, structure, required, optional) are met, eliminating the need to add logic to catch & verify bad input.

Pydantic takes care of ensuring data type / format correctness, but this data does not innately persist after creation. A common task of a developer is to somehow translate these models into a “Database Native” format to persist.

Pydantic provides many helpful methods for this serialization. But most pydantic models are not created equally. This means for each model, separate logic is often needed to translate a given field into the “Database Native” column type. Consequently, anytime data is serialized, it must be de-serialized again before it can be used again.

Where and how data is stored, is often as daunting as writing & managing serialization logic. The python ecosystem provides many choices, and there are generally no “wrong options”, but until recently each option required separate models for Data Validation & Database Tables.

Creating a Model

The best way to understand and appreciate the “Single Model” model, is to see it in action. Here is a simple model that maps some very basic information that a company might store about employees, which will continue to be expanded upon:

# Application Structure
├── app.py
└── models.py

The model can be used right away, just like any pydantic model, to begin to see its’ potential.

$ python app.pyid='0e2eeddd-bfa2-4e81-9d38-87b098039b3d' salary=20000.0 is_employed=True date_employed='2021-11-15T09:21:02.358015'

Up until now, the application has only created only an Employee instance, providing only salary and is_employed . The other fields are generated at instantiation using the provided “zero-argument” functions. This instance can be used just like any other pydantic model, with some additional methods available for persistence( if plugged into a database).

Saving a Model

├── app.py
├── db.py
├── models.py

The Employee model is connected to a database by first calling the Database.create( factory method, which connects the model to a new or existing database. If new, the table will be created, if existing, the table will be checked for modifications & migrated if needed.

There are effectively 3 ways to save a model in pydbantic , .insert(), .save(), and .create( two of the latter demonstrated above.

Querying a Model

├── app.py
├── company.db
├── db.py
├── models.py
$ python app.py[
Employee(id='0877c661-0d53-4435-a800-425b64c08c43', salary=100000.0, is_employed=True, date_employed='2021-11-15T09:37:51.105720'),
Employee(id='67c476d9-3830-434a-9dfe-8ad3cc914129', salary=100000.0, is_employed=True, date_employed='2021-11-15T09:37:51.118808')
]

Primary Key

$ python app.py
employee is id='0877c661-0d53-4435-a800-425b64c08c43' salary=100000.0 is_employed=True date_employed='2021-11-15T09:37:51.105720'

Filtering

$ python app.py[
Employee(id='0877c661-0d53-4435-a800-425b64c08c43', salary=100000.0, is_employed=True, date_employed='2021-11-15T09:37:51.105720'),
Employee(id='67c476d9-3830-434a-9dfe-8ad3cc914129', salary=100000.0, is_employed=True, date_employed='2021-11-15T09:37:51.118808')
]

Updating a Model

$ python app.py[
Employee(id='0877c661-0d53-4435-a800-425b64c08c43', salary=102000.0, is_employed=True, date_employed='2021-11-15T09:37:51.105720'),
Employee(id='67c476d9-3830-434a-9dfe-8ad3cc914129', salary=102000.0, is_employed=True, date_employed='2021-11-15T09:37:51.118808')
]

Updating a model is available via .save() or .update() method of a model instance.

Deleting data from Model

$ python app.py[]

Just like updates, deleting data from a model be done via an instance of a model via .delete() .

Model Relationships

A new DataBaseModel is defined here with name Positions to be associated with Employees via a new field position .

Take special note at the default value of position which will be used if left unspecified when an Employee is created AND for migrations when a field does not already exist.

$ python app.py11-15 10:23 pydb         WARNING  Migrations may be required. Will attempt in order: [<class 'models.Employee'>]11-15 10:23 pydb         WARNING  Migration Required: ['New Column: position'][
Employee(id='c61e1114-76fd-4356-a167-ffc802c7ad93', salary=100000.0, is_employed=True, date_employed='2021-11-15T11:23:03.047574', position=Positions(name='Manager', department='HR'))
]
[
Positions(name='Manager', department='HR')
]

At Employees.create()creation, the Positions(name=’Manager’, department=’HR’) was created, since no Positions object existed yet with name Manager .

Also note the extra logs before the application started. Migrations may be required. Will attempt in order: [<class 'models.Employee'>] This log and the following logs indicate the detected change to Model Employee along with the notice that a migration will take place.

The apt minded, may notice that Positions was never directly plugged into db.py . DatabaseModel objects with related DatabaseModel will ensure related tables are also plugged into the database at runtime. DatabaseModel definitions will need to be manually fed into Database creation if un-referenced by another DatabaseModel already fed into a Database .

Related Model Changes

In the above models example, a new BaseModel is defined to represent a new DataBaseModel Department location. The department field of the existing Positions model is replaced with positions_department with a default value of Department(name=’HR’, company=’FOOBAR’) .

$ python app.py11-15 12:38 pydb         WARNING  Migrations may be required. Will attempt in order: [<class 'models.Positions'>]11-15 12:38 pydb         WARNING  Migration Required: ['New Column: position_department', 'Deleted Column: department'][
Employee(id='c61e1114-76fd-4356-a167-ffc802c7ad93', salary=100000.0, is_employed=True, date_employed='2021-11-15T11:23:03.047574', position=Positions(name='Manager', position_department=Department(name='HR', company='FOOBAR', location=Coordinates(latitude=52.299387, longitude=4.976949))))
]
[
Positions(name='Manager', position_department=Department(name='HR', company='FOOBAR', location=Coordinates(latitude=52.299387, longitude=4.976949)))
]
[
Department(name='HR', company='FOOBAR', location=Coordinates(latitude=52.299387, longitude=4.976949))
]

At runtime, the new model was imported and added to the Database, migration requirements detected, and performed. All the while, the existing objects for Employee and Positions models continue to pull in related model changes.

Deleting Related Data

Special consideration on models should be observed when removing objects from related DataBaseModels before first updating / removing objects that relate.

In the above example, all Positions objects are deleted, while references to these deleted objects exist in Employees objects. The result of querying Employees is missing Positions data, i.e None . None type however is not an allowed type of model Employee attribute position .

The simple fix to this, is to allow NoneType .

Here the attribute position of Employee is annotated with Union[Positions, None] thus allowing a None value for the attribute. Using the existing Employee objects instances, new Positions objects can be assigned and updated if needed.

Caching with Pydbantic

Cache can be enabled by simply adding cache_enabled=True to Database.create() along with a defined redis_url .

When cache is enabled, db read responses are cached & automatically invalidated on changes.

Testing with Pydbantic

Testing is easy with Pydbantic, simply add testing=True to database.create() fixtures, and linked database tables & cache will be automatically cleared before starting.

Help make Pydbantic Better!

  • Enjoying the Library? Star & Fork me!
  • Have an Idea, Improvement, Bug, submit a issue here!

--

--