An introduction to Pydbantic — a single model solution to Data Verification & Storage
Pydbantic inherits its’ name from pydantic, a library for “Data parsing and validation using Python type hints”.
In a nutshell, pydantic provides a framework for validating input between interfaces to ensure the correct input data( type, structure, required, optional) are met, eliminating the need to add logic to catch & verify bad input.
Pydantic takes care of ensuring data type / format correctness, but this data does not innately persist after creation. A common task of a developer is to somehow translate these models into a “Database Native” format to persist.
Pydantic provides many helpful methods for this serialization. But most pydantic models are not created equally. This means for each model, separate logic is often needed to translate a given field into the “Database Native” column type. Consequently, anytime data is serialized, it must be de-serialized again before it can be used again.
Where and how data is stored, is often as daunting as writing & managing serialization logic. The python ecosystem provides many choices, and there are generally no “wrong options”, but until recently each option required separate models for Data Validation & Database Tables.
Creating a Model
The best way to understand and appreciate the “Single Model” model, is to see it in action. Here is a simple model that maps some very basic information that a company might store about employees, which will continue to be expanded upon:
# Application Structure
├── app.py
└── models.py
The model can be used right away, just like any pydantic model, to begin to see its’ potential.
$ python app.pyid='0e2eeddd-bfa2-4e81-9d38-87b098039b3d' salary=20000.0 is_employed=True date_employed='2021-11-15T09:21:02.358015'
Up until now, the application has only created only an Employee instance, providing only salary
and is_employed
. The other fields are generated at instantiation using the provided “zero-argument” functions. This instance can be used just like any other pydantic model, with some additional methods available for persistence( if plugged into a database).
Saving a Model
├── app.py
├── db.py
├── models.py
The Employee
model is connected to a database by first calling the Database.create(
factory method, which connects the model to a new or existing database. If new, the table will be created, if existing, the table will be checked for modifications & migrated if needed.
There are effectively 3 ways to save a model in pydbantic
, .insert()
, .save()
, and .create(
two of the latter demonstrated above.
Querying a Model
├── app.py
├── company.db
├── db.py
├── models.py
$ python app.py[
Employee(id='0877c661-0d53-4435-a800-425b64c08c43', salary=100000.0, is_employed=True, date_employed='2021-11-15T09:37:51.105720'),
Employee(id='67c476d9-3830-434a-9dfe-8ad3cc914129', salary=100000.0, is_employed=True, date_employed='2021-11-15T09:37:51.118808')
]
Primary Key
$ python app.py
employee is id='0877c661-0d53-4435-a800-425b64c08c43' salary=100000.0 is_employed=True date_employed='2021-11-15T09:37:51.105720'
Filtering
$ python app.py[
Employee(id='0877c661-0d53-4435-a800-425b64c08c43', salary=100000.0, is_employed=True, date_employed='2021-11-15T09:37:51.105720'),
Employee(id='67c476d9-3830-434a-9dfe-8ad3cc914129', salary=100000.0, is_employed=True, date_employed='2021-11-15T09:37:51.118808')
]
Updating a Model
$ python app.py[
Employee(id='0877c661-0d53-4435-a800-425b64c08c43', salary=102000.0, is_employed=True, date_employed='2021-11-15T09:37:51.105720'),
Employee(id='67c476d9-3830-434a-9dfe-8ad3cc914129', salary=102000.0, is_employed=True, date_employed='2021-11-15T09:37:51.118808')
]
Updating a model is available via .save()
or .update()
method of a model instance.
Deleting data from Model
$ python app.py[]
Just like updates, deleting data from a model be done via an instance of a model via .delete()
.
Model Relationships
A new DataBaseModel
is defined here with name Positions
to be associated with Employees
via a new field position
.
Take special note at the default value of position
which will be used if left unspecified when an Employee is created AND for migrations when a field does not already exist.
$ python app.py11-15 10:23 pydb WARNING Migrations may be required. Will attempt in order: [<class 'models.Employee'>]11-15 10:23 pydb WARNING Migration Required: ['New Column: position'][
Employee(id='c61e1114-76fd-4356-a167-ffc802c7ad93', salary=100000.0, is_employed=True, date_employed='2021-11-15T11:23:03.047574', position=Positions(name='Manager', department='HR'))
]
[
Positions(name='Manager', department='HR')
]
At Employees.create()
creation, the Positions(name=’Manager’, department=’HR’)
was created, since no Positions
object existed yet with name Manager
.
Also note the extra logs before the application started. Migrations may be required. Will attempt in order: [<class 'models.Employee'>]
This log and the following logs indicate the detected change to Model Employee
along with the notice that a migration will take place.
The apt minded, may notice that Positions
was never directly plugged into db.py
. DatabaseModel
objects with related DatabaseModel
will ensure related tables are also plugged into the database at runtime. DatabaseModel
definitions will need to be manually fed into Database
creation if un-referenced by another DatabaseModel
already fed into a Database
.
Related Model Changes
In the above models example, a new BaseModel
is defined to represent a new DataBaseModel
Department
location. The department
field of the existing Positions
model is replaced with positions_department
with a default value of Department(name=’HR’, company=’FOOBAR’)
.
$ python app.py11-15 12:38 pydb WARNING Migrations may be required. Will attempt in order: [<class 'models.Positions'>]11-15 12:38 pydb WARNING Migration Required: ['New Column: position_department', 'Deleted Column: department'][
Employee(id='c61e1114-76fd-4356-a167-ffc802c7ad93', salary=100000.0, is_employed=True, date_employed='2021-11-15T11:23:03.047574', position=Positions(name='Manager', position_department=Department(name='HR', company='FOOBAR', location=Coordinates(latitude=52.299387, longitude=4.976949))))
][
Positions(name='Manager', position_department=Department(name='HR', company='FOOBAR', location=Coordinates(latitude=52.299387, longitude=4.976949)))
]
[
Department(name='HR', company='FOOBAR', location=Coordinates(latitude=52.299387, longitude=4.976949))
]
At runtime, the new model was imported and added to the Database
, migration requirements detected, and performed. All the while, the existing objects for Employee
and Positions
models continue to pull in related model changes.
Deleting Related Data
Special consideration on models should be observed when removing objects from related DataBaseModels
before first updating / removing objects that relate.
In the above example, all Positions
objects are deleted, while references to these deleted objects exist in Employees
objects. The result of querying Employees
is missing Positions
data, i.e None
. None
type however is not an allowed type of model Employee
attribute position
.
The simple fix to this, is to allow NoneType
.
Here the attribute position
of Employee
is annotated with Union[Positions, None]
thus allowing a None value for the attribute. Using the existing Employee
objects instances, new Positions
objects can be assigned and updated if needed.
Caching with Pydbantic
Cache can be enabled by simply adding cache_enabled=True
to Database.create()
along with a defined redis_url
.
When cache is enabled, db read responses are cached & automatically invalidated on changes.
Testing with Pydbantic
Testing is easy with Pydbantic, simply add testing=True
to database.create()
fixtures, and linked database tables & cache will be automatically cleared before starting.