Towards Robust and Error-Free Python Code With Pydantic


Finally! Python autocomplete that actually works (like in C# and Java) and type hinting support.

Python is a dynamically typed language, which means that type checking is performed at run-time (when the code is executed). If there is a type error in the code, it is only raised when the offending line runs. Languages such as Java, C# and C are statically typed, meaning type checking is performed at compile-time, so the error is reported before the program is run.

In a statically typed language, the type of a variable cannot change once it is declared. The compiler needs to know the types beforehand. A variable declared as an int in C, for example, cannot be changed to a string later.

In Python, however, we can do this:

myVar = 1
myVar = "hello"  # this works
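The flip side of this flexibility is that a type mistake only surfaces when the offending line actually runs. A minimal sketch, reusing the myVar name from the snippet above (the addition is an illustrative assumption, not part of the original example):

```python
myVar = 1
myVar = "hello"  # rebinding to a different type is allowed

try:
    # Adding an int to a str is only rejected when this line executes.
    result = myVar + 1
except TypeError as err:
    message = str(err)
    print(f"TypeError raised at run-time: {message}")
```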

This flexibility comes at a cost: dynamically typed languages are generally slower at execution than statically typed ones. Much of the work of figuring out the types of variables and other constructs has to happen at run-time, while the program executes, which creates overhead.

Now that Python is the go-to language for machine learning, there is a growing need to develop APIs and web applications that serve machine learning models. It is a lot simpler to use a single language for creating models and wrapping them up in a user-facing application than to juggle several languages.
However, in these full-stack applications the chances of type errors reaching users increase, because type checking is performed at run-time rather than at compile-time. This is where Python type hinting helps. It allows us to declare the types of programming constructs directly in our code.

First, we look at the basics of type hints.

Python type hints and code completion

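To define a type hint for a function argument, we write a : (colon) followed by the type after the argument name. For container types such as lists, dicts and sets, we import List, Dict and Set from the typing module so that the element type can be declared too (from Python 3.9 onward, the built-in list, dict and set can be subscripted directly).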

Let’s look at some code:

We define a function get_stuff() that appends the provided item to the item list fridge.
Afterward, all items in the fridge are capitalized.
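The listing itself is not reproduced in this copy of the post. A minimal sketch consistent with the description might look like the following; the names get_stuff, fridge, toUpper and x come from the text, while the function body and the sample fridge contents are assumptions:

```python
from typing import List


def get_stuff(item: str, fridge: List[str]) -> List[str]:
    """Add item to the fridge and return every item capitalized."""
    fridge.append(item)

    def toUpper(x: str) -> str:
        # x is hinted as str, so the editor can suggest str methods here.
        return x.capitalize()

    return [toUpper(x) for x in fridge]


print(get_stuff("orange", ["apple", "grape", "pear"]))
```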

The code works as expected, returning the list of fruits:

['Apple', 'Grape', 'Pear', 'Orange']

Since we define fridge to be a list of strings, VS Code (with the Pylance and Python extensions) provides instant code completion. If you type fridge. you will see the suggestions pop up:

Type hinting code completion of list

Similarly, as we have defined fridge to be a list of strings, we can write x. to see all operations possible on every item in the fridge which is a string:

Type hinting code completion of string

As you can see, type hinting saves a ton of time as there is no need to go back and forth looking up methods and attributes from online documentation.

Pydantic Models

"Data validation and settings management using python type annotations. pydantic enforces type hints at runtime, and provides user friendly errors when data is invalid." (Source: Pydantic)

Although Python supports type hinting, the hints are not enforced at run-time. Passing an object of the wrong type is still possible and causes an error as soon as an unsupported operation is attempted, for example calling a str method on an int. Pydantic is a Python library that validates data against the type hints when a model is created, so such problems are caught (or the data is converted) up front.
Let’s see an example to consolidate this point.

Let’s say we get some bad input to our function and the fridge contains an int along with the strings.
The rest of the code remains unchanged and we call get_stuff() with the modified fridge:

print(get_stuff("orange", ["apple", 1, "pear"]))

What happens?

We get the following runtime error:

AttributeError: 'int' object has no attribute 'capitalize'

Even though we declared x to be of type str, the get_stuff() function happily accepts a list containing an int element, and toUpper() attempts to call capitalize() on the int object.
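The failure can be reproduced in isolation. The sketch below assumes a get_stuff() of the shape described earlier (its exact body is an assumption) and catches the error so that the script keeps running:

```python
from typing import List


def get_stuff(item: str, fridge: List[str]) -> List[str]:
    fridge.append(item)
    # Hints are not checked at run-time, so an int can reach capitalize().
    return [x.capitalize() for x in fridge]


try:
    get_stuff("orange", ["apple", 1, "pear"])
    error_name = None
except AttributeError as err:
    error_name = type(err).__name__
    print(f"{error_name}: {err}")
```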

At this point it may seem like the benefits of type hinting are limited to autocompletion only.

We can refactor the code to use Pydantic. We define a data model that inherits from a Pydantic BaseModel. This is the main way to create data models in Pydantic.
Since this is a blueprint for how our data should be represented, we define it as a class.

Go ahead and install Pydantic with:

pip3 install pydantic

Then define a Fridge class that inherits from BaseModel like so:

from typing import List

from pydantic import BaseModel

class Fridge(BaseModel):
    items: List[str]

We give the Fridge class an attribute called items which will be a list of strings.
We create an instance of a Fridge and pass it as an argument when we call the get_stuff() function.
The refactored code looks as follows:
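The listing is not reproduced in this copy, so here is a sketch of what the refactoring could look like. The body of get_stuff() and the sample data are assumptions; the set literal mirrors the input discussed in the text. How strictly Pydantic coerces an int into a str depends on the installed major version, so the call is wrapped to report a validation failure instead of crashing:

```python
from typing import List

from pydantic import BaseModel, ValidationError


class Fridge(BaseModel):
    items: List[str]


def get_stuff(item: str, fridge: Fridge) -> List[str]:
    fridge.items.append(item)
    return [x.capitalize() for x in fridge.items]


try:
    # A set containing an int: validated against List[str] on construction.
    fridge = Fridge(items={"apple", 1, "pear"})
    print(get_stuff("orange", fridge))
except ValidationError as err:
    # Reached when the installed Pydantic refuses to coerce int to str.
    print(err)
```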

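If we now run it again, the code is error free! (This relies on the lenient coercion of Pydantic v1; Pydantic v2 does not convert an int to a str by default and raises a ValidationError instead.)
The int is cast to a string and kept in the list, giving the following return value: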

['1', 'Apple', 'Pear', 'Orange']

You will also notice that we pass a Python set instead of a list when we create the Fridge instance. Here again, Pydantic takes care of converting the set to a list! Since sets are unordered, the position of '1' in the output can vary between runs.
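Conversion has limits. When a value cannot be turned into the declared type, Pydantic raises a ValidationError that points at the offending entry. A minimal sketch, assuming the Fridge model above and a hypothetical None entry as the bad value:

```python
from typing import List

from pydantic import BaseModel, ValidationError


class Fridge(BaseModel):
    items: List[str]


try:
    Fridge(items=["apple", None])  # None is not a valid str
    failed = False
except ValidationError as err:
    failed = True
    # err.errors() returns one dict per problem, including its location.
    locations = [e["loc"] for e in err.errors()]
    print(locations)
```

Catching the exception at the boundary where data enters the program keeps the rest of the code free of defensive type checks.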

You might be wondering what should be done if we do wish to have a list of mixed types such as a list that contains either strings or integers. For that we can use the Union type annotation which acts like a logical OR.
For example the Fridge would be as follows:

class Fridge(BaseModel):
    items: List[Union[int, str]]

Passing the following list to Fridge would now work:

[1, "apple", "orange", "pear"]
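As a quick check, the sketch below builds such a Fridge and inspects the element types afterwards. The sample list is the one from the text; the types variable is only there for illustration:

```python
from typing import List, Union

from pydantic import BaseModel


class Fridge(BaseModel):
    items: List[Union[int, str]]


fridge = Fridge(items=[1, "apple", "orange", "pear"])
# Each element keeps its original type: the int stays an int.
types = [type(x).__name__ for x in fridge.items]
print(types)
```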

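Please note that Pydantic v1 tries the types in a Union from left to right and uses the first one that succeeds. (Pydantic v2 defaults to a "smart" mode that prefers an exact type match, so the order matters less there.) So if we had instead written: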

class Fridge(BaseModel):
    items: List[Union[str, int]]

Then, under left-to-right matching as in Pydantic v1, the int in the passed list would be cast to a string even though int appears in the type annotation. This would give the following, which is not what we want:

["1", "apple", "orange", "pear"]

Ok we have covered a lot of ground!

Pydantic really shines when it comes to modelling more complex data types.
For that we need to look at recursive models.

Recursive Models

It is also possible to define recursive models in Pydantic for more complex data models.
A recursive model is a model that contains another model as a type definition in one of its attributes.
So instead of List[str] we could have List[Cars] where Cars would be a Pydantic model defined in our code. 

Onto another example!

Let’s assume we also want to store the number of each fruit in the fridge. To do this, we create a Fruit data model:

class Fruit(BaseModel):
    name: str
    num: int

In the Fridge data model we can define the list to be a list of Fruits instead of a list of strings:

class Fridge(BaseModel):
    items: List[Fruit]
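Because Fridge refers to Fruit, validation is applied at both levels. A minimal sketch, assuming plain dicts as the incoming data (the sample values are illustrative):

```python
from typing import List

from pydantic import BaseModel


class Fruit(BaseModel):
    name: str
    num: int


class Fridge(BaseModel):
    items: List[Fruit]


# Nested dicts are validated and converted into Fruit instances.
fridge = Fridge(items=[{"name": "apple", "num": 3}, {"name": "pear", "num": 5}])
first = fridge.items[0]
print(type(first).__name__, first.name, first.num)
```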

The full code is as follows:

We call get_most_fruits() with a Fridge object containing a list of Fruit objects. Pretty straightforward.

We wish to return the fruit with the highest number. Before operating on the list of fruit, we use the jsonable_encoder() function (available from FastAPI as fastapi.encoders.jsonable_encoder) to convert the list into a JSON-compatible type. Without this step, each element in the list would be a Fruit instance rather than a dict, so its fields would have to be read as attributes (for example fruit.num) instead of by key.

After the encoding stage, we get a list of dict objects with key, value pairs corresponding to the name and num fields defined in the Fruit class.

We can now sort this list and return the fruit with the highest number.
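The original listing is not shown in this copy. As a rough sketch of the same idea, the version below skips the encoding step and reads the num attribute directly from each Fruit. This is a swapped-in simplification rather than the approach described above, and the body of get_most_fruits() and the sample data are assumptions:

```python
from typing import List

from pydantic import BaseModel


class Fruit(BaseModel):
    name: str
    num: int


class Fridge(BaseModel):
    items: List[Fruit]


def get_most_fruits(fridge: Fridge) -> Fruit:
    """Return the fruit with the highest count."""
    # Sort by count, highest first, then take the first element.
    ranked = sorted(fridge.items, key=lambda fruit: fruit.num, reverse=True)
    return ranked[0]


fridge = Fridge(items=[Fruit(name="apple", num=3), Fruit(name="pear", num=5)])
most = get_most_fruits(fridge)
print(most.name, most.num)
```

Working with the model instances keeps editor autocompletion on fruit.name and fruit.num, which is lost once the data is converted to plain dicts.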

Conclusion

In this post we recapped dynamically and statically typed languages.
We looked at type hinting in Python and the use of Pydantic to enforce the type hints.

To conclude, type hinting helps:

  • Speed up software development through IDE autocompletion.
  • Contribute to increased code quality by making code easier to understand and read.
  • Improve coding style and overall software design.
  • Create more robust and error-free software by reducing run-time errors,
    especially in large and complex software projects.

Hope you learned something useful in this post.
Next time we will look at FastAPI, a popular Python web framework that fully supports Pydantic.
