Deploying a PyTorch model
As mentioned in the deploying models guide, deploying a model via Verta Inference is a two-step process: (1) create an endpoint, and (2) update the endpoint with a model.
This tutorial explains how Verta Inference can be used to deploy a PyTorch model.
1. Create an endpoint
Users can create an endpoint using Client.create_endpoint() as follows:
2. Update the endpoint with an RMV
As discussed in the Catalog Overview, there are multiple ways to create a Registered Model Version (RMV) for a PyTorch model.
First, given a PyTorch model object, users can use the PyTorch convenience functions to create a Verta Standard Model.
Alternatively, a serialized PyTorch saved model can be used as an artifact in a model that extends VertaModelBase.
Note that the input and output of the predict function must be JSON serializable. For the full list of acceptable data types for model I/O, refer to the VertaModelBase documentation.
Prior to deploying, don't forget to test your model class locally as follows.
To ensure that the requirements specified in the model version are in fact adequate, you may build the model container locally or as part of a continuous integration system. You may also deploy the model and make test predictions as shown below.
Regardless of how the Registered Model Version was created, the endpoint defined above can now be updated with it, and we can make predictions against the endpoint.
The full code for this tutorial can be found here.