Deploying models
Verta can either deploy models using the Verta Inference system or export models registered within Verta for use in other systems (e.g., Spark, SageMaker). This tutorial focuses on deploying models using the Verta Deployment infrastructure.
The key concept in Verta for model deployment is an Endpoint. An endpoint is a URL where a deployed model becomes available for use. Deploying a model is therefore a two-step process:
Create an endpoint
Update the endpoint with a model
We'll look at these in turn.
1. Create an endpoint
Users can create an endpoint using Client.create_endpoint().
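A minimal sketch of this step is shown here. It assumes `client` is an already connected `verta.Client`; the helper name `create_endpoint_at` and the path `/some-path` are placeholders chosen for illustration, not part of the Verta API.

```python
def create_endpoint_at(client, path="/some-path"):
    """Create an endpoint at `path` and return the endpoint object.

    `client` is assumed to be a connected verta.Client. The default path
    is only a placeholder.
    """
    return client.create_endpoint(path=path)


# Usage (requires a reachable Verta instance and valid credentials):
#   from verta import Client
#   client = Client()  # assumes host and credentials are configured
#   endpoint = create_endpoint_at(client, "/some-path")
```

The returned endpoint object is what the next step updates with a model.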
2. Update the endpoint with a model
Once an endpoint has been created, we update the endpoint with a deployed model. Verta supports two common paths to deploy models:
Deploy a Registered Model Version
Deploy an Experiment Run (typically for testing)
Deploy via a Registered Model Version
The Verta Model Registry is a staging area for models that are to be deployed into production. As a result, deploying Registered Model Versions is the recommended way to deploy models in a production setting.
To update an endpoint, use the update method with a model version. The wait=True parameter indicates that the call should not return until the update has been successfully applied.
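The update described above might look like the following sketch. It assumes `client` is a connected `verta.Client` and `endpoint` was returned by `Client.create_endpoint()`; the helper name `deploy_model_version` and the model and version names in the usage comment are hypothetical.

```python
def deploy_model_version(client, endpoint, model_name, version_name):
    """Look up a Registered Model Version and deploy it to `endpoint`.

    Assumes a registered model called `model_name` with a version called
    `version_name` already exists in the registry.
    """
    registered_model = client.get_registered_model(name=model_name)
    model_version = registered_model.get_version(name=version_name)
    # wait=True blocks until the update has been applied to the endpoint.
    endpoint.update(model_version, wait=True)
    return model_version


# Usage (names are placeholders):
#   deploy_model_version(client, endpoint, "my-model", "v0")
```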
Once an endpoint has finished updating, you can make REST calls to the endpoint in your language of choice. The Verta Python library provides convenience functions for making predictions against the endpoint.
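As a sketch of the Python convenience path, the helper below obtains a deployed-model handle from the endpoint and sends it a batch of inputs. The helper name `predict_with_endpoint` is hypothetical, and the expected shape of `rows` depends on the model that was deployed.

```python
def predict_with_endpoint(endpoint, rows):
    """Send `rows` to the model behind `endpoint` and return its output.

    `endpoint` is assumed to have finished updating. The input format is
    whatever the deployed model's predict logic accepts.
    """
    deployed_model = endpoint.get_deployed_model()
    return deployed_model.predict(rows)


# Usage (x is a single input example for the deployed model):
#   predictions = predict_with_endpoint(endpoint, [x])
```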
Deploy via an Experiment Run
Experiment Runs can be deployed in much the same way as Registered Model Versions (RMVs), but without the same safeguards. Experiment Run deployment is therefore recommended only for interactive testing scenarios.
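For interactive testing, the same update call can be pointed at an Experiment Run instead of a model version. The sketch below assumes `client` is a connected `verta.Client`, `endpoint` already exists, and the run has a model logged to it; the helper name `deploy_experiment_run` and the run ID in the usage comment are placeholders.

```python
def deploy_experiment_run(client, endpoint, run_id):
    """Fetch an Experiment Run by ID and deploy it to `endpoint`.

    Intended for interactive testing only; production deployments should
    go through a Registered Model Version.
    """
    run = client.get_experiment_run(id=run_id)
    # wait=True blocks until the update has been applied to the endpoint.
    endpoint.update(run, wait=True)
    return run


# Usage (the ID is a placeholder):
#   deploy_experiment_run(client, endpoint, "<experiment-run-id>")
```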