Verta can either deploy models using the Verta Inference system or export models registered within Verta for use in other systems (e.g., Spark, SageMaker). This tutorial focuses on deploying models using the Verta Deployment infrastructure.
The key concept in Verta for model deployment is an Endpoint. An endpoint is a URL where a deployed model becomes available for use. Deploying a model is therefore a two-step process:
1. Create an endpoint
2. Update the endpoint with a model
We'll look at these in turn.
Users can create an endpoint using Client.create_endpoint() as follows:
endpoint = client.create_endpoint(path="/some-path")
Once an endpoint has been created, we update the endpoint with a deployed model. Verta supports two common paths to deploy models:
Deploy a Registered Model Version
Deploy Experiment Run (typically for testing)
The Verta Model Registry is a staging area for models that are to be deployed into production. As a result, deploying Registered Model Versions is the recommended way to deploy models in a production setting.
To update an endpoint, use the update method with a model version. The wait=True parameter indicates that the call should not return until the update has been successfully applied.
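The two steps can be combined into a small helper. This is a minimal sketch, not part of the Verta library: the function name deploy_to_endpoint is hypothetical, and it assumes that client is an already connected Verta client and that model_version is a Registered Model Version you have already retrieved from the registry. It uses only the create_endpoint(path=...) and update(..., wait=True) calls described above.

```python
def deploy_to_endpoint(client, path, model_version):
    """Create an endpoint at `path` and deploy `model_version` to it.

    Hypothetical helper: `client` is assumed to be a connected Verta
    client and `model_version` a Registered Model Version object.
    """
    # Step 1: create the endpoint at the given path.
    endpoint = client.create_endpoint(path=path)

    # Step 2: update the endpoint with the model version.
    # wait=True blocks until the update has been applied.
    endpoint.update(model_version, wait=True)

    return endpoint
```

Because the helper only returns after the update call completes, the returned endpoint can be queried right away. Note that it always creates a new endpoint, so it is a sketch for the first deployment to a given path rather than for later updates.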
Once an endpoint has finished updating, you can make REST calls to the endpoint via your language of choice. The Verta Python library provides convenience functions to enable making predictions against the endpoint.
deployed_model = endpoint.get_deployed_model()
deployed_model.predict(some_test_input)
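For callers that do not use the Verta Python library, a REST call can be sketched with the Python standard library alone. The following is an illustration under stated assumptions, not the documented Verta request format: it assumes the endpoint accepts a JSON body via HTTP POST, the function name build_prediction_request is hypothetical, the URL shown in the usage note is a placeholder, and any authentication headers your deployment requires must be supplied through extra_headers.

```python
import json
import urllib.request


def build_prediction_request(endpoint_url, payload, extra_headers=None):
    """Build (but do not send) a JSON POST request for a deployed endpoint.

    Assumption: the endpoint accepts a JSON-encoded body via POST.
    `extra_headers` is a hypothetical hook for deployment-specific
    headers such as authentication.
    """
    headers = {"Content-Type": "application/json"}
    if extra_headers:
        headers.update(extra_headers)

    # Serialize the input to JSON bytes for the request body.
    body = json.dumps(payload).encode("utf-8")

    return urllib.request.Request(
        endpoint_url, data=body, headers=headers, method="POST"
    )
```

To send the request, pass it to urllib.request.urlopen and decode the response, for example: request = build_prediction_request("https://example.com/some-path", some_test_input), followed by urllib.request.urlopen(request). Keeping request construction separate from sending makes the payload and headers easy to inspect before any network call is made.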
Experiment Runs can be deployed in much the same way as Registered Model Versions (RMVs), but Experiment Run deployment does not have the same safeguards as RMV deployment. Experiment Run deployment is therefore recommended only for interactive testing scenarios.