Deploying models

Verta can either deploy models using the Verta Inference system or export models registered within Verta for use in other systems (e.g., Spark, SageMaker). This tutorial focuses on deploying models using the Verta Deployment infrastructure.

The key concept in Verta for model deployment is the Endpoint. An endpoint is a URL where a deployed model becomes available for use. Deploying a model is therefore a two-step process:

  1. Create an endpoint

  2. Update the endpoint with a model

We'll look at these in turn.

1. Create an endpoint

Users can create an endpoint using Client.create_endpoint() as follows:

endpoint = client.create_endpoint(path="/some-path")

2. Update the endpoint with a model

Once an endpoint has been created, we update the endpoint with a deployed model. Verta supports two common paths to deploy models:

  • Deploy a Registered Model Version

  • Deploy an Experiment Run (typically for testing)

Deploy via a Registered Model Version

The Verta Model Registry is a staging area for models that are to be deployed into production. As a result, deploying Registered Model Versions is the recommended way to deploy models in a production setting.

To update an endpoint, use the update method with a model version. The wait=True parameter indicates that the call should not return until the update has been successfully applied.

endpoint.update(model_version, wait=True)

Once an endpoint has finished updating, you can make REST calls to the endpoint from your language of choice. The Verta Python library also provides convenience functions for making predictions against the endpoint.

deployed_model = endpoint.get_deployed_model()
deployed_model.predict(some_test_input)
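
Putting the two steps together, a minimal sketch might look like the following. The helper name `deploy_model_version` is hypothetical and not part of the Verta library; it assumes `client` is an already-connected Verta client and `model_version` is a Registered Model Version obtained elsewhere. It only wraps the calls shown above.

```python
def deploy_model_version(client, path, model_version):
    """Create an endpoint at `path`, deploy `model_version`, and return
    a handle for making predictions.

    Hypothetical helper: it only wraps the calls shown in this tutorial.
    """
    # Step 1: create the endpoint (the URL path where the model is served).
    endpoint = client.create_endpoint(path=path)

    # Step 2: update the endpoint with the model version.
    # wait=True blocks until the update has been applied.
    endpoint.update(model_version, wait=True)

    # Convenience handle for sending prediction requests to the endpoint.
    return endpoint.get_deployed_model()
```

A call such as `deploy_model_version(client, "/some-path", model_version).predict(some_test_input)` would then be equivalent to running the snippets above in sequence.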

Deploy via an Experiment Run

While Experiment Runs can be deployed in much the same way as Registered Model Versions (RMVs), doing so does not offer the same safeguards. Experiment Run deployment is therefore recommended only for interactive testing.
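
For a quick interactive check, a sketch along the following lines may be enough. It assumes that `endpoint.update` accepts an Experiment Run in place of a model version, which is consistent with the description above but worth confirming against your client version. The names `smoke_test_run`, `run`, and `sample_input` are hypothetical.

```python
def smoke_test_run(endpoint, run, sample_input):
    """Deploy an Experiment Run to an existing endpoint and return one prediction.

    Hypothetical helper intended for interactive testing only.
    """
    # Assumption: update() takes an Experiment Run just as it takes a model version.
    endpoint.update(run, wait=True)

    # Send a single test input through the freshly deployed model.
    deployed_model = endpoint.get_deployed_model()
    return deployed_model.predict(sample_input)
```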
