Updating endpoints with canary

As an alternative to updating an Endpoint all at once, Verta offers the ability to roll out a new model incrementally while monitoring its behavior to prevent a problematic model from being deployed completely.

This is the principle behind a canary update, which this tutorial demonstrates.

Using the client

Previously, a new model was deployed using a direct update strategy. This time, a CanaryUpdateStrategy is used:

from verta.endpoint.update import CanaryUpdateStrategy
from verta.endpoint.update.rules import MaximumRequestErrorPercentageThresholdRule

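strategy = CanaryUpdateStrategy(interval=10, step=0.2)
strategy.add_rule(MaximumRequestErrorPercentageThresholdRule(0.1))  # halt if more than 10% of requests error
endpoint.update(model_version, strategy)  # or endpoint.update(run, strategy)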

A canary update strategy must be given an interval (in seconds) describing how often to update the deployment, and a step (as a ratio between 0 and 1) describing what fraction of the deployment should be updated per interval.
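To make the arithmetic concrete, the following sketch computes the rollout schedule implied by an interval and a step. It is a plain-Python illustration only: `rollout_schedule` is a hypothetical helper, not part of the Verta client, and it assumes that the first step is applied after one full interval and that every interval moves exactly `step` of the deployment until the rollout reaches 100%.

```python
import math


def rollout_schedule(interval, step):
    """Return (elapsed_seconds, fraction_updated) pairs for a canary rollout.

    Hypothetical helper for illustration; not part of the Verta client.
    Assumes each interval shifts exactly `step` of the deployment.
    """
    if interval <= 0:
        raise ValueError("interval must be positive")
    if not 0 < step <= 1:
        raise ValueError("step must be a ratio in (0, 1]")
    # Round before taking the ceiling to sidestep floating-point noise.
    n_steps = math.ceil(round(1 / step, 9))
    return [
        (i * interval, min(1.0, round(i * step, 9)))
        for i in range(1, n_steps + 1)
    ]


schedule = rollout_schedule(interval=10, step=0.2)
for elapsed, fraction in schedule:
    print(f"after {elapsed:>3}s: {fraction:.0%} of the deployment updated")
```

Under these assumptions, interval=10 and step=0.2 give five steps, so a rollout that is never halted would finish after about 50 seconds.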

A canary update strategy must also have at least one rule associated with it. In this case, the update will monitor the request error percentage; if it exceeds the threshold we have set (10%), the rollout will be halted. See the canary-rules API documentation for additional rules that can be used.
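The halting behavior can be pictured with a small simulation. This is a conceptual sketch in plain Python, not Verta's implementation: `error_ratio`, `simulate_canary`, and the per-interval request counts are all made up for illustration, and the sketch assumes the rule is checked once per interval before the next step is applied.

```python
def error_ratio(errors, requests):
    """Fraction of requests that failed; 0.0 when there was no traffic."""
    return errors / requests if requests else 0.0


def simulate_canary(observations, step, threshold):
    """Advance a rollout one step per observation, halting on a rule violation.

    `observations` is a list of (errors, requests) pairs, one per interval.
    Returns (fraction_updated, halted).
    """
    fraction = 0.0
    for errors, requests in observations:
        if error_ratio(errors, requests) > threshold:
            return fraction, True  # rule violated: stop rolling out
        fraction = min(1.0, round(fraction + step, 9))
        if fraction >= 1.0:
            break  # rollout complete
    return fraction, False


# Made-up traffic: the third interval has 15% errors, above the 10% threshold.
observations = [(1, 100), (4, 100), (15, 100), (0, 100), (0, 100)]
fraction, halted = simulate_canary(observations, step=0.2, threshold=0.1)
print(f"updated fraction: {fraction:.0%}, halted: {halted}")
```

In this made-up run, the rollout stops once the error ratio crosses the threshold, leaving only part of the deployment on the new model.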

Using the web UI

Go to the update tab of the endpoint, select the canary rollout strategy, and configure the canary rules. You can configure one or more rules. The available rules include thresholds for maximum request error percentage, maximum server error percentage, maximum average latency, and maximum p90 latency.

Here is an example configuration in the web UI:
