Endpoint resources¶
Through an endpoint update, you can configure the limits for compute resources available to the deployment.
Using the client¶
Endpoint.update()
provides a parameter for
configuring the endpoint’s compute resources. It can be used alongside any update strategy.
from verta.endpoint.update import DirectUpdateStrategy
endpoint.update(
model_version, DirectUpdateStrategy(),
resources=resources,
)
resources
specifies the computational resources that will be available to the model when it is
deployed.
from verta.endpoint.resources import Resources
resources = Resources(cpu=.25, memory="512Mi")
In this example, each replica will be provided a fourth of a CPU core and 512 Mi of RAM. For more information about available resources and units, see the Endpoint Resources API documentation.
Using the CLI¶
Compute resources can also be configured via the CLI:
verta deployment update endpoint /some-path --model-version-id "<id>" \
--strategy direct \
--resources '{"cpu": 0.25, "memory": "512Mi"}'
--resources
takes a JSON string representing its values. The Python API documentation for
Endpoint Resources contains a JSON-equivalent example for the object.