Exporting metrics for an endpoint

Each endpoint publishes metrics for monitoring its state, utilization, and resource consumption.

Supported Integrations

Datadog

The Verta platform supports exporting endpoint metrics as Custom Metrics to Datadog with the labels indicated below. For more information about using custom metrics within Datadog, please consult the Datadog documentation.
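As a rough illustration, the labels can be used to filter and group a metric in a Datadog query. The sketch below only builds a query string in Datadog's `aggregator:metric{tag:value} by {tag}` form. It assumes that labels arrive in Datadog as tags with the same keys and that metric names are exported unchanged, neither of which is stated here. The helper name `datadog_query` and the endpoint path `/example` are hypothetical.

```python
def datadog_query(metric, filters=None, group_by=None, aggregator="avg"):
    """Build a Datadog metric query string (hypothetical helper).

    Assumes endpoint labels are exported as Datadog tags with the same keys.
    """
    # An empty filter set is written as {*} in Datadog query syntax.
    if filters:
        scope = ",".join(f"{key}:{value}" for key, value in sorted(filters.items()))
    else:
        scope = "*"
    query = f"{aggregator}:{metric}{{{scope}}}"
    # Optional grouping, e.g. one series per worker.
    if group_by:
        query += " by {" + ",".join(group_by) + "}"
    return query


print(datadog_query("api_latency_p99",
                    filters={"endpoint_path": "/example"},
                    group_by=["worker"]))
```

With these inputs the helper produces `avg:api_latency_p99{endpoint_path:/example} by {worker}`, which could be pasted into a Datadog graph editor if the assumptions above hold for your account.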

CloudWatch

Endpoint metrics can also be exported to Amazon CloudWatch as Metrics. The labels listed for each metric below are attached to the CloudWatch Metric as Dimensions. Please see the CloudWatch documentation for information about using Metrics in CloudWatch.

Note: A metric is not exported to CloudWatch until its scenario has occurred at least once. For example, a dashboard cannot be built on api_throughput_5xx_by_endpoint until the endpoint has returned at least one 5xx error.
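Because a metric is absent from CloudWatch until its scenario first occurs, code that reads these metrics may want to treat "no data" as zero. The sketch below shows one way to do that, together with a helper that turns labels into the `Name`/`Value` dimension entries used by CloudWatch APIs. The function names and the `(timestamp, value)` datapoint shape are assumptions made for illustration, and no AWS call is made.

```python
def to_dimensions(labels):
    """Convert a label mapping into CloudWatch-style dimension entries."""
    return [{"Name": name, "Value": value} for name, value in sorted(labels.items())]


def latest_or_zero(datapoints):
    """Return the most recent value, or 0.0 if the metric has no data yet.

    `datapoints` is assumed to be a list of (timestamp, value) pairs.
    """
    if not datapoints:
        # The metric has never been exported, e.g. no 5xx error has occurred.
        return 0.0
    return max(datapoints, key=lambda point: point[0])[1]


print(to_dimensions({"endpoint_path": "/example"}))
print(latest_or_zero([]))
```

Treating a missing series as zero is a design choice: it keeps dashboards and alerts working before the first error, at the cost of hiding the difference between "no errors" and "not exporting".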

Metric Types

Endpoint State

| Name | Description | Min Value | Max Value | Labels |
| --- | --- | --- | --- | --- |
| state_up | indicates worker is available to service requests | 0 | 1 | worker, model_name, model_version, endpoint_path |
| state_pending | indicates worker is in 'pending' state | 0 | 1 | worker, model_name, model_version, endpoint_path |
| state_allocated | indicates worker is in 'running' state | 0 | 1 | worker, model_name, model_version, endpoint_path |
| state_restart_count | count of the number of worker restarts | 0 | | worker, model_name, model_version, endpoint_path |
| workers_up_by_endpoint | number of workers currently up for an endpoint | 0 | | model_name, model_version, endpoint_path |

When an endpoint is being updated, each of its workers goes through the following state transitions:

pending (waiting for resources) -> allocated (waiting to start) -> up (running)
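One way to read the three state gauges together is to report the furthest stage a worker has reached. The sketch below assumes each gauge is 1 when the worker is in that state and 0 otherwise. Whether more than one gauge can be 1 at the same time is not specified here, so the function checks them from the last stage to the first. The function name `worker_stage` is hypothetical.

```python
def worker_stage(state_pending, state_allocated, state_up):
    """Return the furthest stage reached, given the three 0/1 state gauges."""
    # Check from the last stage to the first so that, if more than one gauge
    # is set, the most advanced stage is reported (an assumption of this sketch).
    if state_up == 1:
        return "up"
    if state_allocated == 1:
        return "allocated"
    if state_pending == 1:
        return "pending"
    return "unknown"  # no gauge set, e.g. the worker has not reported yet


print(worker_stage(state_pending=1, state_allocated=0, state_up=0))
print(worker_stage(state_pending=0, state_allocated=0, state_up=1))
```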

Endpoint Utilization

| Name | Description | Min Value | Max Value | Time Range | Labels |
| --- | --- | --- | --- | --- | --- |
| api_throughput | rate of requests made to a worker (requests/second) | 0 | | 2m | worker, model_name, model_version, endpoint_path |
| api_latency_avg | average request latency for a worker (seconds) | 0 | | 2m | worker, model_name, model_version, endpoint_path |
| api_latency_p99 | 99th percentile upper bound for request latency for a worker (seconds) | 0 | | 2m | worker, model_name, model_version, endpoint_path |
| api_throughput_by_endpoint_by_code | rate of requests made to an endpoint, across workers (requests/second) | 0 | | 2m | endpoint_path, code |
| api_throughput_by_endpoint | rate of requests made to an endpoint, across workers and codes (requests/second) | 0 | | 2m | endpoint_path |
| api_throughput_2xx_by_endpoint | rate of requests with 2xx codes made to an endpoint, across workers (requests/second) | 0 | | 2m | endpoint_path |
| api_throughput_3xx_by_endpoint | rate of requests with 3xx codes made to an endpoint, across workers (requests/second) | 0 | | 2m | endpoint_path |
| api_throughput_4xx_by_endpoint | rate of requests with 4xx codes made to an endpoint, across workers (requests/second) | 0 | | 2m | endpoint_path |
| api_throughput_5xx_by_endpoint | rate of requests with 5xx codes made to an endpoint, across workers (requests/second) | 0 | | 2m | endpoint_path |
| api_throughput_not_2xx_by_endpoint | rate of requests with non-2xx codes made to an endpoint, across workers (requests/second) | 0 | | 2m | endpoint_path |
| api_latency_p99_by_endpoint | 99th percentile upper bound for request latency for an endpoint, across workers (seconds) | 0 | | 2m | endpoint_path |
| api_latency_p50_by_endpoint | 50th percentile upper bound for request latency for an endpoint, across workers (seconds) | 0 | | 2m | endpoint_path |
| api_increase_2xx_by_endpoint | count of requests with 2xx codes made to an endpoint, across workers (requests/second) | 0 | | 2m | endpoint_path |
| api_increase_3xx_by_endpoint | count of requests with 3xx codes made to an endpoint, across workers (requests/second) | 0 | | 2m | endpoint_path |
| api_increase_4xx_by_endpoint | count of requests with 4xx codes made to an endpoint, across workers (requests/second) | 0 | | 2m | endpoint_path |
| api_increase_5xx_by_endpoint | count of requests with 5xx codes made to an endpoint, across workers (requests/second) | 0 | | 2m | endpoint_path |
| api_increase_not_2xx_by_endpoint | count of requests with non-2xx codes made to an endpoint, across workers (requests/second) | 0 | | 2m | endpoint_path |
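The per-code throughput metrics can be combined into derived values such as a server error ratio. The sketch below divides a reading of `api_throughput_5xx_by_endpoint` by a reading of `api_throughput_by_endpoint` for the same endpoint. It treats a missing 5xx reading as zero (for example, because the metric has not been exported yet) and returns 0.0 when there is no traffic. The function name `error_ratio` and the sample numbers are hypothetical.

```python
def error_ratio(total_rps, errors_rps=None):
    """Fraction of requests that returned a 5xx code.

    total_rps:  a reading of api_throughput_by_endpoint (requests/second)
    errors_rps: a reading of api_throughput_5xx_by_endpoint, or None if absent
    """
    errors = 0.0 if errors_rps is None else errors_rps
    if total_rps <= 0:
        # No traffic in the window, so there is nothing to divide by.
        return 0.0
    return errors / total_rps


print(error_ratio(20.0, 0.5))   # 0.5 of 20 requests/second failed
print(error_ratio(20.0, None))  # 5xx metric not present yet
```

Both readings should come from the same endpoint and the same time range (2m in the table above) for the ratio to be meaningful.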

Endpoint Resources

| Name | Description | Min Value | Max Value | Time Range | Labels |
| --- | --- | --- | --- | --- | --- |
| resources_cpu | [deprecated] | 0 | | 2m | worker, model_name, model_version, endpoint_path |
| resources_cpu_cores | amount of cpu utilized by worker (cores) | 0 | | 2m | worker, model_name, model_version, endpoint_path |
| resources_cpu_ratio | amount of cpu utilized by worker (ratio) | 0 | 1 | 2m | worker, model_name, model_version, endpoint_path |
| resources_memory | [deprecated] | 0 | | NA | worker, model_name, model_version, endpoint_path |
| resources_memory_bytes | amount of memory utilized by worker (bytes) | 0 | | NA | worker, model_name, model_version, endpoint_path |
| resources_memory_ratio | amount of memory utilized by worker (ratio) | 0 | 1 | NA | worker, model_name, model_version, endpoint_path |
| resources_rx_bytes | amount of received network traffic by worker (bytes/sec) | 0 | | 2m | worker, model_name, model_version, endpoint_path |
| resources_tx_bytes | amount of transmitted network traffic by worker (bytes/sec) | 0 | | 2m | worker, model_name, model_version, endpoint_path |
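Since the ratio metrics range from 0 to 1, they lend themselves to simple threshold checks. The sketch below lists the workers whose `resources_memory_ratio` (or `resources_cpu_ratio`) reading exceeds a threshold. What the ratio is measured against is not stated here, the 0.9 default is arbitrary, and the `(labels, value)` sample shape and function name are assumptions made for illustration.

```python
def workers_over_threshold(samples, threshold=0.9):
    """Return the names of workers whose ratio reading exceeds `threshold`.

    `samples` is assumed to be an iterable of (labels, value) pairs, where
    `labels` is a dict that includes the `worker` label.
    """
    return sorted(labels["worker"] for labels, value in samples if value > threshold)


readings = [
    ({"worker": "worker-a", "endpoint_path": "/example"}, 0.95),
    ({"worker": "worker-b", "endpoint_path": "/example"}, 0.40),
]
print(workers_over_threshold(readings))
```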

Label Definitions

  • worker: name of the worker for the endpoint

  • model_name: name of the registered model configured in the endpoint

  • model_version: version of registered model configured in endpoint

  • endpoint_path: the URI path suffix configured for the endpoint
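Labels also make it possible to roll per-worker readings up to a coarser level. The sketch below sums per-worker `api_throughput` readings by `endpoint_path`, which mirrors the "across workers" wording of the by-endpoint metrics. Whether the platform computes its by-endpoint metrics exactly this way is not stated here. The function name, the `(labels, value)` sample shape, and the sample values are hypothetical.

```python
from collections import defaultdict


def sum_by_label(samples, label):
    """Sum metric values grouped by the value of one label.

    `samples` is assumed to be an iterable of (labels, value) pairs.
    """
    totals = defaultdict(float)
    for labels, value in samples:
        totals[labels[label]] += value
    return dict(totals)


per_worker = [
    ({"worker": "worker-a", "endpoint_path": "/a"}, 1.5),
    ({"worker": "worker-b", "endpoint_path": "/a"}, 2.5),
    ({"worker": "worker-c", "endpoint_path": "/b"}, 3.0),
]
print(sum_by_label(per_worker, "endpoint_path"))
```

The same helper can group by `model_name` or `model_version` when comparing traffic across model versions served by one endpoint.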
