Model test results and data distribution

In the Verta Model Catalog, users can log and visualize model validation results and gain insight into the training and validation datasets. This helps teams monitor and manage model performance tests and data insights, and it gives other stakeholders a place to review the information before a model is deployed to production.

The example below illustrates how to log model test results, attributes, and training data distributions using several of the supported data types. Note that the information is logged against the registered model version.

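from verta import Client

# Connect to Verta. With no arguments, Client() is assumed to pick up the
# host and credentials from your environment or Verta config file.
client = Client()

model_version = client.create_registered_model().create_version()

model_version.add_attributes({
    'library': "scikit-learn",
    'model_type': "logistic regression",
})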
    
# Log model performance metrics: confusion matrix

from verta.data_types import ConfusionMatrix
data = ConfusionMatrix(
    value=[
        [650000, 100000],
        [24000, 3330000],
    ],
    labels=["high", "low"],
)
model_version.add_attribute("Income_Confusion_Matrix", data)

# Log model performance metrics: table

from verta.data_types import Table
data = Table(
    data=[
        ["Accuracy", "70.6%"],
        ["Precision", "40.7%"],
        ["Recall", "50.4%"],
        ["F1", "45%"],
    ],
    columns=["metric", "value"],
)
model_version.add_attribute("Performance_metrics", data)
    
# Log training data distribution metrics: discrete histogram (from the training dataset)

from verta.data_types import DiscreteHistogram
data = DiscreteHistogram(
    buckets=["yes", "no", "dont know"],
    data=[1100, 22200, 15000],
)
model_version.add_attribute("Response_Histogram", data)
    
# Log training data distribution metrics: line chart (from the training dataset)

from verta.data_types import Line
data = Line(
    x=[1, 2, 3, 17, 18, 24, 33, 44, 58, 67],
    y=[1, 4, 9, 90, 45, 34, 34, 78, 14, 45],
)
model_version.add_attribute("Responses_Over_Time", data)
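
The values in the example above are hard-coded. In practice they would usually be computed from evaluation output and the training dataset. The following is a minimal sketch of one way to derive them in plain Python. The helper names (`confusion_counts`, `binary_metric_rows`) and the sample labels are hypothetical, and the sketch assumes that matrix rows hold actual labels and columns hold predicted labels; check the API documentation for the orientation Verta expects.

```python
from collections import Counter


def confusion_counts(actual, predicted, labels):
    """Count examples per (actual, predicted) label pair.

    Assumption: row i is the actual label labels[i] and column j is the
    predicted label labels[j].
    """
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for a, p in zip(actual, predicted):
        matrix[index[a]][index[p]] += 1
    return matrix


def binary_metric_rows(matrix):
    """Build [metric, value] rows from a 2x2 matrix.

    Assumption: the first label is treated as the positive class.
    """
    (tp, fn), (fp, tn) = matrix
    total = tp + fn + fp + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return [
        ["Accuracy", f"{accuracy:.1%}"],
        ["Precision", f"{precision:.1%}"],
        ["Recall", f"{recall:.1%}"],
        ["F1", f"{f1:.1%}"],
    ]


# Hypothetical evaluation output for a two-class model.
labels = ["high", "low"]
actual = ["high", "high", "high", "low", "low", "low", "low", "low"]
predicted = ["high", "high", "low", "high", "low", "low", "low", "low"]

matrix = confusion_counts(actual, predicted, labels)
rows = binary_metric_rows(matrix)

# Hypothetical raw survey responses from the training dataset.
buckets = ["yes", "no", "dont know"]
responses = ["yes", "no", "no", "dont know", "no", "yes"]
counts = Counter(responses)
bucket_counts = [counts[b] for b in buckets]  # one count per bucket, in bucket order
```

The resulting values could then be passed to the data types used above, for example `ConfusionMatrix(value=matrix, labels=labels)`, `Table(data=rows, columns=["metric", "value"])`, and `DiscreteHistogram(buckets=buckets, data=bucket_counts)`.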

Once logged, the information is available under the Model Insights section in the "Reproduce" tab of the registered model version.

For a complete list of supported data types, refer to the API documentation.
