Verta Model Monitoring lets you monitor drift, outliers, and model performance metrics.
Note: Model performance metrics can be computed for classification models only.
Get started with monitoring:
Create a Registered Model and add Registered Model Versions (RMVs) to it.
registered_model = client.get_or_create_registered_model(name="census-model")
model_version = registered_model.create_standard_model(
    name="v1",
    model_cls=CensusIncomeClassifier,
    model_api=ModelAPI(X_train, Y_train_with_confidence),
    environment=Python(requirements=["scikit-learn"]),
    artifacts=artifacts_dict,
)
ModelAPI captures your model schema, which helps the monitoring system automatically define monitoring metrics, dashboards, and alerts for feature and prediction data. As of Verta Release 2022.04, confidence scores are required for classification models in order to accurately compute ROC and PR curves.
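To make the confidence-score requirement concrete, here is a toy sketch of the shape classification outputs with confidences might take. The helper `with_confidence`, the column names, and the values are all illustrative assumptions, not Verta API; `Y_train_with_confidence` simply mirrors the variable name in the snippet above.

```python
def with_confidence(label, scores):
    """Pair a predicted label with per-class confidence scores.

    `scores` maps class name -> probability; the values should sum to 1,
    which is what lets the monitoring system compute ROC and PR curves.
    """
    total = sum(scores.values())
    assert abs(total - 1.0) < 1e-9, "confidences should form a distribution"
    return {"output-class": label,
            **{f"confidence_{c}": p for c, p in scores.items()}}

# Two example rows for a binary income classifier (values made up):
Y_train_with_confidence = [
    with_confidence("<=50K", {"<=50K": 0.85, ">50K": 0.15}),
    with_confidence(">50K", {"<=50K": 0.30, ">50K": 0.70}),
]
```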
Upload your reference data as a dataset version and link it to the RMV. This training dataset, called a reference set, facilitates downstream drift monitoring. You do not need to upload your entire training set, only a statistically significant sample that mirrors your training data distribution.
from verta.dataset import Path

dataset = client.get_or_create_dataset("census-dataset")
dataset_version = dataset.create_version(Path(["census-train.csv"], enable_mdb_versioning=True))
model_version.log_dataset_version("reference", dataset_version)  # key must be "reference"
Note: You are required to use the key "reference"; the dataset information will be uploaded to the cataloged RMV. This step is required for the monitoring system to compute drift.
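If your training set is large, a seeded random sample preserves its distribution in expectation and keeps the reference set small. This is an illustrative stdlib sketch of producing such a sample, not part of the Verta client; the function name and CSV layout are assumptions.

```python
import csv
import random

def sample_reference_set(src, dst, fraction=0.1, seed=42):
    """Write a random `fraction` of rows from `src` to `dst`, keeping the header.

    A seeded uniform sample mirrors the training distribution, which is what
    the reference set needs for meaningful drift comparison.
    """
    rng = random.Random(seed)
    with open(src, newline="") as fin, open(dst, "w", newline="") as fout:
        reader = csv.reader(fin)
        writer = csv.writer(fout)
        writer.writerow(next(reader))  # always keep the header row
        for row in reader:
            if rng.random() < fraction:
                writer.writerow(row)
```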
Deploy the model version to an endpoint. When the endpoint is deployed, the monitored model automatically appears in the Monitoring list view in the webapp.
from verta.endpoint.update import DirectUpdateStrategy

endpoint = client.get_or_create_endpoint("Census")
endpoint.update(model_version, DirectUpdateStrategy(), wait=True)
Start sending input data for prediction. Once data has been sent to the system, you can navigate to the webapp to view the dashboards.
deployed_model = endpoint.get_deployed_model()
id, _ = deployed_model.predict_with_id(input_feature)
Note: The model makes a prediction and assigns a unique UUID to it. Ground truth is then registered with the system using this UUID.
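The UUID-based join between predictions and delayed ground truth can be pictured as a keyed store. This is a toy model of the idea, not Verta's implementation; the class and method names are invented for illustration.

```python
import uuid

class PredictionLog:
    """Toy store mimicking the UUID join between predictions and
    ground-truth labels that arrive later."""

    def __init__(self):
        self._predictions = {}
        self._ground_truth = {}

    def record_prediction(self, features, predicted_label):
        # Each prediction gets a unique id that the caller keeps.
        pred_id = str(uuid.uuid4())
        self._predictions[pred_id] = (features, predicted_label)
        return pred_id

    def record_ground_truth(self, pred_id, label):
        # Ground truth is attached later, keyed by the same id.
        if pred_id not in self._predictions:
            raise KeyError(f"unknown prediction id: {pred_id}")
        self._ground_truth[pred_id] = label

    def matched_pairs(self):
        """Yield (predicted, actual) for every prediction with ground truth."""
        for pred_id, label in self._ground_truth.items():
            yield self._predictions[pred_id][1], label
```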
Drift dashboard in webapp.
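Drift dashboards compare live feature distributions against the reference set. One common drift metric is the Population Stability Index; the sketch below is illustrative and not necessarily the metric Verta computes.

```python
import math

def population_stability_index(reference, live, bins):
    """PSI between two samples of a numeric feature.

    `bins` is a list of (low, high) edges. PSI near 0 means no drift;
    values above roughly 0.2 are commonly treated as significant drift.
    """
    def proportions(values):
        counts = [sum(low <= v < high for v in values) for low, high in bins]
        total = len(values)
        # Small epsilon avoids log(0) for empty bins.
        return [max(c / total, 1e-6) for c in counts]

    ref_p = proportions(reference)
    live_p = proportions(live)
    return sum((l - r) * math.log(l / r) for r, l in zip(ref_p, live_p))
```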
Ingest ground truth, and the system will start computing performance metrics such as accuracy, precision, and the confusion matrix.
endpoint.log_ground_truth(id, label, "output-class")  # prediction id, ground-truth label, prediction column name
Performance dashboard in webapp.
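The metrics the performance dashboard reports can all be derived from matched (predicted, actual) label pairs. A minimal sketch for a binary classifier, using only the standard library (illustrative, not Verta's implementation):

```python
from collections import Counter

def classification_metrics(pairs, positive):
    """Compute accuracy, precision, recall, and a confusion matrix
    from (predicted, actual) label pairs."""
    confusion = Counter((actual, predicted) for predicted, actual in pairs)
    tp = confusion[(positive, positive)]
    fp = sum(n for (a, p), n in confusion.items() if p == positive and a != positive)
    fn = sum(n for (a, p), n in confusion.items() if a == positive and p != positive)
    total = sum(confusion.values())
    correct = sum(n for (a, p), n in confusion.items() if a == p)
    return {
        "accuracy": correct / total,
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
        "confusion": dict(confusion),  # keys are (actual, predicted)
    }
```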