Log and visualize training data distribution

For a registered model that is ready for deployment, you can log the training data used to build the model. The feature distributions can then be tracked and visualized within the given registered model version, where your stakeholders can review and audit them.

This is how you can log training data for a registered model version.

Log training data

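import pandas as pd
from verta import Client

# Replace with your Verta host
HOST = "XXXX.XXXX.verta.ai"
client = Client(HOST)

# Load the training data; the last column is assumed to be the target
df = pd.read_csv("train.csv")
in_df = df.iloc[:, :-1]    # input features: every column except the last
out_df = df.iloc[:, [-1]]  # target: last column, kept as a DataFrame

# Create a registered model version and log the training data profile
model_version = client.create_registered_model().create_version()
model_version.log_training_data_profile(in_df, out_df)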
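
The column split in the snippet above relies on the target being the last column of the CSV. As a minimal sketch of what the two `iloc` expressions return, here is the same split applied to a small made-up DataFrame. The column names and values are hypothetical and only stand in for `train.csv`; no Verta connection is needed to run it.

```python
import pandas as pd

# Hypothetical toy data standing in for train.csv; the last column is the target.
df = pd.DataFrame({
    "age": [25, 32, 47],
    "income": [40000, 52000, 81000],
    "churned": [0, 0, 1],
})

in_df = df.iloc[:, :-1]    # every column except the last -> input features
out_df = df.iloc[:, [-1]]  # list index keeps a one-column DataFrame, not a Series

print(list(in_df.columns))
print(list(out_df.columns))
print(type(out_df).__name__)
```

Indexing with `[-1]` (a list) rather than `-1` is what keeps `out_df` two-dimensional. If your target is not the last column, select it by name instead.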

Visualize feature distribution of your training data

The feature distribution histogram plots will then be available in the Web UI on the registered model version page.
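
If you want a rough local look at the distributions before logging, the sketch below bins each numeric column with NumPy. It is only a sanity check under stated assumptions: `preview_histograms` is a hypothetical helper, the bin count is arbitrary, and the result is not necessarily what the Web UI computes or displays.

```python
import numpy as np
import pandas as pd


def preview_histograms(frame, bins=5):
    """Return {column: (counts, bin_edges)} for each numeric column.

    Hypothetical helper for a local sanity check; not part of the Verta client.
    """
    result = {}
    for name in frame.select_dtypes(include="number").columns:
        # Drop missing values so they do not break the binning.
        counts, edges = np.histogram(frame[name].dropna(), bins=bins)
        result[name] = (counts.tolist(), edges.tolist())
    return result


# Toy data with one numeric and one non-numeric column (made up for illustration).
toy = pd.DataFrame({"x": [0, 1, 2, 3, 4], "label": ["a", "b", "a", "b", "a"]})
hist = preview_histograms(toy, bins=2)
print(hist)
```

Non-numeric columns are skipped here for simplicity; categorical features would need value counts instead of bins.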
