Creating registered model versions from artifacts

Models often depend on artifacts that are not captured in the model inference code. For example, a model class may require weights, a checkpoint, or a serialized version of the model. For this reason, VertaModelBase lets you specify and use artifact dependencies.

This guide explains how you can use artifacts within your model class definition.

Basic model definition

As defined in the API reference, a Verta Standard Model must extend VertaModelBase. Specifically, the __init__() method of VertaModelBase takes an artifact dictionary that maps the name of each artifact to its location on the filesystem. With this information, the model code can load each artifact in whatever way it requires.

from verta.registry import VertaModelBase

class MyModel(VertaModelBase):
    def __init__(self, artifacts):
        """
        artifacts is a dictionary where the keys represent the names of
        artifacts and values represent the locations of these artifacts
        """
        pass

    def predict(self, data):
        pass
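
To make the shape of the artifact dictionary concrete, here is a minimal sketch of a helper that a model's __init__() could call to fail early when an expected artifact is missing. The names check_artifacts and REQUIRED_ARTIFACTS are hypothetical and not part of the Verta API; the only thing taken from this guide is that the dictionary maps artifact names to filesystem locations.

```python
import os

# Hypothetical: the artifact names this particular model expects to receive.
REQUIRED_ARTIFACTS = ("weights",)


def check_artifacts(artifacts, required=REQUIRED_ARTIFACTS):
    """Return {name: location} for the required artifacts, raising early on problems."""
    # Every required name must be present as a key in the dictionary.
    missing = [name for name in required if name not in artifacts]
    if missing:
        raise KeyError("missing artifact keys: {}".format(missing))

    # Every value is expected to be a location that exists on the filesystem.
    absent = [name for name in required if not os.path.exists(artifacts[name])]
    if absent:
        raise FileNotFoundError("artifact locations not found for: {}".format(absent))

    return {name: artifacts[name] for name in required}
```

Calling such a helper at the top of __init__() would turn a misspelled key or a missing file into an immediate, descriptive error rather than a failure later during prediction.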

Artifacts from files

Assume that our model is a linear regression model (ax + b) and its weights have been stored in a JSON file called "weights.json" which looks as follows:

{
  "a": 1,
  "b": 0.5
}

Then a model using those weights may look as follows, using json.load() to read the JSON file:

import json
from verta.registry import VertaModelBase

class MyModel(VertaModelBase):
    def __init__(self, artifacts):
        """
        artifacts is a dictionary where the keys represent the names of
        artifacts and values represent the locations of these artifacts
        """
        with open(artifacts["weights"], "r") as f:
            weights = json.load(f)

        self.a = weights['a']
        self.b = weights['b']

    def predict(self, data):
        predictions = []
        for x in data:  # avoid shadowing the built-in input()
            predictions.append(x * self.a + self.b)
        return predictions
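
Before registering anything, it can help to exercise the class locally. The following is a minimal sketch, assuming only that the model receives a dictionary of artifact names to file locations as described above. It writes the weights to a temporary file, builds the dictionary by hand, and calls predict(). The smoke_test() helper is hypothetical, and the fallback to a plain base class is an assumption made only so the sketch runs where verta is not installed.

```python
import json
import os
import tempfile

try:
    from verta.registry import VertaModelBase
except ImportError:
    # Assumption for local illustration only: use a plain base class
    # when the verta package is not available.
    VertaModelBase = object


class MyModel(VertaModelBase):
    def __init__(self, artifacts):
        # artifacts maps artifact names to their locations on the filesystem
        with open(artifacts["weights"], "r") as f:
            weights = json.load(f)
        self.a = weights["a"]
        self.b = weights["b"]

    def predict(self, data):
        return [x * self.a + self.b for x in data]


def smoke_test():
    """Hypothetical helper: build the artifact dictionary by hand and predict."""
    with tempfile.TemporaryDirectory() as tmpdir:
        path = os.path.join(tmpdir, "weights.json")
        with open(path, "w") as f:
            json.dump({"a": 1, "b": 0.5}, f)
        model = MyModel(artifacts={"weights": path})
        return model.predict([0, 1, 2])


predictions = smoke_test()
print(predictions)  # [0.5, 1.5, 2.5]
```

This only checks the model code itself; it does not involve the registry or a deployment.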

That's it. Now we can create a Registered Model Version that encapsulates both the model code and the artifact. In the artifacts parameter, we map the artifact's key (the same key the model code uses to look up the file it passes to open()) to its local filepath:

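from verta.environment import Python

# assumes registered_model is an existing registered model obtained from the Verta client
model_version = registered_model.create_standard_model(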
    name="v0",
    model_cls=MyModel,
    artifacts={"weights": "./weights.json"},
    environment=Python([]),
    labels=["research-purpose", "team-a"],
)

Artifacts from Python objects

Python objects can also be passed as artifacts, so long as they are serializable by cloudpickle.

Let's say we were working with our weights as a Python dict rather than an on-disk JSON file:

weights = {"a": 1, "b": 0.5}

This will be made available to the deployed model as a pickled file, so we must call pickle.load() to use it:

import pickle
from verta.registry import VertaModelBase

class MyModel(VertaModelBase):
    def __init__(self, artifacts):
        """
        artifacts is a dictionary where the keys represent the names of
        artifacts and values represent the locations of these artifacts
        """
        with open(artifacts["weights"], "rb") as f:
            weights = pickle.load(f)

        self.a = weights['a']
        self.b = weights['b']

    def predict(self, data):
        predictions = []
        for x in data:  # avoid shadowing the built-in input()
            predictions.append(x * self.a + self.b)
        return predictions
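
The same kind of local check works for the pickled case. This is a rough sketch that imitates the round trip by hand: it pickles the dict to a temporary file and passes that file's location to the model. It uses the standard library pickle module as a stand-in for the serialization the client performs, which is an assumption for illustration. The smoke_test() helper is hypothetical, as is the fallback base class used when verta is not installed.

```python
import os
import pickle
import tempfile

try:
    from verta.registry import VertaModelBase
except ImportError:
    # Assumption for local illustration only: use a plain base class
    # when the verta package is not available.
    VertaModelBase = object


class MyModel(VertaModelBase):
    def __init__(self, artifacts):
        # The artifact arrives as a pickled file, so open it in binary mode.
        with open(artifacts["weights"], "rb") as f:
            weights = pickle.load(f)
        self.a = weights["a"]
        self.b = weights["b"]

    def predict(self, data):
        return [x * self.a + self.b for x in data]


def smoke_test(weights):
    """Hypothetical helper: pickle the object by hand, then load it via the model."""
    with tempfile.TemporaryDirectory() as tmpdir:
        path = os.path.join(tmpdir, "weights.pkl")
        with open(path, "wb") as f:
            pickle.dump(weights, f)
        model = MyModel(artifacts={"weights": path})
        return model.predict([0, 1, 2])


predictions = smoke_test({"a": 1, "b": 0.5})
print(predictions)  # [0.5, 1.5, 2.5]
```

Note that the model code is identical in spirit to the JSON version; only the file mode and the loading call change.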

Finally, instead of passing a filepath string, we pass the object itself. As mentioned before, the client will pickle this object for the deployed model to load:

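from verta.environment import Python

# assumes registered_model is an existing registered model obtained from the Verta client
model_version = registered_model.create_standard_model(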
    name="v0",
    model_cls=MyModel,
    artifacts={"weights": weights},
    environment=Python([]),
    labels=["research-purpose", "team-a"],
)

An executable Notebook for this guide is available here.
