Deploying an H2O model

As mentioned in the deploying models guide, deploying models via Verta Inference is a two-step process: (1) create an endpoint, and (2) update the endpoint with a model.

This tutorial explains how Verta Inference can be used to deploy an H2O model.

1. Create an endpoint

First, create an endpoint using Client.create_endpoint() as follows:

h2o_endpoint = client.create_endpoint(path="/h2o")
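If you re-run this tutorial, creating the same path twice will fail. As a small, hedged alternative (assuming your verta client version provides it, mirroring the get_or_create_registered_model call used later in this tutorial), you can fetch or create the endpoint in one call:

# Reuse the endpoint if "/h2o" already exists rather than erroring on a duplicate path
h2o_endpoint = client.get_or_create_endpoint(path="/h2o")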

2. Update the endpoint with a Registered Model Version (RMV)

An H2O serialized model (h2o_model, in this example) can be used as an artifact in a model class that extends VertaModelBase.

import h2o
import os
import pandas as pd
import time

from verta.environment import Python
from verta.registry import VertaModelBase, verify_io

class H2OModelWrapper(VertaModelBase):
    def __init__(self, artifacts):
        import h2o
        h2o.init()
        self.model = h2o.load_model(artifacts["serialized_model"])

    @verify_io
    def predict(self, model_input):
        """
        For compatibility with the way Verta handles model inputs and outputs
        and our monitoring integrations, H2OFrames should be converted
        to and from JSON-serializable dictionaries using these function calls.
        """
        # Import inside the method, mirroring __init__, so the dependencies
        # are available when the class is deserialized in the endpoint container
        import h2o
        import pandas as pd

        frame = h2o.H2OFrame(pd.DataFrame(model_input))
        predictions = self.model.predict(frame)
        return predictions.as_data_frame().to_dict("records")


# Serialize model to a file so it can be uploaded as an artifact
MODEL_PATH = os.path.join(os.getcwd(), "h2o_model_file" + str(time.time()))
saved_model_path = h2o.save_model(model=h2o_model, path=MODEL_PATH, force=True)

model_version = client.get_or_create_registered_model(name="h2o_model").create_standard_model(
    model_cls=H2OModelWrapper,
    environment=Python(requirements=['h2o']),
    artifacts={"serialized_model": saved_model_path},
)

Note that the input and output of the predict() function must be JSON-serializable; the .as_data_frame().to_dict("records") calls accomplish this by converting the H2OFrame into a list of dictionaries.

For the full list of acceptable data types for model I/O, refer to the VertaModelBase documentation.
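For illustration, a single row converted with to_dict("records") becomes a plain list of dictionaries, which serializes cleanly to JSON; the column names below are hypothetical:

# Hypothetical output of .as_data_frame().to_dict("records") for one row;
# plain lists, dicts, numbers, and strings are all JSON-serializable
test_row = [{"sepal_len": 5.1, "sepal_wid": 3.5, "petal_len": 1.4, "petal_wid": 0.2}]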

Prior to deploying, don't forget to test your model class locally, as follows:

# test locally
artifacts = model_version.fetch_artifacts(["serialized_model"])
test_model = H2OModelWrapper(artifacts)
# h2o_df is assumed to be the H2OFrame used earlier to train h2o_model
test_row = h2o_df[0, :].as_data_frame().to_dict("records")
test_model.predict(test_row)

To ensure that the requirements specified in the model version are in fact adequate, you may build the model container locally or as part of a continuous integration system. You may also deploy the model and make test predictions, as shown below.

Regardless of how a Registered Model Version has been created, the endpoint defined above can now be updated, allowing us to make predictions against it.

Because H2O models are memory-intensive, you will almost certainly need to increase your endpoint's memory allowance. See the Resources documentation for details.

from verta.endpoint.resources import Resources

# Customize resources to ensure the endpoint has enough memory to handle h2o's requirements
# You may need to increase this value, depending on your model
resources = Resources(cpu=0.25, memory="512Mi")
h2o_endpoint.update(model_version, wait=True, resources=resources)

deployed_model = h2o_endpoint.get_deployed_model()
# Convert the H2O frame to a dictionary to be compatible with Verta's model input specifications
test_row = h2o_test_df[0, :].as_data_frame().to_dict("records")
deployed_model.predict(test_row)

The full code for this tutorial can be found here.

3. Common Problems and Solutions

1. H2O Server Out of Memory

If you make a prediction and get this error:

h2o.exceptions.H2OConnectionError: Local server has died unexpectedly. RIP.

This is most likely caused by the H2O server running out of memory. To fix it, increase the endpoint's memory allocation and update the endpoint again:

# Increase memory beyond the earlier 512Mi allocation; tune this to your model
resources = Resources(cpu=0.25, memory="1Gi")
h2o_endpoint.update(model_version, wait=True, resources=resources)
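After the update completes, it can help to confirm the endpoint is healthy before retrying the prediction. A minimal check, assuming your verta client version exposes Endpoint.get_status():

# Inspect the endpoint's deployment state; a status of "active"
# typically indicates it is ready to serve predictions
print(h2o_endpoint.get_status())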
