As mentioned in the deploying models guide, deploying models via Verta Inference is a two-step process: (1) create an endpoint, and (2) update the endpoint with a model.
This tutorial explains how Verta Inference can be used to deploy an H2O model.
An H2O serialized model (h2o_model, in this example) can be used as an artifact in a model class that extends VertaModelBase.
```python
import os
import time

import h2o
import pandas as pd

from verta.environment import Python
from verta.registry import VertaModelBase, verify_io


class H2OModelWrapper(VertaModelBase):
    def __init__(self, artifacts):
        import h2o
        h2o.init()
        self.model = h2o.load_model(artifacts["serialized_model"])

    @verify_io
    def predict(self, model_input):
        """
        For compatibility with the way Verta handles model inputs and outputs
        and our monitoring integrations, H2OFrames should be converted into a
        dictionary using these function calls.
        """
        frame = h2o.H2OFrame(pd.DataFrame(model_input))
        model_out = self.model.predict(frame)
        return model_out.as_data_frame().to_dict("records")


# Serialize the model to a file so it can be uploaded as an artifact
MODEL_PATH = os.path.join(os.getcwd(), "h2o_model_file" + str(time.time()))
saved_model_path = h2o.save_model(model=h2o_model, path=MODEL_PATH, force=True)

model_version = client.get_or_create_registered_model(
    name="h2o_model"
).create_standard_model(
    model_cls=H2OModelWrapper,
    environment=Python(requirements=["h2o"]),
    artifacts={"serialized_model": saved_model_path},
)
```
Note that the input and output of the predict function must be JSON-serializable; this is what .as_data_frame().to_dict("records") aims to accomplish.
For the full list of acceptable data types for model I/O, refer to the VertaModelBase documentation.
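As a quick illustration (the column names here are hypothetical, standing in for an H2O prediction frame), `to_dict("records")` turns a DataFrame into a plain list of dictionaries, which serializes cleanly to JSON:

```python
import pandas as pd

# Hypothetical single-row frame standing in for an H2O prediction result
df = pd.DataFrame({"predict": [1], "p0": [0.12], "p1": [0.88]})
df.to_dict("records")  # -> [{"predict": 1, "p0": 0.12, "p1": 0.88}]
```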
Prior to deploying, don't forget to test your model class locally, as follows.
```python
# test locally
artifacts = model_version.fetch_artifacts(["serialized_model"])
test_model = H2OModelWrapper(artifacts)
test_row = h2o_df[0, :].as_data_frame().to_dict("records")
test_model.predict(test_row)
```
To ensure that the requirements specified in the model version are in fact adequate, you may build the model container locally or as part of a continuous integration system. You may also deploy the model and make test predictions, as shown below.
Regardless of how a Registered Model Version has been created, the endpoint defined above can now be updated, allowing us to make predictions against it.
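If you have not already created an endpoint, a minimal sketch looks like the following (the endpoint path /h2o-model and the h2o_endpoint name are assumptions for this example):

```python
from verta import Client

# Assumes VERTA_HOST, VERTA_EMAIL, and VERTA_DEV_KEY are set in the environment
client = Client()
h2o_endpoint = client.get_or_create_endpoint(path="/h2o-model")
```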
Since H2O models are memory-intensive, you will almost certainly need to increase your endpoint's memory allowance. See the Resources documentation here.
```python
from verta.endpoint.resources import Resources

# Customize resources to ensure the endpoint has enough memory to handle
# h2o's requirements; you may need to increase this value, depending on
# your model
resources = Resources(cpu=0.25, memory="512Mi")
h2o_endpoint.update(model_version, wait=True, resources=resources)

deployed_model = h2o_endpoint.get_deployed_model()

# Convert the h2o frame to a dictionary to be compatible with Verta's
# model input specification
test_row = h2o_test_df[0, :].as_data_frame().to_dict("records")
deployed_model.predict(test_row)
```
The full code for this tutorial can be found here.
3. Common Problems and Solutions
1. H2O Server out of Memory
If you make a prediction and get this error:
```
h2o.exceptions.H2OConnectionError: Local server has died unexpectedly. RIP.
```
This is most likely caused by the H2O server running out of memory. To fix this, increase the endpoint's memory allowance by raising your endpoint's resources.
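For example, a sketch reusing the Resources pattern from above (the 1Gi value is illustrative; tune it to your model):

```python
from verta.endpoint.resources import Resources

# Increase the memory allowance; the exact value depends on your model
resources = Resources(cpu=0.25, memory="1Gi")
h2o_endpoint.update(model_version, wait=True, resources=resources)
```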