ExperimentRun

If you are looking for APIs to create an ExperimentRun, go to the ExperimentRun (Core) API reference.

Basic Metadata

Attributes

Attributes are descriptive metadata, such as the team responsible for this model or the expected training time.

ExperimentRun.log_attribute(key, value, overwrite=False)

Logs an attribute to this Experiment Run.

Parameters:
  • key (str) – Name of the attribute.
  • value (one of {None, bool, float, int, str, list, dict}) – Value of the attribute.
  • overwrite (bool, default False) – Whether to allow overwriting an existing attribute with key key.
ExperimentRun.log_attributes(attributes, overwrite=False)

Logs potentially multiple attributes to this Experiment Run.

Parameters:
  • attributes (dict of str to {None, bool, float, int, str, list, dict}) – Attributes.
  • overwrite (bool, default False) – Whether to allow overwriting existing attributes.
ExperimentRun.get_attribute(key)

Gets the attribute with name key from this Experiment Run.

Parameters:key (str) – Name of the attribute.
Returns:one of {None, bool, float, int, str} – Value of the attribute.
ExperimentRun.get_attributes()

Gets all attributes from this Experiment Run.

Returns:dict of str to {None, bool, float, int, str} – Names and values of all attributes.
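The overwrite flag on the loggers above guards against accidentally replacing an existing value. A minimal local sketch of that behavior (the dict-backed AttributeStore class here is purely illustrative and is not part of the verta client):

```python
class AttributeStore:
    """Dict-backed stand-in illustrating attribute logging semantics."""

    def __init__(self):
        self._attrs = {}

    def log_attribute(self, key, value, overwrite=False):
        # Refuse to silently replace an existing attribute unless overwrite=True.
        if key in self._attrs and not overwrite:
            raise ValueError("attribute {!r} already exists; pass overwrite=True".format(key))
        self._attrs[key] = value

    def get_attribute(self, key):
        return self._attrs[key]


store = AttributeStore()
store.log_attribute("team", "ml-platform")
store.log_attribute("team", "research", overwrite=True)  # replacement allowed
print(store.get_attribute("team"))  # research
```

The same overwrite semantics apply to hyperparameters and metrics below.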

Hyperparameters

Hyperparameters are model configuration metadata, such as the loss function or the regularization penalty.

ExperimentRun.log_hyperparameter(key, value, overwrite=False)

Logs a hyperparameter to this Experiment Run.

Parameters:
  • key (str) – Name of the hyperparameter.
  • value (one of {None, bool, float, int, str}) – Value of the hyperparameter.
  • overwrite (bool, default False) – Whether to allow overwriting an existing hyperparameter with key key.
ExperimentRun.log_hyperparameters(hyperparams, overwrite=False)

Logs potentially multiple hyperparameters to this Experiment Run.

Parameters:
  • hyperparams (dict of str to {None, bool, float, int, str}) – Hyperparameters.
  • overwrite (bool, default False) – Whether to allow overwriting existing hyperparameters.
ExperimentRun.get_hyperparameter(key)

Gets the hyperparameter with name key from this Experiment Run.

Parameters:key (str) – Name of the hyperparameter.
Returns:one of {None, bool, float, int, str} – Value of the hyperparameter.
ExperimentRun.get_hyperparameters()

Gets all hyperparameters from this Experiment Run.

Returns:dict of str to {None, bool, float, int, str} – Names and values of all hyperparameters.

Metrics

Metrics are one-time performance metadata, such as accuracy or loss on the full training set.

ExperimentRun.log_metric(key, value, overwrite=False)

Logs a metric to this Experiment Run.

If the metadatum of interest might recur, log_observation() should be used instead.

Parameters:
  • key (str) – Name of the metric.
  • value (one of {None, bool, float, int, str}) – Value of the metric.
  • overwrite (bool, default False) – Whether to allow overwriting an existing metric with key key.
ExperimentRun.log_metrics(metrics, overwrite=False)

Logs potentially multiple metrics to this Experiment Run.

Parameters:
  • metrics (dict of str to {None, bool, float, int, str}) – Metrics.
  • overwrite (bool, default False) – Whether to allow overwriting existing metrics.
ExperimentRun.get_metric(key)

Gets the metric with name key from this Experiment Run.

Parameters:key (str) – Name of the metric.
Returns:one of {None, bool, float, int, str} – Value of the metric.
ExperimentRun.get_metrics()

Gets all metrics from this Experiment Run.

Returns:dict of str to {None, bool, float, int, str} – Names and values of all metrics.

Observations

Observations are recurring metadata that are repeatedly measured over time, such as batch losses over an epoch or memory usage.

ExperimentRun.log_observation(key, value, timestamp=None, epoch_num=None, overwrite=False)

Logs an observation to this Experiment Run.

Parameters:
  • key (str) – Name of the observation.
  • value (one of {None, bool, float, int, str}) – Value of the observation.
  • timestamp (str or float or int, optional) – String representation of a datetime or numerical Unix timestamp. If not provided, the current time will be used.
  • epoch_num (non-negative int, optional) – Epoch number associated with this observation. If not provided, it will automatically be incremented from prior observations for the same key.
  • overwrite (bool, default False) – Whether to allow overwriting an existing observation with key key.

Warning

If timestamp is provided by the user, it should include timezone information; otherwise, it will be interpreted as UTC.
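The timestamp handling described above can be sketched locally. A hedged illustration of normalizing either a numerical Unix timestamp or an ISO-format string to UTC milliseconds (the exact formats the client accepts, and the to_utc_millis helper itself, are assumptions for illustration):

```python
from datetime import datetime, timezone

def to_utc_millis(timestamp):
    """Normalize a Unix timestamp (seconds) or ISO-8601 string to UTC milliseconds.

    Naive datetime strings carry no timezone information and are
    interpreted as UTC, per the warning above.
    """
    if isinstance(timestamp, (int, float)):
        return int(timestamp * 1000)
    dt = datetime.fromisoformat(timestamp)
    if dt.tzinfo is None:  # no timezone info: assume UTC
        dt = dt.replace(tzinfo=timezone.utc)
    return int(dt.timestamp() * 1000)


print(to_utc_millis(1.5))                    # 1500
print(to_utc_millis("1970-01-01T00:00:01"))  # 1000
```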

ExperimentRun.get_observation(key)

Gets the observation series with name key from this Experiment Run.

Parameters:key (str) – Name of observation series.
Returns:list of {None, bool, float, int, str} – Values of observation series.
ExperimentRun.get_observations()

Gets all observations from this Experiment Run.

Returns:dict of str to list of {None, bool, float, int, str} – Names and values of all observation series.
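The epoch_num auto-increment behavior can be illustrated with a local stand-in (the ObservationSeries class below is a sketch of the documented semantics, not the client's actual implementation):

```python
class ObservationSeries:
    """Illustrative per-key observation series with epoch numbers."""

    def __init__(self):
        self._series = {}  # key -> list of (epoch_num, value)

    def log_observation(self, key, value, epoch_num=None):
        series = self._series.setdefault(key, [])
        if epoch_num is None:
            # Auto-increment from the highest epoch logged so far for this key.
            epoch_num = max((epoch for epoch, _ in series), default=-1) + 1
        series.append((epoch_num, value))

    def get_observation(self, key):
        return [value for _, value in self._series[key]]


obs = ObservationSeries()
obs.log_observation("batch_loss", 0.84)  # epoch 0
obs.log_observation("batch_loss", 0.61)  # epoch 1
obs.log_observation("batch_loss", 0.49)  # epoch 2
print(obs.get_observation("batch_loss"))  # [0.84, 0.61, 0.49]
```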

Tags

Tags are short textual labels used to help identify a run, such as its purpose or its environment.

ExperimentRun.log_tag(tag)

Logs a tag to this Experiment Run.

Parameters:tag (str) – Tag.
ExperimentRun.log_tags(tags)

Logs multiple tags to this Experiment Run.

Parameters:tags (list of str) – Tags.
ExperimentRun.get_tags()

Gets all tags from this Experiment Run.

Returns:list of str – All tags.

Artifacts

General

ExperimentRun.log_artifact(key, artifact, overwrite=False)

Logs an artifact to this Experiment Run.

The VERTA_ARTIFACT_DIR environment variable can be used to specify a locally-accessible directory to store artifacts.

Parameters:
  • key (str) – Name of the artifact.
  • artifact (str or file-like or object) –
    Artifact or some representation thereof.
    • If str, then it will be interpreted as a filesystem path, its contents read as bytes, and uploaded as an artifact. If it is a directory path, its contents will be zipped.
    • If file-like, then the contents will be read as bytes and uploaded as an artifact.
    • Otherwise, the object will be serialized and uploaded as an artifact.
  • overwrite (bool, default False) – Whether to allow overwriting an existing artifact with key key.
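The three-way dispatch on the artifact parameter can be sketched as follows. This artifact_to_bytes helper is an illustration of the documented behavior, not the client's actual code, and it omits the directory-zipping case:

```python
import pickle

def artifact_to_bytes(artifact):
    """Illustrative dispatch mirroring the three cases described above."""
    if isinstance(artifact, str):        # filesystem path: read contents as bytes
        with open(artifact, "rb") as f:
            return f.read()
    if hasattr(artifact, "read"):        # file-like: read contents as bytes
        return artifact.read()
    return pickle.dumps(artifact)        # arbitrary object: serialize
```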
ExperimentRun.log_artifact_path(key, artifact_path, overwrite=False)

Logs the filesystem path of an artifact to this Experiment Run.

This function makes no attempt to open a file at artifact_path. Only the path string itself is logged.

Parameters:
  • key (str) – Name of the artifact.
  • artifact_path (str) – Filesystem path of the artifact.
  • overwrite (bool, default False) – Whether to allow overwriting an existing artifact with key key.
ExperimentRun.get_artifact(key)

Gets the artifact with name key from this Experiment Run.

If the artifact was originally logged as just a filesystem path, that path will be returned. Otherwise, the artifact object will be returned. If the object is unable to be deserialized, the raw bytes are returned instead.

Parameters:key (str) – Name of the artifact.
Returns:str or bytes – Filesystem path of the artifact, the artifact object, or a bytestream representing the artifact.
ExperimentRun.download_artifact(key, download_to_path)

Downloads the artifact with name key to path download_to_path.

Parameters:
  • key (str) – Name of the artifact.
  • download_to_path (str) – Path to download to.
Returns:

downloaded_to_path (str) – Absolute path where artifact was downloaded to. Matches download_to_path.

ExperimentRun.get_environment()

Gets the environment of this Experiment Run.

Returns:Python – Environment of this Experiment Run.

Images

ExperimentRun.log_image(key, image, overwrite=False)

Logs an image artifact to this Experiment Run.

Parameters:
  • key (str) – Name of the image.
  • image (one of {str, file-like, pyplot, matplotlib Figure, PIL Image, object}) –
    Image or some representation thereof.
    • If str, then it will be interpreted as a filesystem path, its contents read as bytes, and uploaded as an artifact.
    • If file-like, then the contents will be read as bytes and uploaded as an artifact.
    • If matplotlib pyplot, then the image will be serialized and uploaded as an artifact.
    • If matplotlib Figure, then the image will be serialized and uploaded as an artifact.
    • If PIL Image, then the image will be serialized and uploaded as an artifact.
    • Otherwise, the object will be serialized and uploaded as an artifact.
  • overwrite (bool, default False) – Whether to allow overwriting an existing image with key key.
ExperimentRun.log_image_path(key, image_path)

Logs the filesystem path of an image to this Experiment Run.

This function makes no attempt to open a file at image_path. Only the path string itself is logged.

Parameters:
  • key (str) – Name of the image.
  • image_path (str) – Filesystem path of the image.
ExperimentRun.get_image(key)

Gets the image artifact with name key from this Experiment Run.

If the image was originally logged as just a filesystem path, that path will be returned. Otherwise, the image object will be returned. If the object is unable to be deserialized, the raw bytes are returned instead.

Parameters:key (str) – Name of the image.
Returns:str or PIL Image or file-like – Filesystem path of the image, the image object, or a bytestream representing the image.

Versioning

ExperimentRun.log_commit(commit, key_paths=None)

Associate a Commit with this Experiment Run.

New in version 0.14.1.

Parameters:
  • commit (verta._repository.commit.Commit) – Verta Commit.
  • key_paths (dict of key to path, optional) – A mapping between descriptive keys and paths of particular interest within commit. This can be useful for, say, highlighting a particular file as the training dataset used for this Experiment Run.
ExperimentRun.get_commit()

Gets the Commit associated with this Experiment Run.

New in version 0.14.1.

Returns:
  • commit (verta._repository.commit.Commit) – Verta Commit.
  • key_paths (dict of key to path) – A mapping between descriptive keys and paths of particular interest within commit.

Advanced Uses

Code Versioning

ExperimentRun.log_code(exec_path=None, repo_url=None, commit_hash=None, overwrite=False, is_dirty=None, autocapture=True)

Logs the code version.

A code version is either information about a Git snapshot or a bundle of Python source code files.

repo_url and commit_hash can only be set if use_git was set to True in the Client.

Parameters:
  • exec_path (str, optional) – Filepath to the executable Python script or Jupyter notebook. If no filepath is provided, the Client will make its best effort to find the currently running script/notebook file.
  • repo_url (str, optional) – URL for a remote Git repository containing commit_hash. If no URL is provided, the Client will make its best effort to find it.
  • commit_hash (str, optional) – Git commit hash associated with this code version. If no hash is provided, the Client will make its best effort to find it.
  • overwrite (bool, default False) – Whether to allow overwriting a code version.
  • is_dirty (bool, optional) – Whether git status is dirty relative to commit_hash. If not provided, the Client will make its best effort to find it.
  • autocapture (bool, default True) – Whether to enable the automatic capturing behavior of parameters above in git mode.

Examples

With Client(use_git=True) (default):

Log Git snapshot information, plus the location of the currently executing notebook/script relative to the repository root:

run.log_code()
run.get_code()
# {'exec_path': 'comparison/outcomes/classification.ipynb',
#  'repo_url': 'git@github.com:VertaAI/experiments.git',
#  'commit_hash': 'f99abcfae6c3ce6d22597f95ad6ef260d31527a6',
#  'is_dirty': False}

Log Git snapshot information, plus the location of a specific source code file relative to the repository root:

run.log_code("../trainer/training_pipeline.py")
run.get_code()
# {'exec_path': 'comparison/trainer/training_pipeline.py',
#  'repo_url': 'git@github.com:VertaAI/experiments.git',
#  'commit_hash': 'f99abcfae6c3ce6d22597f95ad6ef260d31527a6',
#  'is_dirty': False}

With Client(use_git=False):

Find and upload the currently executing notebook/script:

run.log_code()
zip_file = run.get_code()
zip_file.printdir()
# File Name                          Modified             Size
# classification.ipynb        2019-07-10 17:18:24        10287

Upload a specific source code file:

run.log_code("../trainer/training_pipeline.py")
zip_file = run.get_code()
zip_file.printdir()
# File Name                          Modified             Size
# training_pipeline.py        2019-05-31 10:34:44          964

ExperimentRun.get_code()

Gets the code version.

Returns:dict or zipfile.ZipFile
Either:
  • a dictionary containing Git snapshot information with at most the following items:
    • filepaths (list of str)
    • repo_url (str) – Remote repository URL
    • commit_hash (str) – Commit hash
    • is_dirty (bool)
  • a ZipFile containing Python source code files

Data Versioning

ExperimentRun.log_dataset_version(key, dataset_version, overwrite=False)

Logs a Verta DatasetVersion to this ExperimentRun with the given key.

Parameters:
  • key (str) – Name of the dataset version.
  • dataset_version (DatasetVersion) – Dataset version.
  • overwrite (bool, default False) – Whether to allow overwriting a dataset version.
ExperimentRun.get_dataset_version(key)

Gets the DatasetVersion with name key from this Experiment Run.

Parameters:key (str) – Name of the dataset version.
Returns:DatasetVersion – DatasetVersion associated with the given key.

Deployment

Logging

ExperimentRun.log_model(model, custom_modules=None, model_api=None, artifacts=None, overwrite=False)

Logs a model artifact for Verta model deployment.

Parameters:
  • model (str or object) –
    Model for deployment.
    • If str, then it will be interpreted as a filesystem path to a serialized model file for upload.
    • Otherwise, the object will be serialized and uploaded as an artifact.
  • custom_modules (list of str, optional) –
    Paths to local Python modules and other files that the deployed model depends on.
    • If directories are provided, all files within—excluding virtual environments—will be included.
    • If module names are provided, all files within the corresponding module inside a folder in sys.path will be included.
    • If not provided, all Python files located within sys.path—excluding virtual environments—will be included.
  • model_api (ModelAPI, optional) – Model API specifying details about the model and its deployment.
  • artifacts (list of str, optional) – Keys of logged artifacts to be used by a class model.
  • overwrite (bool, default False) – Whether to allow overwriting existing artifacts.
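The custom_modules directory collection can be sketched as a walk that skips virtual environments. The collect_custom_modules helper and its venv heuristic (a directory containing pyvenv.cfg) are assumptions for illustration; the client's actual detection may differ:

```python
import os

def collect_custom_modules(root):
    """Walk `root` collecting file paths, skipping virtual environments."""
    collected = []
    for dirpath, dirnames, filenames in os.walk(root):
        if "pyvenv.cfg" in filenames:  # looks like a virtualenv: skip this subtree
            dirnames[:] = []
            continue
        collected.extend(os.path.join(dirpath, name) for name in filenames)
    return collected
```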
ExperimentRun.get_model()

Gets the model artifact for Verta model deployment from this Experiment Run.

Returns:object – Model for deployment.
ExperimentRun.download_model(download_to_path)

Downloads the model logged with log_model() to path download_to_path.

New in version 0.17.1.

Parameters:download_to_path (str) – Path to download to.
Returns:downloaded_to_path (str) – Absolute path where artifact was downloaded to. Matches download_to_path.
ExperimentRun.log_requirements(requirements, overwrite=False)

Logs a pip requirements file for Verta model deployment.

New in version 0.13.13.

Parameters:
  • requirements (str or list of str) –
    PyPI-installable packages necessary to deploy the model.
    • If str, then it will be interpreted as a filesystem path to a requirements file for upload.
    • If list of str, then it will be interpreted as a list of PyPI package names.
  • overwrite (bool, default False) – Whether to allow overwriting existing requirements.
Raises:

ValueError – If a package’s name is invalid for PyPI, or its exact version cannot be determined.

Examples

From a file:

run.log_requirements("../requirements.txt")
# upload complete (requirements.txt)
print(run.get_artifact("requirements.txt").read().decode())
# cloudpickle==1.2.1
# jupyter==1.0.0
# matplotlib==3.1.1
# pandas==0.25.0
# scikit-learn==0.21.3
# verta==0.13.13

From a list of package names:

run.log_requirements(['verta', 'cloudpickle', 'scikit-learn'])
# upload complete (requirements.txt)
print(run.get_artifact("requirements.txt").read().decode())
# verta==0.13.13
# cloudpickle==1.2.1
# scikit-learn==0.21.3

ExperimentRun.log_environment(env, overwrite=False)

Logs a Python environment to this Experiment Run.

Parameters:
  • env (Python) – Environment to log.
  • overwrite (bool, default False) – Whether to allow overwriting an existing environment.
ExperimentRun.log_setup_script(script, overwrite=False)

Associate a model deployment setup script with this Experiment Run.

New in version 0.13.8.

Parameters:
  • script (str) – String composed of valid Python code for executing setup steps at the beginning of model deployment. An on-disk file can be passed in using open("path/to/file.py", 'r').read().
  • overwrite (bool, default False) – Whether to allow overwriting an existing setup script.
Raises:

SyntaxError – If script contains invalid Python.
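The SyntaxError described above can be reproduced client-side with Python's built-in compile(). This validate_setup_script helper is an illustrative sketch, not the client's actual check:

```python
def validate_setup_script(script):
    """Raise SyntaxError if `script` is not valid Python code."""
    compile(script, "<setup_script>", "exec")


validate_setup_script("import os\nos.environ['MODE'] = 'serve'\n")  # OK
try:
    validate_setup_script("def broken(:")
except SyntaxError:
    print("invalid setup script rejected")
```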

ExperimentRun.log_training_data(train_features, train_targets, overwrite=False)

Associate training data with this model reference.

Changed in version 0.14.4: Instead of uploading the data itself as a CSV artifact 'train_data', this method now generates a histogram for internal use by our deployment data monitoring system.

Parameters:
  • train_features (pd.DataFrame) – pandas DataFrame representing features of the training data.
  • train_targets (pd.DataFrame or pd.Series) – pandas DataFrame representing targets of the training data.
  • overwrite (bool, default False) – Whether to allow overwriting existing training data.
ExperimentRun.fetch_artifacts(keys)

Downloads artifacts that are associated with a class model.

Parameters:keys (list of str) – Keys of artifacts to download.
Returns:dict of str to str – Map of artifacts’ keys to their cache filepaths—for use as the artifacts parameter to a Verta class model.

Examples

run.log_artifact("weights", open("weights.npz", 'rb'))
# upload complete (weights)
run.log_artifact("text_embeddings", open("embedding.csv", 'rb'))
# upload complete (text_embeddings)
artifact_keys = ["weights", "text_embeddings"]
artifacts = run.fetch_artifacts(artifact_keys)
artifacts
# {'weights': '/Users/convoliution/.verta/cache/artifacts/50a9726b3666d99aea8af006cf224a7637d0c0b5febb3b0051192ce1e8615f47/weights.npz',
#  'text_embeddings': '/Users/convoliution/.verta/cache/artifacts/2d2d1d809e9bce229f0a766126ae75df14cadd1e8f182561ceae5ad5457a3c38/embedding.csv'}
ModelClass(artifacts=artifacts).predict(["Good book.", "Bad book!"])
# [0.955998517288053, 0.09809996313422353]
run.log_model(ModelClass, artifacts=artifact_keys)
# upload complete (custom_modules.zip)
# upload complete (model.pkl)
# upload complete (model_api.json)

Deploying

ExperimentRun.get_deployment_status()

Returns the current status of the model deployment associated with this Experiment Run.

New in version 0.13.17.

Returns:status (dict) –
  • 'status' (str) – Current status of the model deployment.
  • (if deployed) 'url' (str) – Prediction endpoint URL.
  • (if deployed) 'token' (str or None) – Token for authorizing prediction requests.
  • (if error during deployment) 'message' (str) – Error message from the model.
ExperimentRun.deploy(path=None, token=None, no_token=False, wait=False)

Deploys the model logged to this Experiment Run.

New in version 0.13.17.

Parameters:
  • path (str, optional) – Suffix for the prediction endpoint URL. If not provided, one will be generated automatically.
  • token (str, optional) – Token to use to authorize prediction requests. If not provided and no_token is False, one will be generated automatically.
  • no_token (bool, default False) – Whether to omit token authorization for predictions.
  • wait (bool, default False) – Whether to block until the deployed model is ready before returning.
Returns:

status (dict) – See get_deployment_status().

Raises:

RuntimeError – If the model is already deployed or is being deployed, or if a required deployment artifact is missing.

Examples

status = run.deploy(path="banana", no_token=True, wait=True)
# waiting for deployment.........
status
# {'status': 'deployed',
#  'url': 'https://app.verta.ai/api/v1/predict/abcdefgh-1234-abcd-1234-abcdefghijkl/banana',
#  'token': None}
DeployedModel.from_url(status['url']).predict([x])
# [0.973340685896]

ExperimentRun.undeploy(wait=False)

Undeploys the model logged to this Experiment Run.

New in version 0.13.17.

Parameters:wait (bool, default False) – Whether to block until the undeployment completes before returning.
Returns:status (dict) – See get_deployment_status().
Raises:RuntimeError – If the model is not currently deployed.
ExperimentRun.get_deployed_model()

Returns an object for making predictions against the deployed model.

New in version 0.13.17.

Returns:DeployedModel
Raises:RuntimeError – If the model is not currently deployed.
ExperimentRun.download_docker_context(download_to_path, self_contained=False)

Downloads this Experiment Run’s Docker context tgz.

Parameters:
  • download_to_path (str) – Path to download Docker context to.
  • self_contained (bool, default False) – Whether the downloaded Docker context should be self-contained.
Returns:

downloaded_to_path (str) – Absolute path where Docker context was downloaded to. Matches download_to_path.

ExperimentRun.download_deployment_yaml(download_to_path, path=None, token=None, no_token=False)

Downloads this Experiment Run’s model deployment CRD YAML.

Parameters:
  • download_to_path (str) – Path to download deployment YAML to.
  • path (str, optional) – Suffix for the prediction endpoint URL. If not provided, one will be generated automatically.
  • token (str, optional) – Token to use to authorize prediction requests. If not provided and no_token is False, one will be generated automatically.
  • no_token (bool, default False) – Whether to omit token authorization for predictions.
Returns:

downloaded_to_path (str) – Absolute path where deployment YAML was downloaded to. Matches download_to_path.

Deprecated

ExperimentRun.log_model_for_deployment(model, model_api, requirements, train_features=None, train_targets=None)

Logs a model artifact, a model API, requirements, and a dataset CSV to deploy on Verta.

Deprecated since version 0.13.13: This function has been superseded by log_model(), log_requirements(), and log_training_data(); consider using them instead.

Parameters:
  • model (str or file-like or object) –
    Model or some representation thereof.
    • If str, then it will be interpreted as a filesystem path, its contents read as bytes, and uploaded as an artifact.
    • If file-like, then the contents will be read as bytes and uploaded as an artifact.
    • Otherwise, the object will be serialized and uploaded as an artifact.
  • model_api (str or file-like) –
    Model API, specifying model deployment and predictions.
    • If str, then it will be interpreted as a filesystem path, its contents read as bytes, and uploaded as an artifact.
    • If file-like, then the contents will be read as bytes and uploaded as an artifact.
  • requirements (str or file-like) –
    pip requirements file specifying packages necessary to deploy the model.
    • If str, then it will be interpreted as a filesystem path, its contents read as bytes, and uploaded as an artifact.
    • If file-like, then the contents will be read as bytes and uploaded as an artifact.
  • train_features (pd.DataFrame, optional) – pandas DataFrame representing features of the training data. If provided, train_targets must also be provided.
  • train_targets (pd.DataFrame, optional) – pandas DataFrame representing targets of the training data. If provided, train_features must also be provided.

Warning

Due to the way deployment currently works, train_features and train_targets will be joined together and then converted into a CSV. Retrieving the dataset through the Client will return a file-like bytestream of this CSV that can be passed directly into pd.read_csv().

ExperimentRun.log_modules(paths, search_path=None)

Logs local files that are dependencies for a deployed model to this Experiment Run.

Deprecated since version 0.13.13: The behavior of this function has been merged into log_model() as its custom_modules parameter; consider using that instead.

Deprecated since version 0.12.4: The search_path parameter is no longer necessary and will be removed in v0.17.0; consider removing it from the function call.

Parameters:paths (str or list of str) – Paths to local Python modules and other files that the deployed model depends on. If a directory is provided, all files within will be included.

Miscellaneous

ExperimentRun.clone(experiment_id=None)

Returns a newly-created copy of this experiment run.

Parameters:experiment_id (str, optional) – ID of experiment to clone this run into. If not provided, the new run will be cloned into this run’s experiment.
Returns:ExperimentRun