Logging and querying model metadata
Setup
First, let's install a machine learning library to work with, then launch the Python interpreter:
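For example (this assumes the `verta` client package itself is already installed):

```bash
pip install scikit-learn   # the ML library used in this tutorial
python                     # start an interactive session
```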
We begin with the `Client`:
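A minimal sketch of the connection step; the host URL and credentials below are placeholders for your own:

```python
from verta import Client

host = "https://app.verta.ai"                     # placeholder back end URL
email = "you@example.com"                         # placeholder email
dev_key = "00000000-0000-0000-0000-000000000000"  # placeholder developer key

client = Client(host=host, email=email, dev_key=dev_key)
```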
`host` points the client to the Verta back end, `email` is the address you have associated with your GitHub account, and `dev_key` is your developer key, which you can obtain through the Verta Web App. Your email and developer key can also be set using the environment variables `$VERTA_EMAIL` and `$VERTA_DEV_KEY`, so you don't have to explicitly type them into your workflow.
Once a client is instantiated and a connection is established, you can create Verta entities to organize your work:
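A sketch of that setup, with placeholder names for each entity:

```python
proj = client.set_project("Digit Multiclassification")
expt = client.set_experiment("SVM")
run = client.set_experiment_run("RBF Kernel")
```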
A project is a goal. We're going to classify multiple handwritten digits.
An experiment is a strategy for that goal. We'll use a support vector machine as our classification model.
An experiment run is an execution of that strategy. We'll train a support vector machine using the radial basis function kernel.
Note that you are not restricted to any naming conventions here. Feel free to use names that you consider useful and meaningful.
If you'd like, you could also add a description, tags, and attributes:
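For example, passing them as keyword arguments when creating the run; the specific values here are just illustrative:

```python
run = client.set_experiment_run(
    "RBF Kernel",
    desc="SVM with an RBF kernel on the digits dataset",  # free-form description
    tags=["example", "svm"],                              # tags for filtering later
    attrs={"library": "scikit-learn"},                     # arbitrary key-value attributes
)
```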
Run tracking
scikit-learn has built-in datasets we can use:
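For example, the handwritten digits dataset:

```python
from sklearn import datasets

# load the digits dataset as feature matrix X and label vector y
digits = datasets.load_digits()
X, y = digits.data, digits.target
```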
We also need to define some hyperparameters to specify a configuration for our model:
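Something along these lines, with illustrative values:

```python
hyperparams = {
    "kernel": "rbf",  # radial basis function kernel for this run
    "C": 1e-4,        # regularization strength (illustrative value)
}
```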
Then we can finally train a model on our data:
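A sketch using scikit-learn's support vector classifier with the hyperparameters above:

```python
from sklearn import svm

model = svm.SVC(**hyperparams)
model.fit(X, y)
```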
To see how well we did, we can calculate our mean accuracy on the entire training set:
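`SVC.score()` returns mean accuracy, so one way to check is:

```python
train_acc = model.score(X, y)  # mean accuracy on the training set
print(train_acc)
```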
That's not much better than purely guessing! So how do we keep a more permanent record of this abysmal experiment run? With Verta of course:
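A sketch of the logging calls, assuming the run created earlier and the `train_acc` name chosen above:

```python
run.log_hyperparameters(hyperparams)    # record the configuration
run.log_metric("train_acc", train_acc)  # record the training accuracy
run.log_model(model)                    # store the trained model itself
```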
But logging doesn't need to occur all at once at the end. Let's do another experiment run with a linear kernel—this time interweaving the logging statements with our training process:
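A sketch of what that interleaving might look like, reusing the names from above:

```python
run = client.set_experiment_run("Linear Kernel")

# log the configuration as soon as it's decided
hyperparams = {"kernel": "linear", "C": 1e-4}
run.log_hyperparameters(hyperparams)

model = svm.SVC(**hyperparams)
model.fit(X, y)

# log the metric right after it's computed
train_acc = model.score(X, y)
run.log_metric("train_acc", train_acc)

# and finally the model itself
run.log_model(model)
```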
Querying
Organizing _experiment runs_ under _experiments_ gives us the ability to retrieve them as a group:
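For example, the experiment exposes its runs as a collection:

```python
runs = expt.expt_runs  # all runs logged under this experiment
for run in runs:
    print(run.id)
```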
...and query them:
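A sketch of filtering and sorting, assuming the `train_acc` metric name used earlier:

```python
# keep only runs above a chosen accuracy threshold
good_runs = expt.expt_runs.find("metrics.train_acc > 0.9")

# or rank all runs by accuracy and take the best one
best_run = expt.expt_runs.sort("metrics.train_acc", descending=True)[0]
print(best_run.get_metric("train_acc"))
```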
That's pretty good! So which run was this? Definitely not the RBF kernel:
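We can ask the run for its logged hyperparameters to check:

```python
# should show the linear kernel rather than 'rbf'
print(best_run.get_hyperparameters())
```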
Reproducing
We can load back the model to see it again for ourselves:
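A sketch, assuming the model was logged with `log_model()` as above:

```python
model = best_run.get_model()  # deserialize the stored model
print(model.score(X, y))      # same training accuracy as before
```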
Or we can retrain the model from scratch as a sanity check:
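Using the hyperparameters stored on the run, we can fit a fresh model and compare:

```python
hyperparams = best_run.get_hyperparameters()

retrained_model = svm.SVC(**hyperparams)
retrained_model.fit(X, y)
print(retrained_model.score(X, y))  # should match the logged accuracy
```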