Intro
This article explains how to build an AI/ML model in a Jupyter Workspace. The model is then uploaded to the Domo Model Management interface, where it can be deployed for real-time or batch inference.
Training Data
The example is intended to be simple and does not use machine learning to train a model. Instead, given a list of colored shapes, we define a simple algorithm that classifies each shape as blue or not blue.

| | Shape | Color | Blue |
|---|---|---|---|
| 0 | Circle | Red | 0 |
| 1 | Square | Blue | 1 |
| 2 | Oval | Green | 0 |
| 3 | Rectangle | Orange | 0 |
| 4 | Rectangle | Pink | 0 |
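As a minimal sketch, the table above can be held in the notebook as a plain list of records, with the Blue label derived from the color. The names `TRAINING_ROWS` and `label_blue` are illustrative assumptions and not part of any Domo API; a pandas DataFrame would work equally well.

```python
# Training data from the table above, as plain Python records.
# TRAINING_ROWS and label_blue are illustrative names, not part of any Domo API.
TRAINING_ROWS = [
    {"Shape": "Circle", "Color": "Red"},
    {"Shape": "Square", "Color": "Blue"},
    {"Shape": "Oval", "Color": "Green"},
    {"Shape": "Rectangle", "Color": "Orange"},
    {"Shape": "Rectangle", "Color": "Pink"},
]


def label_blue(row):
    """Return a copy of the row with a Blue column: 1 if the color is blue, else 0."""
    labeled = dict(row)
    labeled["Blue"] = 1 if row["Color"].strip().lower() == "blue" else 0
    return labeled


training_data = [label_blue(row) for row in TRAINING_ROWS]

# Print the rows in the same column order as the table.
for index, row in enumerate(training_data):
    print(index, row["Shape"], row["Color"], row["Blue"])
```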
Hyperparameters
When using machine learning to train a model, hyperparameters are required to configure the training process. Our example does not use any, but we include them here for reference.
Model Training
At this point, we would normally use a machine learning library to train a model to fit our training DataSet. For the purposes of this notebook, the simple classification algorithm described above serves as the model.
Validation
To ensure your model is ready for deployment, we recommend testing it with the invoke function. To keep things simple in this example, we test against the training DataSet.
Response
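A minimal sketch of the rule-based model, an invoke function, and the validation run described above follows; the final lines print the response. The `invoke(records)` signature, the helper `predict_blue`, and the record layout are assumptions made for illustration, so check the entrypoint signature your Domo environment actually expects.

```python
# A rule-based "model": no training, just a fixed classification rule.
# The invoke signature and the record layout are assumptions for illustration.
def predict_blue(color):
    """Classify a single color string: 1 for blue, 0 for anything else."""
    return 1 if str(color).strip().lower() == "blue" else 0


def invoke(records):
    """Hypothetical entrypoint: take a list of records with a "Color" key
    and return one prediction record per input row."""
    return [{"Blue": predict_blue(record["Color"])} for record in records]


# Validate against the training data from the table in this article.
training_data = [
    {"Shape": "Circle", "Color": "Red", "Blue": 0},
    {"Shape": "Square", "Color": "Blue", "Blue": 1},
    {"Shape": "Oval", "Color": "Green", "Blue": 0},
    {"Shape": "Rectangle", "Color": "Orange", "Blue": 0},
    {"Shape": "Rectangle", "Color": "Pink", "Blue": 0},
]

response = invoke(training_data)
expected = [row["Blue"] for row in training_data]
predicted = [row["Blue"] for row in response]

# Fraction of rows where the prediction matches the known label.
accuracy = sum(p == e for p, e in zip(predicted, expected)) / len(expected)

print(response)
print("accuracy:", accuracy)
```

Because the rule was written directly from the training data, a perfect match here only confirms that invoke runs end to end; it says nothing about how a trained model would generalize.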
Model Schema
Each model defines an input and output type and, optionally, a schema for CSV or JSON types (CSVModelIOConfiguration, JSONModelIOConfiguration). For example, a model may accept a CSV as input and return a CSV as output. In addition to creating a CSV schema manually, you can create one from a DataFrame, which may be simpler if a DataFrame is already in use.
Metrics
During training and validation, we can define metrics to measure model performance. Example metrics are included below as a reference. In addition to metric name and value, standard deviation and timestamp may be included.
Model Task
Domo lets you specify which task(s) your model is trained to perform, including:
- TEXT_GENERATION
- CLASSIFICATION
- OTHER
Kernel Snapshots
Domo Jupyter Workspaces allow you to customize your environment by installing third-party libraries. To ensure that the model hosting environment matches your customized Jupyter environment, a snapshot is created of the conda environment running the Jupyter kernel. A kernel snapshot is automatically created the first time you create a model in a workspace. If one or more snapshots already exist, the most recent snapshot is used for your model. If your environment has changed and you need to create a new snapshot, you can call create_model with create_snapshot=True.
Creating a new snapshot can take several minutes.
Create the Model
Upload the model to the Domo Model Management interface, where its performance can be compared with other models and, when it is ready, it can be deployed as an endpoint or as a DataFlow tile in Magic ETL. The following information is included:
- Name: The name of the model
- Entrypoint: The file containing the invoke function that is executed after the model is deployed
- Files: The serialized model and any other files required to execute the model
- Training: Hyperparameters and metrics discovered during training
- Tasks: A list of tasks the model supports
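The pieces listed above can be gathered into one structure before upload. The sketch below uses a plain dictionary and a small completeness check; every key, file name, and value is a hypothetical placeholder, and nothing here calls Domo. The actual arguments of create_model should be taken from the Domo documentation.

```python
from datetime import datetime, timezone

# Hypothetical description of the model to upload. Nothing here calls Domo;
# keys, file names, and values are placeholders for illustration only.
model_definition = {
    "name": "blue-shape-classifier",
    "entrypoint": "model.py",  # file containing the invoke function
    "files": ["model.py"],     # serialized model or other required files
    "training": {
        # Placeholder hyperparameters: unused by the rule-based example.
        "hyperparameters": {"learning_rate": 0.01, "epochs": 10},
        "metrics": [
            {
                "name": "accuracy",
                "value": 1.0,
                "standard_deviation": 0.0,
                "timestamp": datetime.now(timezone.utc).isoformat(),
            }
        ],
    },
    "tasks": ["CLASSIFICATION"],
}

# Tasks named in this article; Domo may support others.
KNOWN_TASKS = {"TEXT_GENERATION", "CLASSIFICATION", "OTHER"}
REQUIRED_FIELDS = ("name", "entrypoint", "files", "training", "tasks")


def check_definition(definition):
    """Return a list of problems found in the definition (empty if none)."""
    problems = [
        f"missing field: {field}"
        for field in REQUIRED_FIELDS
        if field not in definition
    ]
    unrecognized = set(definition.get("tasks", [])) - KNOWN_TASKS
    if unrecognized:
        problems.append(f"tasks not named in this article: {sorted(unrecognized)}")
    return problems


print(check_definition(model_definition))
```

Checking the structure locally before upload is a design choice of this sketch, not a Domo requirement; it simply catches a missing field before a potentially slow snapshot and upload step.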