- Explanation of training parameters
- The relationship of per-class and global performance statistics
- Viewing and downloading predictions on train/validation sets
- Analyzing the learning curves
- Analyzing the confusion matrix
- Using the model inside the platform or via REST API
You can download the images used in this tutorial here
You can download the video tutorial here
Training a single label classification model in detail video tutorial
Uploading and labeling images
To train a single label classification model, first you need to upload and label your images.
Training your single label classification model
Click Train, and then choose Single-Label Classification. Before you start training, you can set the model name and the training time in minutes.
Advanced training parameters
If you switch to the advanced view, you can set the Validation Set Size (%), Learning Rate and Batch Size. Note: the Batch Size parameter has since been removed. The images that you have chosen to train this model are split into a train set and a validation set, and the Validation Set Size (%) controls this split. The learning rate controls how large each update to the model weights is during training. The batch size determines how many images the model sees during each training step.
You will also see the ‘Estimated training steps’ that the model will take, which depends on the training time you have set, and the ‘Estimated time to calculate ‘bottleneck’ features’. This calculation runs on a GPU, so it counts towards the monthly time limit on your account.
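To make the Validation Set Size (%) setting concrete, here is a minimal sketch of a random hold-out split. It is an illustration only: the platform does not document how it performs the split, so the shuffling, the `seed` argument, the function name `split_train_validation` and the file names are all assumptions.

```python
import random


def split_train_validation(image_paths, validation_pct, seed=0):
    """Shuffle the images and hold out validation_pct percent for validation.

    Hypothetical helper: a plain random split, not the platform's own logic.
    """
    if not 0 < validation_pct < 100:
        raise ValueError("validation_pct must be between 0 and 100 (exclusive)")
    shuffled = list(image_paths)
    random.Random(seed).shuffle(shuffled)
    # Keep at least one image on each side of the split.
    n_val = max(1, round(len(shuffled) * validation_pct / 100))
    n_val = min(n_val, len(shuffled) - 1)
    return shuffled[n_val:], shuffled[:n_val]


images = [f"img_{i:03d}.jpg" for i in range(50)]  # hypothetical file names
train, validation = split_train_validation(images, validation_pct=20)
print(len(train), len(validation))  # 40 10
```

A larger validation percentage gives more reliable validation statistics but leaves fewer images for training.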
The training process
Once you have set all of your parameters, click Start to begin training your model. You can track the training progress at the top of the screen. You can cancel the training or delete the model by clicking on the bin icon.
Analyzing your model - Per-class and Global performance statistics
Click on ‘View training statistics’ to view the results of your trained model. The results are split into Train and Validation results. Under ‘Train’, you can see the Label Count for the images, the global statistics for Accuracy, Precision, Recall and F1, and the per-class statistics for Precision, Recall and F1, all calculated on the images that were used to train the model. The ‘Validation’ section provides the same figures, but for images that were not used to train the model.
Global statistics show the results for the whole dataset, whilst per-class statistics are specific to one label. In the basic view, the global statistics are a weighted average of the per-class statistics.
The most intuitive performance measure is Accuracy. However, it is not always the best measure, for example when the number of images per label in the dataset is unbalanced. For these situations, we also provide three other performance measures: Precision, Recall and F1. You can see the definition of each statistic by clicking on its name.
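The relationship between per-class and global statistics can be sketched in a few lines of Python. This uses the standard textbook definitions of Precision, Recall and F1 and weights each class by its number of true images. The platform's exact implementation is not documented here, so treat the function names and the toy labels as assumptions.

```python
def per_class_stats(y_true, y_pred):
    """Precision, recall, F1 and support (true image count) for each label."""
    stats = {}
    for label in sorted(set(y_true) | set(y_pred)):
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        stats[label] = {"precision": precision, "recall": recall,
                        "f1": f1, "support": tp + fn}
    return stats


def global_stats(y_true, y_pred):
    """Accuracy plus support-weighted averages of the per-class statistics."""
    stats = per_class_stats(y_true, y_pred)
    total = len(y_true)
    result = {"accuracy": sum(t == p for t, p in zip(y_true, y_pred)) / total}
    for metric in ("precision", "recall", "f1"):
        result[metric] = sum(s[metric] * s["support"] for s in stats.values()) / total
    return result


# Unbalanced toy dataset: a model that always predicts "ok".
y_true = ["ok"] * 9 + ["defect"]
y_pred = ["ok"] * 10
per_class = per_class_stats(y_true, y_pred)
overall = global_stats(y_true, y_pred)
print(f"accuracy = {overall['accuracy']:.2f}")                  # accuracy = 0.90
print(f"defect recall = {per_class['defect']['recall']:.2f}")   # defect recall = 0.00
```

The toy data shows why Accuracy can mislead on unbalanced datasets: the model scores 90% accuracy while never finding a single "defect" image.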
Viewing and downloading predictions on train / validation sets
You can see the actual predictions based on labels by clicking on ‘Show Predictions’. A prediction is judged to be correct if the predicted class matches the label that was manually added by the user. You can filter the results to show only the correct or only the incorrect predictions. To download the predictions, click on ‘Download Predictions’, which downloads a .zip file containing the resized images, the ground truth labels and the predictions.
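The correctness rule and the correct / incorrect filter described above can be expressed as a short sketch. The `Prediction` record and its field names are hypothetical and are not the format of the downloaded .zip file.

```python
from dataclasses import dataclass


@dataclass
class Prediction:
    image: str      # hypothetical file name
    label: str      # ground truth label added by the user
    predicted: str  # class predicted by the model

    @property
    def correct(self) -> bool:
        # A prediction is correct when the predicted class matches the label.
        return self.predicted == self.label


def filter_predictions(predictions, correct=None):
    """Return all predictions, or only the correct / incorrect ones."""
    if correct is None:
        return list(predictions)
    return [p for p in predictions if p.correct == correct]


predictions = [
    Prediction("img_001.jpg", label="cat", predicted="cat"),
    Prediction("img_002.jpg", label="dog", predicted="cat"),
    Prediction("img_003.jpg", label="dog", predicted="dog"),
]
wrong = filter_predictions(predictions, correct=False)
print([p.image for p in wrong])  # ['img_002.jpg']
```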
Understanding your Best Model
Once the model has been trained, you can see the ‘Train time’ taken, as well as the time at which the Best Model was saved. The best model is chosen by the classification error on the validation set. If you switch to the advanced view, you can see more performance statistics, as well as the advanced parameters: Validation set size, learning rate and batch size.
Analyzing the learning curves
In addition, you can see the validation curves: a learning curve for accuracy and a learning curve for cross entropy loss. The red line on the learning curves shows when the best model was saved. The classification error that is used to choose the best model is the cross entropy loss.
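As a rough illustration of how a best model can be picked from a validation loss curve, the sketch below computes the cross entropy loss with its standard definition and selects the training step with the lowest value. The step numbers, the loss values and the function names are made up for the example, and how often the platform evaluates or saves the model is not stated here.

```python
import math


def cross_entropy_loss(probabilities, true_indices, eps=1e-12):
    """Mean negative log-probability that the model assigns to the true class."""
    losses = [-math.log(max(probs[true], eps))
              for probs, true in zip(probabilities, true_indices)]
    return sum(losses) / len(losses)


def best_checkpoint(validation_losses):
    """Return (step, loss) for the step with the lowest validation loss."""
    return min(validation_losses.items(), key=lambda item: item[1])


# A model that is unsure between two classes has a loss of ln(2).
uncertain_loss = cross_entropy_loss([[0.5, 0.5]], [0])

# Hypothetical validation losses recorded at four training steps.
validation_losses = {100: 0.92, 200: 0.61, 300: 0.48, 400: 0.53}
best_step, best_loss = best_checkpoint(validation_losses)
print(best_step, best_loss)  # 300 0.48
```

In this example the loss rises again after step 300, so step 300 is where the red line would sit on the curve.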
The other advanced statistics available in the advanced view include Cross entropy loss (where lower values are better) and the Matthews correlation coefficient, which is shown as a percentage.
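For reference, the Matthews correlation coefficient can be computed from the labels and predictions alone. The sketch below uses the standard multi-class formula, assuming the platform follows it. The function name and the toy labels are assumptions for illustration.

```python
import math
from collections import Counter


def matthews_corrcoef(y_true, y_pred):
    """Multi-class Matthews correlation coefficient (ranges up to 1.0)."""
    total = len(y_true)
    n_correct = sum(t == p for t, p in zip(y_true, y_pred))
    true_counts = Counter(y_true)   # how often each label truly occurs
    pred_counts = Counter(y_pred)   # how often each label is predicted
    labels = set(true_counts) | set(pred_counts)
    numerator = n_correct * total - sum(true_counts[k] * pred_counts[k] for k in labels)
    spread_true = total ** 2 - sum(true_counts[k] ** 2 for k in labels)
    spread_pred = total ** 2 - sum(pred_counts[k] ** 2 for k in labels)
    denominator = math.sqrt(spread_true * spread_pred)
    # By convention, return 0 when only one class is present or predicted.
    return numerator / denominator if denominator else 0.0


y_true = ["cat"] * 6 + ["dog"] * 3 + ["bird"]
y_pred = ["cat"] * 5 + ["dog"] + ["dog"] * 2 + ["cat"] + ["cat"]
mcc = matthews_corrcoef(y_true, y_pred)
print(f"{mcc:.1%}")  # 39.9%
```

A value of 100% means every prediction is correct, and a value near 0% means the predictions are no better than chance given the class frequencies.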
Additionally, for Precision, Recall and F1, you can view the statistics as Micro average, Macro average and Weighted average.
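The three averaging modes differ only in how the per-class values are combined. The sketch below shows this for Precision, using the standard definitions. The same pattern applies to Recall and F1. The function name and the toy labels are assumptions.

```python
def averaged_precision(y_true, y_pred):
    """Micro, macro and weighted averages of per-class precision."""
    labels = sorted(set(y_true) | set(y_pred))
    tp = {k: 0 for k in labels}       # true positives per label
    fp = {k: 0 for k in labels}       # false positives per label
    support = {k: 0 for k in labels}  # number of true images per label
    for t, p in zip(y_true, y_pred):
        support[t] += 1
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1
    per_class = {k: tp[k] / (tp[k] + fp[k]) if tp[k] + fp[k] else 0.0
                 for k in labels}
    total_tp, total_fp = sum(tp.values()), sum(fp.values())
    return {
        # Micro: pool the counts over all classes, then compute once.
        "micro": total_tp / (total_tp + total_fp),
        # Macro: plain mean, every class counts equally.
        "macro": sum(per_class.values()) / len(labels),
        # Weighted: mean weighted by the number of true images per class.
        "weighted": sum(per_class[k] * support[k] for k in labels) / len(y_true),
    }


y_true = ["cat"] * 6 + ["dog"] * 3 + ["bird"]
y_pred = ["cat"] * 5 + ["dog"] + ["dog"] * 2 + ["cat"] + ["cat"]
averages = averaged_precision(y_true, y_pred)
print({name: round(value, 3) for name, value in averages.items()})
```

In this toy dataset the rare "bird" class is never predicted correctly, which pulls the macro average down much more than the micro or weighted averages.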
Analyzing the confusion matrix
Finally, you can view the confusion matrix, which shows how often images with one label are predicted as a different label.
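A confusion matrix is simply a table of counts. The minimal sketch below puts the true labels in the rows and the predicted labels in the columns. That orientation is an assumption, since the platform's layout is not described here, and the function name and toy labels are also illustrative.

```python
def confusion_matrix(y_true, y_pred, labels):
    """Count matrix: rows are true labels, columns are predicted labels."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for t, p in zip(y_true, y_pred):
        matrix[index[t]][index[p]] += 1
    return matrix


labels = ["bird", "cat", "dog"]
y_true = ["cat"] * 6 + ["dog"] * 3 + ["bird"]
y_pred = ["cat"] * 5 + ["dog"] + ["dog"] * 2 + ["cat"] + ["cat"]
matrix = confusion_matrix(y_true, y_pred, labels)
for label, row in zip(labels, matrix):
    print(f"{label:>5}: {row}")
```

Correct predictions sit on the diagonal. Every off-diagonal cell counts images of the row's label that were predicted as the column's label, so here the single "bird" image appears in the "cat" column.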
Using the model to make predictions
Once you are happy with your model’s performance, you can start to use it to make predictions on a new set of images. To do so, simply click on ‘Make a new Prediction’. Then, upload the images that you want to test. These images will automatically be tested, and a prediction for their classification will be generated. As there is no ground truth label to compare the prediction to, the model will list the predictions from highest to lowest, but all predictions will be the same colour. These new predictions can be downloaded in JSON format, or can be downloaded in groups sorted by the predicted label.
You can also make predictions with your model through the REST API. For this, you will need the API token, the Project ID (both can be found under the user profile), and the model name. Note: the tutorial was filmed before the “Download offline model” feature was available.