Computer vision algorithms and models perform three groups of operations on regular images: image classification, object detection, and semantic/instance segmentation (which we will simply call “segmentation” from here on). Don't worry, you do not need to know anything about algorithm architectures!

How you label depends on the type of model you use: the labels you feed into the algorithm must be of the same kind as the output you expect from it. For example, if you want the model to return bounding boxes, you must label your images with bounding boxes.

The algorithm is then trained on a set of images with human-labeled data (the training data) and learns to predict classes, bounding boxes, or contours for previously unseen images.
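To make the link between labels and model output more concrete, here is a minimal sketch of what a label might look like for each of the three model types. The field names (`image`, `classes`, `boxes`, `contours`, and so on) and the file name are hypothetical and chosen only for illustration; they are not the platform's actual export format.

```python
# Hypothetical label records for one image. Field names are illustrative
# assumptions, not an actual schema.

# Image classification: the label is one or more class names.
classification_label = {
    "image": "img_001.jpg",
    "classes": ["cat"],
}

# Object detection: the label is a list of bounding boxes, each with a class.
detection_label = {
    "image": "img_001.jpg",
    "boxes": [
        {"class": "cat", "x_min": 34, "y_min": 50, "x_max": 210, "y_max": 300},
    ],
}

# Segmentation: the label is a list of contours (polygons), each with a class.
segmentation_label = {
    "image": "img_001.jpg",
    "contours": [
        {"class": "cat", "points": [(34, 50), (210, 50), (210, 300), (34, 300)]},
    ],
}
```

In each case the structure of the label mirrors what the trained model is expected to return: class names in, class names out; boxes in, boxes out; contours in, contours out.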

Currently, only single-label and multi-label classification models can be trained. To train other types of models, choose a custom project. Definitions and examples of image classification are below.

  • 1. Image Classification - single class per image

    Here, each image is labeled with the one class it belongs to. This is called single-label classification.


  • 2. Image Classification - multiple classes per image

    You may also want to specify more than one class for the image. This is called multi-label classification.

    In single-label classification, the model predicts the one specified class that has the highest probability. A multi-label model, by contrast, predicts every specified class whose probability is higher than a set threshold.

    Note: during prediction, the optimal thresholds are set automatically, but the user can modify them.
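The difference between the two prediction rules can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, class names, probability values, and the default threshold of 0.5 are all hypothetical, and the platform's own method for choosing thresholds is not shown here.

```python
def predict_single_label(probabilities):
    """Return the one class with the highest probability."""
    return max(probabilities, key=probabilities.get)


def predict_multi_label(probabilities, thresholds, default_threshold=0.5):
    """Return every class whose probability is higher than its threshold.

    Classes missing from `thresholds` fall back to `default_threshold`
    (an assumed value, used here only for illustration).
    """
    return [
        name
        for name, probability in probabilities.items()
        if probability > thresholds.get(name, default_threshold)
    ]


# Hypothetical model outputs for one image.
single_scores = {"cat": 0.70, "dog": 0.25, "bird": 0.05}
multi_scores = {"cat": 0.72, "dog": 0.55, "bird": 0.08}

# Single-label: exactly one class is returned.
single_prediction = predict_single_label(single_scores)

# Multi-label: every class above its threshold is returned.
multi_prediction = predict_multi_label(multi_scores, thresholds={})

# Raising the threshold for "dog" removes it from the prediction.
stricter_prediction = predict_multi_label(multi_scores, thresholds={"dog": 0.6})
```

With these example values, the single-label prediction is `cat`, the multi-label prediction contains `cat` and `dog`, and the stricter threshold leaves only `cat`. This shows why adjusting thresholds changes which classes a multi-label model reports.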