Topics covered:
  • Labeling by image similarity feature
  • Changing parameters
  • Adjusting suggested labels manually
  • Performing AI-assisted labeling iteratively
  • Downloading classification labels

You can download the video tutorial here.

Label by Similarity Search Transcript

Introduction
Welcome to this label by similarity search tutorial. This feature is very useful if you want to label your images for classification, because you can use AI-assisted labeling capabilities without training any model.
What you will need to Label by Similarity
To label images by similarity, you first need a project in which some of your images are already labeled. Then you can click image similarity, label by similarity, upload images, and select some unlabeled images from your computer. You can change some of the parameters, but I'll explain them later. Then click confirm. Here you can see the results of image labeling by similarity. On the left side, you can see the query images that you uploaded.
Label Thresholds
On the right side, you can see the most similar images in your data set that were found for this image. Since these images are labeled, the suggested label for your query image depends on the labels of those already labeled images. In this case, the top 10 images all have the same label, so you have only one suggested label for your image. In other cases, there might be several different labels among the similar images. By default, we use a score threshold of 30%, and all labels that score above 30% are checked by default. Here, since none of the labels are above 30%, none of them are checked; however, you can check one of them manually. The score threshold for checking labels by default can be changed here, so if I set it to 20%, for example, both chanterelle and honey fungus will be checked.
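To make the threshold behavior concrete, here is a minimal Python sketch. It assumes that a label's score is the share of the similar images that carry that label; the platform's actual scoring formula is not described here, so the function name `suggest_labels` and the formula are illustrative assumptions. The labels porcini and fly agaric are made-up example data.

```python
from collections import Counter


def suggest_labels(neighbor_labels, score_threshold=0.30):
    """Score each label by the share of similar images that carry it.

    neighbor_labels: one list of labels per similar image.
    Returns (scores, checked), where checked holds the labels whose
    score is strictly above the threshold (assumed scoring rule).
    """
    total = len(neighbor_labels)
    if total == 0:
        return {}, set()
    # Count each label at most once per similar image.
    counts = Counter(label for labels in neighbor_labels for label in set(labels))
    scores = {label: n / total for label, n in counts.items()}
    checked = {label for label, score in scores.items() if score > score_threshold}
    return scores, checked


# Hypothetical labels of the 10 most similar images.
neighbors = (
    [["chanterelle"]] * 3
    + [["honey fungus"]] * 3
    + [["porcini"]] * 2
    + [["fly agaric"]] * 2
)

_, checked_at_30 = suggest_labels(neighbors, score_threshold=0.30)
print(sorted(checked_at_30))  # → []

_, checked_at_20 = suggest_labels(neighbors, score_threshold=0.20)
print(sorted(checked_at_20))  # → ['chanterelle', 'honey fungus']
```

With the threshold at 30%, no label scores above it, so nothing is checked. Lowering it to 20% checks the two labels that score 30%.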
Top Scoring Labels
You can also choose to mark only the top scoring label, so that only the top scoring label is ever checked in your results. This is useful if you are labeling images for single-label classification. By default, we use the 10 most similar images to suggest a label for the query image. However, in some cases only the top one or two images have the correct label, as in this case. You can change the number of images that we use to suggest the label here. For example, if I change it to two, only the top two images will be used to suggest the label for the query image. You can also change it so that only the single most similar image is used. Or you can set it to a higher number, let's say 20, so that even more images are used to suggest the label for the query image. Please note that changing either the number of results or the score threshold will cancel any manual changes that you have made. For example, if I change it here, all the checkboxes that I checked will be reset.
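The effect of the number of results and of the top-scoring-label option can be sketched in a few lines of Python. This is only an illustration under stated assumptions: a label's score is taken to be the share of the top `num_results` similar images that carry it, and the names `label_scores` and `checked_labels` are hypothetical, not part of the platform. The neighbor labels are made-up example data.

```python
from collections import Counter


def label_scores(ranked_neighbor_labels, num_results=10):
    """Score labels using only the top `num_results` most similar images."""
    top = ranked_neighbor_labels[:num_results]
    if not top:
        return {}
    counts = Counter(label for labels in top for label in set(labels))
    return {label: n / len(top) for label, n in counts.items()}


def checked_labels(scores, score_threshold=0.30, top_only=False):
    """Return the labels that would be checked by default (assumed rule)."""
    if not scores:
        return set()
    if top_only:
        # Only the single highest-scoring label is checked.
        return {max(scores, key=scores.get)}
    return {label for label, score in scores.items() if score > score_threshold}


# Hypothetical ranking: only the two most similar images have the correct label.
ranked = (
    [["chanterelle"]] * 2
    + [["honey fungus"]] * 5
    + [["porcini"]] * 3
)

print(sorted(checked_labels(label_scores(ranked, num_results=10))))
# → ['honey fungus']
print(sorted(checked_labels(label_scores(ranked, num_results=2))))
# → ['chanterelle']
print(sorted(checked_labels(label_scores(ranked, num_results=10), top_only=True)))
# → ['honey fungus']
```

In this example, reducing the number of results to two lets the correct label win, which matches the situation where only the top one or two images carry the correct label.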
Adding images to the dataset
Once you have manually reviewed and corrected all the suggested labels, you can click select all images and then add to the data set to add your results to your data set. The labeled images will be marked with an auto labeled mark in your data set. Another way to use the label by similarity tool is to apply it to images that are already uploaded to your data set. For this, first filter the images in your data set that do not contain classification labels, then select either all or some of those images, right click on an image, click AI tools, and then label by similarity. The rest of the process is the same as I explained before. The beauty of labeling images by similarity is that it can be done iteratively. For example, in the previous step I labeled these images by similarity.
Suggestions for new query images
Now, when I label these images by similarity, the previously labeled images will already be used to give suggestions for the new query images. So our suggestion is to upload a lot of unlabeled images, filter the images that do not contain classification labels, and label them in batches, for example 10 or 20 images each time. That way, each subsequent batch will be labeled more and more accurately automatically, without any need for manual review. Of course, you can also organize batches by uploading images from your PC by clicking here and selecting upload images. Each time, you can upload 10 or 20 images, or whatever number you think is a suitable batch size in your case.
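The batch workflow can be pictured with a small Python sketch. Everything in it is an assumption made for illustration: images are represented by toy feature vectors, similarity is cosine similarity, and the suggested label is the most common label among the `k` nearest labeled images. The platform's actual features and similarity measure are not described here, and the function names are hypothetical.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def suggest_label(query, labeled, k=10):
    """Most common label among the k labeled vectors most similar to query."""
    ranked = sorted(labeled, key=lambda item: cosine(query, item[0]), reverse=True)
    counts = Counter(label for _, label in ranked[:k])
    return counts.most_common(1)[0][0]


def label_in_batches(labeled, unlabeled, batch_size=10, k=10):
    """Label `unlabeled` batch by batch, growing the labeled pool each time."""
    pool = list(labeled)
    for start in range(0, len(unlabeled), batch_size):
        batch = unlabeled[start:start + batch_size]
        # Suggestions for this batch use the pool as it stood before the batch.
        suggestions = [(vec, suggest_label(vec, pool, k)) for vec in batch]
        # In the real tool, manual review and correction would happen here.
        pool.extend(suggestions)
    return pool


# Toy data: two labeled images and four unlabeled ones.
labeled = [((1.0, 0.0), "chanterelle"), ((0.0, 1.0), "honey fungus")]
unlabeled = [(0.9, 0.1), (0.1, 0.9), (0.8, 0.3), (0.2, 0.7)]

result = label_in_batches(labeled, unlabeled, batch_size=2, k=1)
print([label for _, label in result[2:]])
# → ['chanterelle', 'honey fungus', 'chanterelle', 'honey fungus']
```

The second batch is matched against the images labeled in the first batch as well as the original ones, which is the reason later batches can receive better suggestions.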
Downloading the Labels
Don't forget that after you finish labeling your images for classification, you can always select all images and download the classification labels by right clicking on one of the images, clicking image operations, then download, and selecting download classification labels. A zip file containing the classification labels in CSV and JSON format will be downloaded. If you want, you can also use our platform to train a classification model and deploy your model on the cloud. So this is all I wanted to share in this tutorial about labeling images by similarity.
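If you want to inspect the downloaded zip file in code, a minimal Python sketch could look like the following. The file names and the column layout inside the archive are not described here, so the sketch makes no assumption about them: it simply parses every CSV and JSON file it finds. The function name `load_label_files`, the example path, and the UTF-8 encoding are assumptions.

```python
import csv
import io
import json
import zipfile


def load_label_files(zip_path):
    """Parse every .csv and .json file in the archive.

    Returns a dict mapping each file name to its parsed content:
    a list of rows for CSV files, the decoded object for JSON files.
    Assumes the files are UTF-8 encoded.
    """
    contents = {}
    with zipfile.ZipFile(zip_path) as archive:
        for name in archive.namelist():
            lower = name.lower()
            if lower.endswith(".csv"):
                text = archive.read(name).decode("utf-8")
                contents[name] = list(csv.reader(io.StringIO(text)))
            elif lower.endswith(".json"):
                contents[name] = json.loads(archive.read(name).decode("utf-8"))
    return contents


# Example usage (the path is hypothetical):
# for name, data in load_label_files("classification_labels.zip").items():
#     print(name, type(data).__name__)
```

Looking at the parsed rows first is a simple way to learn the actual column names before writing any code that depends on them.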