Yesterday we released a new version of our platform with updates and performance improvements to our image annotation service. Our labeling tool gains functionality day by day, so we decided to write a more extensive blog post about the challenges of image annotation and how you can use SentiSight.ai to solve them.
Image annotation is the process of classifying images and creating labels to describe the objects within them. It is a crucial stepping stone in a supervised machine learning project because the quality of the initial data determines the quality of the final model. A mislabeled image can cause the model to be trained incorrectly and, consequently, to produce undesirable results. To train a neural network model well, data scientists collect large datasets containing hundreds of images or more, and labeling all of them correctly is a tedious, resource-heavy and lengthy process. The more people annotate within the same project, the more confusing it can get: images can be duplicated, mislabeled or not labeled at all. Therefore, having a good management system is a must. To make image annotation more efficient, programmers have developed numerous data labeling tools that allow for quicker and more precise annotation. One of these tools is our own SentiSight.ai.
In computer vision, there are several labeling options, ranging from the simplest to the most complex. The most commonly used annotation type is the bounding box: a rectangle that marks the location of an object and is represented by four coordinates that define its corners. The SentiSight.ai platform also provides a tool for adding key points to a bounding box; once saved, the key points can be reused in the same order on another selection. However, bounding boxes are not a very precise way to label images, since most objects are not rectangular.
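To illustrate why a rectangle is an imprecise label, here is a minimal Python sketch. It assumes a box is stored as two opposite corners (x_min, y_min, x_max, y_max), which is one common convention and not necessarily the format SentiSight.ai uses. The class and function names are made up for this example. It measures how much of the box a non-rectangular object actually fills.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BoundingBox:
    """Axis-aligned box stored as two opposite corners (assumed layout)."""
    x_min: float
    y_min: float
    x_max: float
    y_max: float

    @property
    def area(self) -> float:
        return (self.x_max - self.x_min) * (self.y_max - self.y_min)


def enclosing_box(points):
    """Smallest axis-aligned box that contains every (x, y) point."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return BoundingBox(min(xs), min(ys), max(xs), max(ys))


def polygon_area(points):
    """Area of a simple polygon, computed with the shoelace formula."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(points, points[1:] + points[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


# A hypothetical triangular object outline, in pixel coordinates.
triangle = [(0, 0), (10, 0), (5, 8)]
box = enclosing_box(triangle)
fill_ratio = polygon_area(triangle) / box.area

print(box)
print(f"The object covers {fill_ratio:.0%} of its bounding box")
```

For this triangle only half of the box belongs to the object; the rest is background that the label nevertheless claims.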
Alternatively, our platform makes it possible to define an object accurately by drawing a complex polygon around it. Polygons can be edited by adding, moving or removing individual anchor points. The polygon annotation tool is useful when working with occluded objects, since it can combine separate parts into one joint structure or cut holes out of the initial selection.
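One way to picture a polygon label with joined parts and cut-out holes is as a list of (outer ring, holes) pairs. The Python sketch below uses that representation, which is an assumption made for illustration and not SentiSight.ai's internal format, together with the standard ray-casting test to decide whether a pixel belongs to the label. All names are hypothetical.

```python
def point_in_ring(point, ring):
    """Ray-casting test: is `point` inside the closed ring of (x, y) vertices?"""
    x, y = point
    inside = False
    for (x1, y1), (x2, y2) in zip(ring, ring[1:] + ring[:1]):
        # Does this edge straddle the horizontal line through the point?
        if (y1 > y) != (y2 > y):
            # x-coordinate where the edge crosses that horizontal line.
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def point_in_region(point, outer, holes=()):
    """Inside the outer ring and outside every hole."""
    if not point_in_ring(point, outer):
        return False
    return not any(point_in_ring(point, hole) for hole in holes)


def point_in_label(point, parts):
    """A label may join several separate parts, each an (outer, holes) pair."""
    return any(point_in_region(point, outer, holes) for outer, holes in parts)


# Hypothetical label: a square with a hole, plus a second, separate part,
# as might happen when something occludes the middle of an object.
left_part = [(0, 0), (10, 0), (10, 10), (0, 10)]
hole = [(4, 4), (6, 4), (6, 6), (4, 6)]
right_part = [(12, 0), (16, 0), (16, 10), (12, 10)]
parts = [(left_part, [hole]), (right_part, [])]

for pixel in [(2, 2), (5, 5), (14, 5), (11, 5)]:
    print(pixel, point_in_label(pixel, parts))
```

The pixel inside the hole and the pixel in the gap between the two parts are both excluded, while pixels in either visible part count as the same object.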
When an object has a complex shape, placing numerous separate points to form a polygon can become too laborious. For this task, drawing a free-form bitmap by hand is often the easier option. The tool works like a paintbrush, masking the selected area with a specific color, and it can be used either as a drawing brush or as an eraser, which helps to maintain accuracy. To speed up labeling, the resulting bitmap can be converted to a polygon and vice versa. For even more complex objects, SentiSight.ai offers an AI-assisted smart labeling feature, similar to the ‘Magic Wand Tool’ in Adobe Photoshop, which can complete the task for the user. The user only needs to choose the object area and roughly mark the foreground and background of the image with the given selection tools; the AI algorithm then does the rest by extracting the object. The tool works best with high-contrast images, but even when the picture quality is subpar, the process can be repeated as many times as needed to reach a satisfactory result.
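The brush-and-eraser idea can be sketched in a few lines of Python. This is only a toy model that assumes the bitmap is a grid of booleans and the brush is a filled circle; it does not describe how the SentiSight.ai tool is implemented, and the function names are invented for the example.

```python
def new_mask(width, height):
    """Empty bitmap: a grid of booleans, False meaning 'not part of the object'."""
    return [[False] * width for _ in range(height)]


def stamp(mask, cx, cy, radius, value):
    """Set every pixel within `radius` of (cx, cy) to `value`."""
    height, width = len(mask), len(mask[0])
    for y in range(max(0, cy - radius), min(height, cy + radius + 1)):
        for x in range(max(0, cx - radius), min(width, cx + radius + 1)):
            if (x - cx) ** 2 + (y - cy) ** 2 <= radius ** 2:
                mask[y][x] = value


def stroke(mask, path, radius, value=True):
    """Drag the brush along a path; value=False turns the brush into an eraser."""
    for cx, cy in path:
        stamp(mask, cx, cy, radius, value)


def masked_pixels(mask):
    """Number of pixels currently marked as belonging to the object."""
    return sum(sum(row) for row in mask)


mask = new_mask(width=7, height=5)

# Paint a horizontal stroke with a brush of radius 1.
stroke(mask, [(1, 2), (2, 2), (3, 2), (4, 2), (5, 2)], radius=1)
painted = masked_pixels(mask)

# Erase a spot in the middle of the stroke with the same brush size.
stroke(mask, [(3, 2)], radius=1, value=False)
remaining = masked_pixels(mask)

for row in mask:
    print("".join("#" if pixel else "." for pixel in row))
print(painted, remaining)
```

Using one routine for both painting and erasing keeps corrections cheap: an overshoot is fixed by a second stroke, not by redrawing the whole mask.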
By offering a user-friendly interface, SentiSight.ai makes image annotation tasks a lot easier. The visibility of all labels can be turned on or off, and their opacity can be adjusted to make the objects behind them easier to see. The tools have keyboard shortcuts that significantly speed up the labeling process. After finishing a task, all the selected labels can be downloaded in JSON or CSV format. The colored bitmaps needed for semantic segmentation, which is used in self-driving cars and robotics, can be downloaded together with the original images. The same goes for the black-and-white bitmaps used in instance segmentation. The platform is suitable both for beginners, who get a straightforward user guide and video tutorials, and for experts, who can improve their experience by using the advanced features. Depending on the size of the project, SentiSight.ai is free to start with, which makes it accessible to anyone willing to give it a shot.
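As a rough idea of what working with exported labels can look like, the Python sketch below writes a few bounding-box annotations to JSON and CSV using only the standard library. The field names and file names are assumptions made for this example; the actual layout of a SentiSight.ai export may differ, so check a real download before relying on it.

```python
import csv
import io
import json

# Hypothetical column layout; not the platform's documented schema.
FIELDS = ["image", "label", "x_min", "y_min", "x_max", "y_max"]

annotations = [
    {"image": "cat_001.jpg", "label": "cat",
     "x_min": 12, "y_min": 30, "x_max": 180, "y_max": 200},
    {"image": "dog_004.jpg", "label": "dog",
     "x_min": 5, "y_min": 8, "x_max": 96, "y_max": 120},
]


def to_json(records):
    """Serialize the annotation records as a JSON array."""
    return json.dumps(records, indent=2)


def to_csv(records):
    """Serialize the annotation records as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


print(to_json(annotations))
print(to_csv(annotations))
```

JSON keeps nested structures such as polygons with holes intact, while CSV is convenient for flat records like boxes that will be opened in a spreadsheet.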
To train a deep neural network model well, large datasets containing thousands, sometimes even millions, of annotated pictures are needed. Labeling them one by one is still time-consuming, even with the help of useful tools. Therefore, SentiSight.ai offers AI-assisted labeling that enables an iterative labeling process. To work with it, the user annotates a small number of images and trains a neural network model on them; the model is then used to make predictions on the rest of the dataset. The platform allows the annotator to review the results and correct specific labels if needed. Because the annotator starts from suggested annotations, much less time is spent creating new ones from scratch. Afterward, the reviewed images can be added to the training set so that a more accurate version of the model can be trained. Repeating the process a few more times with more pictures gives the user better results with every iteration.
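The iterative loop described above can be sketched in Python as follows. To keep the example self-contained, a toy nearest-centroid classifier over a single number per image stands in for the neural network, and a ground-truth function stands in for the human reviewer. Both are assumptions for illustration, and none of the names correspond to the SentiSight.ai API.

```python
from statistics import mean


def train(labeled):
    """Toy 'model': the mean feature value of each class (nearest centroid)."""
    by_class = {}
    for feature, label in labeled:
        by_class.setdefault(label, []).append(feature)
    return {label: mean(values) for label, values in by_class.items()}


def predict(model, feature):
    """Suggest the class whose centroid is closest to the feature."""
    return min(model, key=lambda label: abs(model[label] - feature))


def iterative_labeling(seed, unlabeled, reviewer, batch_size):
    """Train, suggest labels for a batch, let the reviewer fix them, repeat."""
    labeled = list(seed)
    corrections_per_round = []
    for start in range(0, len(unlabeled), batch_size):
        model = train(labeled)  # retrain on everything labeled so far
        corrections = 0
        for feature in unlabeled[start:start + batch_size]:
            suggested = predict(model, feature)
            final = reviewer(feature)  # the human accepts or corrects
            if final != suggested:
                corrections += 1
            labeled.append((feature, final))
        corrections_per_round.append(corrections)
    return labeled, corrections_per_round


def reviewer(brightness):
    """Stand-in for the annotator: the 'true' label of a made-up feature."""
    return "dark" if brightness < 0.5 else "bright"


seed = [(0.0, "dark"), (0.6, "bright")]  # small hand-labeled starting set
unlabeled = [0.4, 0.45, 0.2, 0.8, 0.35, 0.7, 0.1, 0.9]

labeled, corrections_per_round = iterative_labeling(
    seed, unlabeled, reviewer, batch_size=4
)
print(corrections_per_round)
```

On this made-up data the reviewer has to correct two suggestions in the first round and none in the second, which is the effect the iterative process aims for: each retrained model leaves less manual work for the next batch.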
As we enter a new era of AI technology, with deep learning algorithms used for autonomous vehicles, facial recognition and robotics, image labeling tasks are becoming more important than ever. Some projects require a huge number of images to be labeled, which cannot be done by one or two labelers, and for large teams working on a single project the lack of a strong management system can prove to be a challenge. On our platform, users can share projects within their team, and supervisors can add new roles and manage permissions. The time spent labeling images can also be tracked, either through day-to-day statistics or as the overall time spent on a project, and these summaries can be downloaded in CSV format. The platform allows users to filter images by type (‘seen by you’, ‘labeled by you’, ‘marked as validation set’, etc.), by name and by image status, which can be set to ‘seen’ or ‘labeled’ by particular team members. This helps to avoid duplication, to track your work and to remember what has already been done. Users can also manually mark images as ‘seen’ or ‘unseen’, for example when the project guidelines have changed and annotations need to be revised. Since SentiSight.ai is an online tool, supervisors can check everyone’s work in real time.
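The kind of status filtering described above can be modeled with a small Python sketch. The record structure, field names and user names are all hypothetical and are only meant to show how per-user ‘seen’ and ‘labeled’ flags help a team avoid labeling the same image twice; they are not the platform's data model.

```python
from dataclasses import dataclass, field


@dataclass
class ImageRecord:
    """Hypothetical per-image status record."""
    name: str
    seen_by: set = field(default_factory=set)
    labeled_by: set = field(default_factory=set)
    validation: bool = False


def filter_images(images, *, seen_by=None, labeled_by=None,
                  validation=None, name_contains=None):
    """Return the images that match every filter that is not None."""
    result = []
    for image in images:
        if seen_by is not None and seen_by not in image.seen_by:
            continue
        if labeled_by is not None and labeled_by not in image.labeled_by:
            continue
        if validation is not None and image.validation != validation:
            continue
        if name_contains is not None and name_contains not in image.name:
            continue
        result.append(image)
    return result


def unlabeled(images):
    """Images nobody has labeled yet, i.e. safe to pick up next."""
    return [image for image in images if not image.labeled_by]


images = [
    ImageRecord("street_01.jpg", seen_by={"ana", "ben"}, labeled_by={"ana"}),
    ImageRecord("street_02.jpg", seen_by={"ben"}),
    ImageRecord("park_01.jpg", validation=True),
]

print([i.name for i in filter_images(images, labeled_by="ana")])
print([i.name for i in filter_images(images, seen_by="ben", name_contains="street")])
print([i.name for i in unlabeled(images)])
```

Keeping ‘seen’ and ‘labeled’ as separate per-user sets means that resetting an image to ‘unseen’ after a guideline change does not erase the record of who labeled it.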
The ability to see and to understand what we see comes naturally to humans, so we tend to take it for granted. However, teaching someone else, especially a machine, to perceive things as we do is a long and laborious process. In computer vision, correctly describing the ground truth is a critical task that requires careful consideration. Image labeling platforms such as SentiSight.ai make this task easier and faster by offering powerful tools, helping to improve the technology we currently have.