Interactive segmentation plays an integral role in the field of artificial intelligence and machine learning, as it facilitates accurate and efficient labeling of raw data. It optimizes the labeling process, allows data scientists to create high-quality training data for machine learning models faster, and provides an interactive auto-labeling interface.
In this article, we will discuss:
- The importance of AI-assisted data labeling and its impact on the data labeling industry
- Companies providing AI-assisted data labeling
- Click-based approach
- Box-based approach
- Edge point-based approach
- Superpixel-based approach
- Scribble-based approach
- Key takeaways
The importance of AI-assisted data labeling
Data labeling is a critical part of machine learning because labeled data is the foundation of model training. However, manual data labeling can be time-consuming and prone to human error. This is where AI-assisted data labeling comes into play, providing a human-computer interaction auto-labeling interface that combines the strengths of human labelers and machine learning models. AI-assisted labeling helps bridge the gap between manual and automated data labeling, delivering efficiency gains and improving the overall quality of labeled data.
Interactive segmentation enables AI-assisted labeling by integrating visual labeling assistance with ongoing human labeling activity. By applying complex machine learning algorithms, the system can predict the accurate segmentation mask of an object and actively learn from manual labels, thereby improving its performance over time.
AI-assisted data labeling tools, more commonly referred to as interactive segmentation in the scientific community, are an important research topic in computer vision and have many practical applications in fields such as robotics, self-driving cars, and medical and aerial imaging. One of the key factors that companies consider when developing AI-assisted data labeling tools is the acceleration of the annotation process, primarily for semantic and instance segmentation tasks.
There are several ways to measure the accuracy of an AI-assisted data labeling algorithm. The more common ones include precision, recall, F1, and NoC (number of clicks) scores. These metrics measure either the degree of overlap between the segmented object and the ground truth object, or the number of interactions needed to reach the desired level of accuracy.
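To make the overlap metrics concrete, here is a minimal sketch that computes pixel-wise precision, recall, F1, and IoU for two binary masks. The function name `overlap_metrics` and the 2D-list mask format are assumptions made for illustration, not part of any particular tool.

```python
def overlap_metrics(pred, truth):
    """Pixel-wise precision, recall, F1, and IoU for two binary masks.

    Masks are equally sized 2D lists of 0/1 (an illustrative format).
    """
    tp = fp = fn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1      # object pixel correctly predicted
            elif p:
                fp += 1      # background pixel predicted as object
            elif t:
                fn += 1      # object pixel the prediction missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    iou = tp / (tp + fp + fn) if tp + fp + fn else 0.0
    return {"precision": precision, "recall": recall, "f1": f1, "iou": iou}


truth = [[0, 1, 1, 0],
         [0, 1, 1, 0]]
pred = [[0, 1, 1, 1],
        [0, 1, 0, 0]]
scores = overlap_metrics(pred, truth)
# 3 true positives, 1 false positive, 1 false negative:
# precision = recall = F1 = 0.75, IoU = 3 / 5 = 0.6
```

Precision penalizes pixels that spill outside the object, recall penalizes pixels that were missed, and IoU penalizes both in a single number.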
In recent years, researchers have used several techniques to improve the quality of such algorithms under a fixed time budget. The algorithmic progress is indeed fascinating, just as in other fields of ML (see below the NoC@90 rate decrease over time). NoC is defined as the number of clicks needed to reach an IoU score of 85 or 90 percent against the ground truth annotations.
In this article, we will discuss several different AI-enabled data labeling tools for images, examine their advantages and disadvantages, and provide detailed reasoning for why we decided to build the new interactive tool that accelerates the annotation process.
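As a rough sketch of how the NoC score can be computed, assume we already have the IoU measured after each click for every annotated object. The helper names and the cap of 20 clicks for objects that never reach the threshold are assumptions for this example (a cap is a common convention, but the exact value varies).

```python
def number_of_clicks(iou_per_click, threshold=0.90, max_clicks=20):
    """1-based index of the first click whose IoU reaches the threshold.

    If the threshold is never reached within max_clicks, max_clicks is
    returned (an assumed convention for this sketch).
    """
    for click, iou in enumerate(iou_per_click[:max_clicks], start=1):
        if iou >= threshold:
            return click
    return max_clicks


def mean_noc(runs, threshold=0.90, max_clicks=20):
    """Average number of clicks over a set of annotated objects."""
    total = sum(number_of_clicks(r, threshold, max_clicks) for r in runs)
    return total / len(runs)


# IoU after click 1, 2, 3, ... for three hypothetical objects.
runs = [
    [0.62, 0.81, 0.88, 0.93],  # reaches 0.85 on click 3, 0.90 on click 4
    [0.87, 0.91],              # reaches 0.85 on click 1, 0.90 on click 2
    [0.40, 0.55, 0.70],        # never reaches either threshold
]
noc_85 = mean_noc(runs, threshold=0.85)  # (3 + 1 + 20) / 3 = 8.0
noc_90 = mean_noc(runs, threshold=0.90)  # (4 + 2 + 20) / 3
```

A lower score means annotators need fewer interactions to reach the target quality, which is why the metric is usually reported as NoC@85 and NoC@90.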
Companies providing AI-assisted data labeling
Many companies provide data labeling software, but only a handful provide AI-assisted data labeling functionality. Below is a list of such companies, namely SuperAnnotate, Scale AI, V7 Labs, Hasty AI, Segments AI, Supervisely, and Labelbox, along with the approach each takes to AI-assisted labeling.
1. SuperAnnotate: Initially started as a Ph.D. research project in 2018, SuperAnnotate grew rapidly and established itself as one of the leading data labeling software providers. SuperAnnotate's software focuses on 5 components of the AI lifecycle: data labeling, AI dataset management, AI model management, automation, and an annotation service marketplace. When it comes to AI-assisted data labeling approaches, we offer both scribble-based and superpixel-based approaches.
2. Scale AI: Established in 2016, Scale AI offers a wide range of data labeling solutions, including an AI data management platform (Nucleus). In terms of AI-assisted data labeling approaches, Scale AI only offers a box-based approach.
3. V7 Labs: Founded in 2018, V7 Labs focuses on visual data, primarily solving computer vision problems. It has a points-based approach.
4. Hasty AI: Similar to V7, Hasty AI also offers a data labeling platform primarily based on visual data (image annotation). Hasty AI provides an edge points-based approach to its users. As of 2023, the company has been acquired by CloudFactory, a large annotation service company.
5. Segments AI: Segments AI was founded to provide data labeling software for robotics and AV companies to build better datasets. Segments AI offers a superpixel-based method for AI-assisted data labeling.
6. Supervisely: Supervisely was the earliest in the market to offer an AI-assisted labeling platform. They offer an annotation platform for images, videos, and LiDAR files, as well as collaboration features for efficient workforce management. Supervisely uses the points-based approach.
7. Labelbox: Founded in 2017, Labelbox is a training data platform that provides data labeling and data management tools for various data types. They offer a single AI-assisted data labeling approach: the box-based approach.
Click-based approach
One of the easiest ways to implement AI-assisted data labeling is the so-called click-based approach, in which the human labeler receives the raw image and identifies the object by clicking on it.
In general, the more interactions a user provides, the more accurate the segmentation is likely to be. On the other hand, annotation time increases linearly with the number of clicks, and several other issues come with that increase:
- Annotators must decide when to stop adding clicks. This decision is made for every object and after every click.
- After each click, annotators have to wait for and review the new prediction mask, then decide where to put the next point and whether it should be a positive or a negative click.
This adds extra seconds per object and lengthens the time it takes to select an object accurately.
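The decision an annotator repeats after every click can be illustrated with a small simulated annotator, similar in spirit to how click-based methods are often evaluated automatically. This is a simplified sketch under stated assumptions: the function `next_click`, the 2D-list masks, and the "click near the centroid of the larger error" rule are illustrative stand-ins for human judgment, not the behavior of any specific tool.

```python
def next_click(pred, truth):
    """Pick the next corrective click for a predicted mask.

    Returns (row, col, is_positive), or None when pred already matches truth.
    """
    missed = []   # object pixels the prediction left out -> positive click
    spilled = []  # background pixels the prediction included -> negative click
    for r, (pred_row, truth_row) in enumerate(zip(pred, truth)):
        for c, (p, t) in enumerate(zip(pred_row, truth_row)):
            if t and not p:
                missed.append((r, c))
            elif p and not t:
                spilled.append((r, c))
    if not missed and not spilled:
        return None  # nothing left to fix, so the annotator can stop
    # Correct whichever kind of error covers more pixels.
    is_positive = len(missed) >= len(spilled)
    errors = missed if is_positive else spilled
    # Click the error pixel closest to the centroid of those error pixels.
    cr = sum(r for r, _ in errors) / len(errors)
    cc = sum(c for _, c in errors) / len(errors)
    row, col = min(errors, key=lambda rc: (rc[0] - cr) ** 2 + (rc[1] - cc) ** 2)
    return row, col, is_positive


truth = [[0, 1, 1, 1],
         [0, 1, 1, 1],
         [0, 0, 0, 0]]
pred = [[0, 1, 0, 0],
        [0, 1, 0, 1],
        [1, 0, 0, 0]]
click = next_click(pred, truth)  # (0, 2, True): a positive click on a missed pixel
```

A human performs this comparison visually and without access to the ground truth, which is why every extra click costs review time on top of the click itself.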
Box-based approach
Another alternative is to predict a mask within the selected box and use AI algorithms to adjust the edges in a way that minimizes the manual touches needed on those edges. However, such techniques inherently come with even worse issues than the previous method, since the AI algorithm often produces a worse mask instead of correcting the rest of the polygon points. These techniques were primarily developed and further fine-tuned by NVIDIA and the University of Toronto.
Video source: Efficient Interactive Annotation of Segmentation Datasets with Polygon-RNN++
Edge point-based approach
The third approach to interactive image segmentation is called Deep Extreme Cut. The initial algorithm was developed within the ETH Computer Vision group and uses 4 extreme points on the edges of the object to predict the mask of the object. This approach seems like a great solution, but in reality the predicted mask lacks accuracy, and adding 4 extreme points on the edges of each object can be very time-consuming. Beyond the 4-point requirement, the generic method does not offer good editing techniques in case the mask is not accurately predicted. Deep Extreme Cut is implemented by a few companies providing annotation platforms, such as Hasty.ai and Scale.ai.
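To show what the 4 extreme points give a model to work with, the sketch below derives a padded bounding box from the points and builds a heatmap with a Gaussian bump at each point, which is how inputs of this kind are commonly encoded. The function names, the padding, and the `sigma` value are assumptions for illustration; the segmentation network itself is out of scope here.

```python
import math


def box_from_extreme_points(points, pad, height, width):
    """Inclusive (top, left, bottom, right) box, padded and clipped to the image."""
    rows = [r for r, _ in points]
    cols = [c for _, c in points]
    top = max(min(rows) - pad, 0)
    left = max(min(cols) - pad, 0)
    bottom = min(max(rows) + pad, height - 1)
    right = min(max(cols) + pad, width - 1)
    return top, left, bottom, right


def extreme_point_heatmap(points, height, width, sigma=2.0):
    """Single channel with a 2D Gaussian bump centered on each point."""
    heatmap = [[0.0] * width for _ in range(height)]
    for pr, pc in points:
        for r in range(height):
            for c in range(width):
                dist_sq = (r - pr) ** 2 + (c - pc) ** 2
                value = math.exp(-dist_sq / (2 * sigma ** 2))
                heatmap[r][c] = max(heatmap[r][c], value)
    return heatmap


# Top-most, right-most, bottom-most, and left-most points as (row, col).
points = [(2, 5), (5, 9), (8, 4), (4, 1)]
box = box_from_extreme_points(points, pad=1, height=12, width=12)  # (1, 0, 9, 10)
heatmap = extreme_point_heatmap(points, height=12, width=12)
```

The box tells the model where to crop, and the heatmap is typically passed as an extra input channel next to the image. Note that every object costs four precise clicks before any prediction is made.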
Superpixel-based approach
The fourth and probably the most efficient approach so far is to use superpixel algorithms to over-segment the image into several smaller segments and use them interactively to annotate the image. This was the initial approach SuperAnnotate took when we first started, and it was something I worked on during my Ph.D. studies. It brought a several-fold acceleration for some annotation tasks, especially those where the edges of the object are clearly visible. However, one major disadvantage of this approach is that annotators need a lot of time to get used to changing the number of segments to minimize the number of clicks needed to select the object. As in the first approach, annotators have to make a decision on every object, here the optimal number of segments, which can be cumbersome.
We released our superpixel segmentation algorithm late in 2019 and were the first ones with this approach. Other startups using a similar approach include Segments.ai.
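The interaction side of the superpixel approach can be sketched in a few lines: given a label map that assigns every pixel to a segment, each click selects the whole segment under the cursor. The label map below is hand-written for illustration; in practice it would come from a superpixel algorithm, and `select_superpixels` is a hypothetical helper, not our production code.

```python
def select_superpixels(labels, clicks):
    """Binary mask covering every segment that received a click.

    labels: 2D list of segment ids, one per pixel.
    clicks: list of (row, col) positions.
    """
    chosen = {labels[r][c] for r, c in clicks}
    return [[1 if seg in chosen else 0 for seg in row] for row in labels]


# A tiny 3x4 image over-segmented into four segments (ids 0 to 3).
labels = [[0, 0, 1, 1],
          [0, 2, 2, 1],
          [3, 2, 2, 1]]
mask = select_superpixels(labels, clicks=[(1, 1), (0, 3)])
# Segments 2 and 1 are selected:
# [[0, 0, 1, 1],
#  [0, 1, 1, 1],
#  [0, 1, 1, 1]]
```

The trade-off discussed above is visible here: fewer, larger segments mean fewer clicks but a higher chance that one segment crosses the object boundary, while more, smaller segments follow edges better but need more clicks.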
Scribble-based approach
As we discussed earlier, there is a tradeoff between accuracy and the time required to annotate an image using AI-assisted data labeling methods. As such, researchers are often interested in developing algorithms that achieve high accuracy with as few interactions as possible. However, they generally skip the details of how people interact with their algorithm, and the number of interactions is not always directly proportional to the annotation time.
It was very much the same for me during my Ph.D. studies. I never considered the UI or UX when building an algorithm that can increase the efficiency of the annotators. All this time, my team kept thinking about some potential ways to create a new approach that would not inherit any of the potential issues discussed earlier. Since none of them offered us any valid solutions, it took some time until we managed to improve our original superpixel approach.
The scribble-based approach takes as input several diverse points extracted from a scribble drawn on the object. We take these positive points and use Gaussian Processes to predict the negative points as well. The combination of positive and negative points is fed to the machine learning model to auto-label the mask of the object. This approach makes the annotation process fast, intuitive, and extremely accurate, which we did not have with the other approaches until now.
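The text above does not specify how the diverse points are picked from the scribble, so the following is only one plausible sketch: greedy farthest-point sampling, which spreads a fixed number of points along the stroke. The function `sample_diverse_points` is hypothetical, and the Gaussian Process step that predicts negative points is not covered here.

```python
def sample_diverse_points(scribble, k):
    """Greedy farthest-point sampling of up to k points from a scribble.

    scribble: list of (row, col) pixels in drawing order.
    """
    if k <= 0 or not scribble:
        return []
    chosen = [scribble[0]]
    first_r, first_c = scribble[0]
    # Squared distance from each scribble pixel to its nearest chosen point.
    dist = [(r - first_r) ** 2 + (c - first_c) ** 2 for r, c in scribble]
    while len(chosen) < min(k, len(scribble)):
        idx = max(range(len(scribble)), key=dist.__getitem__)
        if dist[idx] == 0:
            break  # only duplicates of already chosen points remain
        pr, pc = scribble[idx]
        chosen.append((pr, pc))
        dist = [min(d, (r - pr) ** 2 + (c - pc) ** 2)
                for d, (r, c) in zip(dist, scribble)]
    return chosen


# A horizontal stroke of 9 pixels along row 0.
scribble = [(0, c) for c in range(9)]
points = sample_diverse_points(scribble, k=3)  # [(0, 0), (0, 8), (0, 4)]
```

The idea is that a single continuous gesture yields several well-spread positive points, so the annotator does not have to place and evaluate each one individually.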
Key takeaways
Overall, the field of AI-assisted data labeling is a rich area of research with many potential applications. As researchers continue to develop new algorithms and techniques, we can expect to see more accurate and efficient methods for image annotation. The word "efficient" can be defined differently by the research community in such cases. Although the scientific community may not consider our approach worthy of a paper, we at SuperAnnotate firmly believe that the right UI and UX are as important to the AI-assisted labeling process as the accuracy of the algorithm fueling the prediction mask. With our new Scribble to Polygon tool, we show how we could excel in both areas and build the fastest and most accurate tool that is both intuitive and user-friendly.