Computer vision is among the most progressive and rapidly growing fields. According to Grand View Research, the global computer vision market size was valued at $11.32 billion in 2020 and is expected to expand at a compound annual growth rate of 7.3% from 2021 to 2028. The use cases of AI-enabled computer vision are nearly countless with the most popular being consumer drones as well as autonomous and semi-autonomous vehicles. Besides, due to the recent progress in computer vision, AI is now a necessity in various industries, such as education, healthcare, robotics, consumer electronics, retail, manufacturing, and more. So, given the abrupt evolution of computer vision, it’s important to investigate where it all started and where it’s going, especially when it comes to choosing the next computer vision project. In this article, we’ll cover the foundations and trends in computer vision.
- The evolution of computer vision
- Current challenges in computer vision
- Trend 1: Computer vision on the edge
- Trend 2: Computer vision as a service
- Trend 3: Data-centric computer vision
The evolution of computer vision
Today we’re quite used to how our smartphones use face detection or how viral Instagram face masks go. Little do we know these are examples of computer vision and what seems regular today would be impossible without profound and long-lasting research.
Computer vision started emerging in the late 1960s in universities pioneering artificial intelligence. The idea was to mimic human vision and allow computers or robots to “see” objects. A considerable amount of computer vision algorithms that exist today originated back in the 1970s. These include extraction of edges from images, labeling of lines, non-polyhedral and polyhedral modeling, clustering, optical flow, and motion estimation.
The subdomains of today’s computer vision include:
- Scene reconstruction
- Object detection
- Event detection
- Video tracking
- Object recognition
- 3D pose estimation
- Motion estimation
- Visual servoing
- 3D scene modeling
- Image restoration
Current challenges in computer vision
Though a lot of progress has been made on computer vision since the 1960s, it’s still a largely untapped field in terms of research and development. This is mainly due to the fact that human vision itself is extremely complex and computer vision systems suffer by comparison. It takes a few seconds for people to recognize their friends in the images, even at different ages, and our capacity to remember and store faces for future recognition seems unlimited. However, it’s hard to imagine the amount of work it would take a computer to tackle something nearly similar. Another challenge that computer vision engineers face nowadays is the sustainable integration of open-source computer vision tools into their applications. In particular, computer vision solutions constantly rely on both software and hardware evolution, where integrating new technologies becomes a challenging task.
Now that we covered where computer vision started and where it’s today, let’s jump into the “future” and reflect on some of the most promising trends in computer vision for 2022.
Trend 1: Computer vision on the edge
Edge is the new cloud. The term edge computing refers to a technology attached to where the data is generated, i.e., at the edge of the architecture: it allows data to be processed and analyzed where (or closer to where) it is collected, instead of the cloud or a data center. Computer vision projects are implementing edge computing architectures more and more because it solves the problems of network accessibility, bandwidth, and latency. Even cloud architectures often need to be deployed on edge devices because of privacy, robustness, and performance. Edge computing is especially popular for projects where real-time data processing is needed. Such projects include self-driving cars, drones, etc.
Edge computing has been gaining a lot of traction in the healthcare industry. While most people take vision for granted, others live their lives with limited or no vision at all. A lot of research has been made to use computer vision to assist the visually impaired. Thankfully, the advancements in technology allow us to make the world a bit of a better place for those who don’t see it through the real-time image. More precisely, computer vision can help:
- Identify objects
- Find a specific object among others
- Detect obstacles
- With sign detections and navigation
- Recognize people
- Share information about human crowds
Similar use-cases for computer vision on edge include helping physically challenged or protecting endangered species. Sure enough, the list of use-cases for edge computing goes on and on.
Trend 2: Computer vision as a service
As computer vision is gaining more traction, the number of platforms suggesting such solutions increases accordingly. Using platforms can save you some time for image processing, data labeling, and data curation. Overall, if not using a computer vision platform, you’ll have to dig much deeper and do the following:
- Developing the workflow around your AI processes.
- Getting data from different sources.
- Storing and labeling the data.
- Checking and correcting mislabeled data.
- Following versioning.
A lot of attention is now drawn into CVaaS, which stands for computer vision as a service. It enables non-AI companies to take advantage of the technological advancements and purchase pre-build algorithms available on computer vision platforms. Because the algorithms and APIs can be accessed on-demand under a pay-as-you-go model, computer vision innovation becomes both affordable and scalable. For instance, a smart move would be outsourcing data annotation services, considering that it is the first and most essential part of a successful computer vision project. Garbage in, garbage out, remember?
If you’re lucky enough to find a platform that covers your needs, stick to it (and never let it go) to make sure your computer vision project is in safe hands.
Trend 3: Data-centric computer vision
Computer vision is all about data and the model is as good as the examples you feed it with. The first step to build AI models is gathering huge datasets for training. We mistakenly believe that our model inaccuracy can only be solved by collecting large, sometimes insane quantities of data. For instance, if we’re building a model to detect rabbits, we’ll need ten thousand images of rabbits from different angles with various weather and lighting conditions; we’ll collect rabbits of different sizes and colors.
However, the tendency today goes into quality over quantity. It doesn’t mean that the amount of examples plays no role, but your training model may not necessarily benefit from a large number of training examples. Rather, it will work well if the training examples provided are accurate and informative. If we find out that our training data is inaccurate, we can either clean up the noise and find the mislabelled images. If it’s not informative enough, we can double the data set and collect another bunch of images with rabbits, or even replace the first batch. How to measure the informativeness of an image? Well, this deserves its own article.
Research shows that both of these are equally effective in terms of improving your learning algorithm performance. In most cases, it’s much easier to detect mislabeled examples and find a systematic way to label them correctly. This is the very data-centric computer vision that holds the future of successful computer vision applications. MLOps also falls under this category aimed at making the development and deployment of machine learning systems systematic, but this, again, is a whole other topic of discussion.
Key takeaways
Computer vision has come a long way and it has much more to go. The future of computer vision is promising given the current resources and talented specialists. The advancements of technology and the development of computer vision algorithms open a sea of opportunities for the application of computer vision in real life. This brings along an increase in the number of computer vision platforms, suggesting the most diverse services to build and implement a comprehensive computer vision pipeline. We’ll build computer vision applications on edge computing and we’ll focus on collecting clear and informative data on the initial steps of computer vision model training rather than collecting huge uninformative datasets full of noise.
Let us know what are your thoughts on computer vision development and we’ll cover that in our upcoming blog posts.