Computer Vision

What is Computer Vision?


Computer Vision, often abbreviated as CV, is the study of techniques that help computers see and understand digital images and videos. It is a field of computer science that aims to enable computers to detect, identify, and process objects in images the same way human vision does, and to produce comparable output. In reality, enabling a computer to recognize images of different objects is a difficult task, and it is closely linked with artificial intelligence.

Computer vision is a field of artificial intelligence that trains computers to interpret and understand the visual world. Using digital images from cameras and videos together with deep-learning models, machines can accurately identify and classify objects.
For example, cars could be fitted with computer vision systems able to identify and distinguish objects on and around the road, such as traffic lights, pedestrians, and traffic signs, and act accordingly. The intelligent device could provide inputs to the driver or even stop the car if an obstacle suddenly appears on the road.

When a human who is driving a car sees someone suddenly move into the path of the car, the driver must react instantly. In a split second, human vision has completed a complex task, that of identifying the object, processing data and deciding what to do. Computer vision's aim is to enable computers to perform the same kind of tasks as humans with the same efficiency.

History of Computer Vision

Early experiments in computer vision took place in the 1950s, using some of the first neural networks to detect the edges of an object and to sort simple objects into categories like circles and squares. In 1966, Seymour Papert and Marvin Minsky, two pioneers of artificial intelligence, launched the Summer Vision Project, a two-month, 10-man effort to create a computer system that could identify objects in images.

In the 1970s, the first commercial use of computer vision interpreted typed or handwritten text using optical character recognition. This advancement was used to interpret written text for the blind. In 1979, Japanese scientist Kunihiko Fukushima proposed the neocognitron, a computer vision system based on neuroscience research done on the human visual cortex. Although Fukushima's neocognitron failed to perform any complex visual tasks, it laid the groundwork for one of the most important developments in the history of computer vision.

As the internet matured in the 1990s, making large sets of images available online for analysis, facial recognition programs flourished. These growing data sets helped make it possible for machines to identify specific people in photos and videos.

To accomplish the Summer Vision Project's task, a computer program had to be able to determine which pixels belonged to which object. This is a problem that the human vision system, powered by our vast knowledge of the world and billions of years of evolution, solves easily. But for computers, whose world consists only of numbers, it is a challenging task.

At the time of this project, the dominant branch of artificial intelligence was symbolic AI, also known as rule-based AI: Programmers manually specified the rules for detecting objects in images. But the problem was that objects in images could appear from different angles and in various lighting. The object might appear against a range of different backgrounds or be partially occluded by other objects. Each of these scenarios generates different pixel values, and it's practically impossible to create manual rules for every one of them.

How computer vision works

Today’s AI systems go beyond simply recognizing objects: they can take actions based on an understanding of the image. There are many types of computer vision that are used in different ways:

Image segmentation partitions an image into multiple regions or pieces to be examined separately.

Object detection identifies a specific object in an image. Advanced object detection recognizes many objects in a single image: a football field, an offensive player, a defensive player, a ball and so on. These models use X,Y coordinates to create a bounding box and identify everything inside the box.

Facial recognition is an advanced type of object detection that not only recognizes a human face in an image, but identifies a specific individual.

Edge detection is a technique used to identify the outside edge of an object or landscape to better identify what is in the image.

Pattern detection is a process of recognizing repeated shapes, colors and other visual indicators in images.
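To make edge detection concrete, here is a minimal sketch that computes the gradient magnitude of a tiny grayscale image with hand-rolled Sobel filters in NumPy. The kernel values are the standard Sobel kernels; the synthetic image and the function name are illustrative choices, not taken from any particular library.

```python
import numpy as np

def sobel_edges(img):
    """Return the gradient magnitude of a 2-D grayscale image
    using 3x3 Sobel filters (no padding, so output shrinks by 2)."""
    kx = np.array([[-1, 0, 1],
                   [-2, 0, 2],
                   [-1, 0, 1]], dtype=float)  # horizontal gradient
    ky = kx.T                                 # vertical gradient
    h, w = img.shape
    gx = np.zeros((h - 2, w - 2))
    gy = np.zeros((h - 2, w - 2))
    for i in range(h - 2):
        for j in range(w - 2):
            patch = img[i:i + 3, j:j + 3]
            gx[i, j] = np.sum(patch * kx)
            gy[i, j] = np.sum(patch * ky)
    return np.hypot(gx, gy)  # combined edge strength

# A tiny synthetic image: dark left half, bright right half.
img = np.zeros((5, 6))
img[:, 3:] = 255.0
edges = sobel_edges(img)
# The response is strongest along the vertical boundary between
# the two halves and zero in the flat regions.
```

In practice a library such as OpenCV would be used instead of an explicit Python loop, but the idea is the same: edges are locations where pixel intensity changes sharply, and convolving with gradient kernels measures that change.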

Applications of Computer Vision

Many of the applications you use every day employ computer-vision technology. Google uses it to help you search for objects and scenes - say, "dog" or "sunset" - in your Images library.

Other companies use computer vision to help enhance images. One example is Adobe Lightroom CC, which uses machine-learning algorithms to enhance the details of zoomed images. Traditional zooming uses interpolation techniques to color the zoomed-in areas, but Lightroom uses computer vision to detect objects in images and sharpen their features when zooming in.

One field that has seen remarkable progress thanks to advances in computer vision is facial recognition. Apple uses facial-recognition algorithms to unlock iPhones. Facebook uses facial recognition to detect users in pictures you post online. In China, many retailers now provide facial-recognition payment technology, relieving their customers of the need to reach into their pockets.

Content moderation is another important application for computer vision. Most social-media networks use deep-learning algorithms to analyze posts and flag those that contain banned content. Self-driving cars also rely heavily on computer vision to make sense of their surroundings. Deep-learning algorithms analyze video feeds from cameras installed on the vehicle and detect people, cars, roads, and other objects to help the car navigate its environment.
