Introduction to Computer Vision & Camera Calibration for Self-Driving Cars

Camera Calibration, Perspective Transform, and Distortion Correction for Self-Driving Car Development

Prateek Sawhney
7 min read · Sep 13, 2021

Welcome to this Medium article. Computer vision can essentially be broken down into a three-step cycle. First, the self-driving car senses the world around it. Based on that perception of the world, it decides what to do. Finally, it performs an action based on the decision it made in the previous step. Together, these steps constitute the cycle in which a self-driving car applies computer vision.

Detecting Lane Lines (Image by author)

Computer vision is a major part of the perception step in that cycle; it is often said that 80% of the challenge of building a self-driving car is perception. Computer vision is the technique of recognizing the world around us through images or videos. For self-driving cars, it handles the most important perception tasks: detecting lane markings, vehicles, pedestrians, and other elements such as traffic signs.

Self-driving cars employ a suite of sophisticated sensors, yet humans do the job of driving with just two eyes and one good brain; in fact, we can even do it with one eye closed. So let's take a closer look at why using cameras instead of other sensors might be an advantage in developing self-driving cars. Radar and lidar see the world in 3D, which can be a big advantage for knowing where we are relative to our environment. A camera sees only a 2D projection of the world, but at a much higher spatial resolution than radar or lidar, and thanks to that resolution it is actually possible to infer depth information from camera images. The big difference, however, comes down to cost: cameras are significantly cheaper.

Measurements such as how much the lane is curving are exactly what we need to extract from camera images. But in order to get the perspective transformation that yields those measurements right, we first have to correct for the effect of image distortion. Cameras don't create perfect images: some of the objects in an image, especially ones near the edges, can get stretched or skewed in various ways, and we need to correct for that. So let's jump into step one: how to undistort our distorted camera images.

Distortion Correction

Distortion correction comes into play when we want to recover accurate information from the camera mounted on the hood of the car, because an image taken by a camera suffers from distortion. A camera looks at 3D objects and converts them into a 2D image, and this transformation isn't perfect. For example, here's an image of a road taken through a camera lens that slightly distorts it.

Distorted image (Image by author)

In this distorted image, we can see that the edges of the lanes are bent and appear rounded or stretched outward. Distortion is actually changing what the shape and size of these objects appear to be. This is a serious problem, because we want our self-driving vehicle to perceive the road accurately so that there are fewer accidents. If the lane appears distorted, we'll get the wrong measurement for curvature in the first place, and our steering angle will be wrong as a result. So, first of all, we should use computer vision to eliminate the distortion introduced by the camera mounted on the hood of the car. Below is the same image after distortion correction:

Distortion Corrected (Image by author)

Pinhole Camera Model and Types of Distortion

Before we start correcting for distortion, let's get some intuition as to how this distortion occurs, starting with the pinhole camera model. When a camera forms an image, it looks at the world in a way similar to how our eyes do. In the pinhole camera model, the image formed is upside down and reversed, because rays of light that enter from the top of an object continue on their angled path through the pinhole and end up at the bottom of the formed image. Similarly, light that reflects off the right side of an object travels to the left side of the formed image.

We need undistorted images that accurately reflect our real-world surroundings. Luckily, this distortion can generally be captured by five numbers called distortion coefficients, whose values reflect the amount of radial and tangential distortion in an image. In severely distorted cases, more than five coefficients are sometimes required to capture the full distortion. If we know these coefficients, we can use them to calibrate our camera and undistort our images.
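To make those five coefficients concrete, here is a minimal sketch of the standard radial-tangential distortion model that libraries such as OpenCV use; the coefficient ordering (k1, k2, p1, p2, k3) follows the usual OpenCV convention, and the function applies the model to a single normalized image point.

```python
import numpy as np

def distort_point(x, y, k1, k2, p1, p2, k3):
    """Map an undistorted normalized point (x, y) to its distorted
    position under the standard radial + tangential model."""
    r2 = x**2 + y**2
    radial = 1 + k1 * r2 + k2 * r2**2 + k3 * r2**3
    x_d = x * radial + 2 * p1 * x * y + p2 * (r2 + 2 * x**2)
    y_d = y * radial + p1 * (r2 + 2 * y**2) + 2 * p2 * x * y
    return x_d, y_d
```

Calibration is the process of estimating these coefficients (along with the camera matrix) so that this mapping can be inverted.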

Camera Calibration Using Chessboard Images

The first step is to read in calibration images of a chessboard. There should be at least 20 images, taken at different angles and distances, to perform a good calibration. Each chessboard used here has eight by six corners to detect. I'll go through the calibration steps for the first calibration image in detail.

Camera Calibration performed on the Image (left) to get the distortion corrected image (right)
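As a sketch of how these steps might look in code, here is the usual OpenCV calibration recipe for an eight-by-six chessboard; the calibration*.jpg glob pattern is a placeholder for wherever the roughly 20 calibration shots are stored.

```python
import glob
import cv2
import numpy as np

# Object points for one 8x6 chessboard: (0,0,0), (1,0,0), ..., (7,5,0).
# The board is assumed flat, so the z-coordinate is always 0.
objp = np.zeros((8 * 6, 3), np.float32)
objp[:, :2] = np.mgrid[0:8, 0:6].T.reshape(-1, 2)

objpoints = []  # 3D points in real-world space
imgpoints = []  # 2D points in the image plane

for fname in glob.glob('calibration*.jpg'):
    img = cv2.imread(fname)
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    found, corners = cv2.findChessboardCorners(gray, (8, 6), None)
    if found:
        objpoints.append(objp)
        imgpoints.append(corners)

# Solve for the camera matrix and the distortion coefficients
ret, mtx, dist, rvecs, tvecs = cv2.calibrateCamera(
    objpoints, imgpoints, gray.shape[::-1], None, None)

# Any later road image can now be undistorted with those parameters
undistorted = cv2.undistort(img, mtx, dist, None, mtx)
```

With mtx and dist computed once, every frame from the same camera can be undistorted cheaply at runtime.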

Lane Curvature

Now that we've learned about camera calibration and correcting for distortion, we can start to extract really useful information from images of the road. One really important piece of information is lane curvature. While driving on the highway, we press the gas or the brake to go with the flow of traffic, and we steer the wheel based on how much the lane ahead is curving.

Now, how does this work for a self-driving car?

An autonomous vehicle must be supplied with the correct steering angle so that it can turn left or right, and we can calculate this angle from the curvature of the lane.

To determine the curvature, we’ll go through the following steps.

First, we’ll detect the lane lines using some masking and thresholding techniques.

Then, we'll perform a perspective transform to get a bird's-eye view of the lane. This lets us fit a polynomial to the lane lines, which we couldn't do very easily before.

Then, we can extract the curvature of the lines from this polynomial with just a little math, as in the sketch below.
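Assuming the earlier steps have already produced pixel coordinates of one lane line in the bird's-eye view, fitting a second-order polynomial and evaluating the standard radius-of-curvature formula might look like this; the lane_x and lane_y arrays below are made-up placeholders.

```python
import numpy as np

# Hypothetical pixel coordinates of one detected lane line in the
# bird's-eye view (these would come from the detection step)
lane_y = np.array([100., 200., 300., 400., 500., 600., 700.])
lane_x = np.array([310., 305., 303., 305., 310., 318., 330.])

# Lane lines are near-vertical, so fit x = A*y^2 + B*y + C
A, B, C = np.polyfit(lane_y, lane_x, 2)

# Radius of curvature (in pixels), evaluated at the bottom of the
# image, i.e. closest to the vehicle:
#   R = (1 + (2*A*y + B)^2)^(3/2) / |2*A|
y_eval = lane_y.max()
radius = (1 + (2 * A * y_eval + B) ** 2) ** 1.5 / abs(2 * A)
```

In practice the pixel coordinates would be rescaled to meters before fitting, so the radius comes out in real-world units. With that math in view, let's learn more about the perspective transform itself.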

Perspective Transform

In an image, perspective is the phenomenon where an object appears smaller the farther away it is from a viewpoint such as a camera, and parallel lines appear to converge to a point. It's seen in everything from camera images to art: many artists use perspective to give the right impression of an object's size, depth, and position when viewed from a particular point. Let's look at perspective in this image of the road.

Sample undistorted image for implementing perspective transform (Image by author)

As you can see, the lane appears smaller and smaller the farther away it gets from the camera, and the background scenery also appears smaller than the trees closer to the camera in the foreground. A perspective transform gives us the same image but from another viewpoint, such as a bird's-eye view: it remaps the image so that the scene appears as if seen from that new viewpoint. The perspective transform of the image above is given below:

Perspective Transform or the Bird’s eye view (Image by author)

By doing a perspective transform and viewing this same image from above, we can see that the lanes are parallel and both curve about the same amount to the right. So, a perspective transform lets us change our perspective to view the same scene from different viewpoints and angles. This could be viewing a scene from the side of the camera, from below the camera, or looking down on the road from above. A bird's-eye-view transform is especially helpful for road images because it also allows us to match a car's location directly with a map, since maps display roads and scenery from a top-down view. The process of applying a perspective transform is somewhat similar to how we applied the undistortion, but this time, instead of mapping object points to image points, we map the points in a given image to different desired image points with a new perspective.
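Here is a minimal sketch of that point-to-point mapping with OpenCV; the source and destination coordinates, the 1280x720 image size, and the road_undistorted.jpg path are illustrative assumptions rather than values from this article.

```python
import cv2
import numpy as np

# Distortion-corrected road image from the calibration step
undistorted = cv2.imread('road_undistorted.jpg')

# Four source points on the road (e.g. the corners of a straight
# section of lane) and the rectangle they should map to in the
# bird's-eye view
src = np.float32([[580, 460], [700, 460], [1040, 680], [260, 680]])
dst = np.float32([[260, 0], [1040, 0], [1040, 720], [260, 720]])

M = cv2.getPerspectiveTransform(src, dst)     # image -> bird's eye
Minv = cv2.getPerspectiveTransform(dst, src)  # bird's eye -> image

warped = cv2.warpPerspective(undistorted, M, (1280, 720),
                             flags=cv2.INTER_LINEAR)
```

The inverse matrix Minv is kept so that lane lines detected in the warped view can be drawn back onto the original camera image.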

Detected Lane Lines (Image by author)
