Computer Vision Fundamentals — Self Driving Cars (Finding Lane Lines)
This Medium article focuses on Computer Vision techniques behind how self-driving cars work.
Welcome to this Medium article. Computer Vision is an amazing field. In this article, we're going to learn things like edge extraction and color spaces, the stuff that Computer Vision people know inside and out. The most important thing to remember is that it's actually really easy to get started, and it's really important to have fun 😄
Making computers see is an amazing skill. Our job is to teach the car how to drive itself, and in order to do that, we need to teach the car how to perceive the world around it. Now, when I drive, I use my eyes to figure out how fast to go, where the lane lines are, and where to turn. A car doesn't have eyes, but in a self-driving car we can use cameras and other sensors to achieve a similar function.
That’s right. So, let’s think about what those cameras are seeing as we drive down the road. You and I can see where the lane lines are automatically but we need to teach the car how to do that. So, our goal is to write code to identify and track the position of the lane lines in a series of images.
Here’s a picture of a stretch of wide open highway.
What kinds of features do you think would be helpful to figure out where the lane lines are in this image?
For starters, let’s try finding the lane lines using color. The lane lines are white, so how do we select the white pixels in an image?
To select a color, we first need to think about what color actually means in the case of digital images. In this case, it means that our image is actually made up of a stack of three images, one each for red, green, and blue. These images are sometimes called color channels. Each of these color channels contains pixels whose values range from 0 to 255, where 0 is the darkest possible value and 255 is the brightest possible value. If 0 is dark and 255 is bright, what color would represent pure white in our combined red, green, and blue image?
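(Pure white, by the way, is [255, 255, 255]: full brightness in all three channels.) Here's a minimal sketch of that color selection in Python, assuming a hypothetical image file called test.jpg and purely illustrative threshold values:

```python
import matplotlib.image as mpimg
import numpy as np

# Read in the image; each color channel holds values from 0 to 255.
image = mpimg.imread('test.jpg')
color_select = np.copy(image)

# Illustrative thresholds: a pixel counts as "white enough" only if
# all three channels are at or above these values.
red_threshold = 200
green_threshold = 200
blue_threshold = 200

# Mark every pixel that falls below threshold in any channel,
# then black those pixels out, keeping only the near-white ones.
below_threshold = (image[:, :, 0] < red_threshold) | \
                  (image[:, :, 1] < green_threshold) | \
                  (image[:, :, 2] < blue_threshold)
color_select[below_threshold] = [0, 0, 0]
```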
Now, let’s focus on just the region of the image that interests us, namely, the region where the lane lines are. In this case, you can assume that the camera that took the image is mounted in a fixed position on the front of the car. Therefore, the lane lines will always appear in the same general region of the image.
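To encode that assumption in code, we can mask out everything outside a fixed region. Below is a minimal sketch that keeps only a triangular region of the image; the vertices are illustrative guesses that would be tuned to the actual camera mount and image size:

```python
import cv2
import numpy as np

image = cv2.imread('test.jpg')  # the same hypothetical image as above
height, width = image.shape[:2]

# Triangle spanning the bottom of the frame up toward the center;
# these vertices are assumptions, to be tuned for the real camera.
vertices = np.array([[(0, height - 1),
                      (width // 2, height // 2),
                      (width - 1, height - 1)]], dtype=np.int32)

# Fill the triangle with white in an all-black mask, then keep only
# the image pixels that fall inside it.
mask = np.zeros_like(image)
cv2.fillPoly(mask, vertices, (255, 255, 255))
region_select = cv2.bitwise_and(image, mask)
```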
When we talk about computer vision, we’re literally talking about using algorithms to let a computer see the world like we see it, full of depth and color and shapes and meaning. Computer Vision is a really broad field and right now we’re just going to focus on a few features.
At this point, I’ll introduce you to two powerful techniques that you’ll use for this task, where you’ll be identifying the lane lines in an image of the road.
Canny Edge Detection
The first technique we'll look at is Canny Edge Detection. So let's give it a go.
Looking at a grayscale image, I see bright points, dark points, and all the gray in between. Rapid changes in brightness are where we find the edges. Our image is just a mathematical function of x and y, so we can perform mathematical operations on it just like on any other function.
For example, we can take its derivative which is just a measure of change of this function. A small derivative means small change, big derivative, big change. Images are two dimensional, so it makes sense to take the derivative with respect to x and y simultaneously. This is called the gradient and in computing it, we’re measuring how fast pixel values are changing at each point in an image and in which direction they’re changing most rapidly.
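As a concrete sketch of that idea, OpenCV's Sobel operator approximates the two partial derivatives, from which we can compute the gradient's magnitude and direction at every pixel. This isn't the Canny call itself, just the underlying math; the 3×3 kernel size is an illustrative choice:

```python
import cv2
import numpy as np

gray = cv2.imread('test.jpg', cv2.IMREAD_GRAYSCALE)

# Approximate the partial derivatives with 3x3 Sobel kernels.
grad_x = cv2.Sobel(gray, cv2.CV_64F, 1, 0, ksize=3)  # change along x
grad_y = cv2.Sobel(gray, cv2.CV_64F, 0, 1, ksize=3)  # change along y

# Magnitude: how fast pixel values are changing at each point.
magnitude = np.sqrt(grad_x**2 + grad_y**2)

# Direction: which way they're changing most rapidly.
direction = np.arctan2(grad_y, grad_x)
```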
Computing the gradient gives us thick edges. With the Canny algorithm, we thin out these edges to find just the individual pixels that follow the strongest gradients. We then extend those strong edges to include pixels all the way down to a lower threshold that we define when calling the Canny function.
The name Canny refers to John F. Canny, who developed this edge detection algorithm in 1986. With edge detection, the goal is to identify the boundaries of an object in an image. So, to do that, first, I’ll convert to grayscale. All right.
And next, I'll compute the gradient. Nice. Okay. So now we're looking at the gradient image, where the brightness of each pixel corresponds to the strength of the gradient at that point. We're going to find edges by tracing out the pixels that follow the strongest gradients. All right. But the gradient world is a little scary :D
By identifying edges, we can more easily detect objects by their shape. So, what exactly is an edge? In this case, I'm applying the Canny function to an image called gray, and the output will be another image called edges. The low threshold and high threshold determine how strong the edges must be to be detected.
You can think of the strength of an edge as being defined by how different the values are in adjacent pixels in the image; really just the strength of the gradient. Next, I’ll show you how that works, so that you have a clear picture of what’s going on under the hood when you use the Canny edge detection method in OpenCV.
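Here's what that looks like as a minimal OpenCV sketch. The Gaussian blur step and the specific threshold values are illustrative choices, not the only reasonable ones:

```python
import cv2

image = cv2.imread('test.jpg')
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

# A light Gaussian blur suppresses noise so it isn't mistaken for edges.
blur_gray = cv2.GaussianBlur(gray, (5, 5), 0)

# Edges stronger than high_threshold are kept; weaker edges survive only
# where they connect to strong ones, down to low_threshold.
low_threshold = 50    # illustrative value
high_threshold = 150  # illustrative value; a 1:2 or 1:3 ratio is typical
edges = cv2.Canny(blur_gray, low_threshold, high_threshold)
```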
Hough Transform
Okay, so now I've taken a grayscale image and, using edge detection, turned it into an image full of dots, but only the dots that represent edges in the original image. Now let's connect the dots. I could connect the dots to look for any kind of shape in my image, but in this case I'm looking for lines.
To find lines, I need to first adopt a model of a line and then fit that model to the assortment of dots in my edge detected image. Keeping in mind that my image is just a function of x and y, I can use the old familiar equation of a line y = mx + b.
In this case my model includes two parameters, m and b. In image space, a line is plotted as y versus x, but in parameter space, which we will call Hough space, I can represent that same line as m versus b instead. The Hough transform is just a conversion from image space to Hough space, so a line in image space becomes a single point at the position (m, b) in Hough space.
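To make that concrete: the line y = 2x + 1 has m = 2 and b = 1, so in Hough space it shows up as the single point (2, 1). Going the other way, a single point (x₀, y₀) in image space is consistent with every line passing through it, that is, every (m, b) satisfying b = y₀ − x₀m, which is itself a line in Hough space.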
So our strategy to find lines in image space will be to look for intersecting lines in Hough space. We do this by dividing up our Hough space into a grid and defining intersecting lines as all lines passing through a given grid cell. To do this, I'll first run the Canny edge detection algorithm to find all points associated with edges in my image. I can then consider every point in this edge-detected image as a line in Hough space.
And where many lines in Hough space intersect, I declare that I have found a collection of points that describe a line in image space. We have a problem, though: vertical lines have infinite slope in the m-b representation, so we need a new parameterization. Let's redefine our line in polar coordinates, where every point (x, y) on the line satisfies ρ = x·cos(θ) + y·sin(θ). Now the variable ρ describes the perpendicular distance of the line from the origin, and θ is the angle that perpendicular makes with the horizontal axis.
Now each point in image space corresponds to a sine curve in Hough space. If we take a whole line of points, it translates into a whole bunch of sine curves in Hough space. And again, the intersection of those sine curves in θ-ρ space gives the parameterization of the line.
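In OpenCV, this grid-and-voting procedure is implemented by cv2.HoughLinesP, which returns the endpoints of the detected line segments. Here's a minimal sketch, continuing from the edges image produced by the Canny step above, with every parameter value as an illustrative starting point:

```python
import cv2
import numpy as np

# `edges` is the binary output of the Canny step above.
rho = 2              # distance resolution of the grid, in pixels
theta = np.pi / 180  # angular resolution of the grid, in radians
threshold = 15       # minimum votes (intersections) in a grid cell
min_line_len = 40    # shortest segment to accept, in pixels
max_line_gap = 20    # largest gap to bridge between segments

lines = cv2.HoughLinesP(edges, rho, theta, threshold, np.array([]),
                        minLineLength=min_line_len,
                        maxLineGap=max_line_gap)

# Draw the detected segments onto a blank color image.
line_img = np.zeros((edges.shape[0], edges.shape[1], 3), dtype=np.uint8)
for line in lines:
    for x1, y1, x2, y2 in line:
        cv2.line(line_img, (x1, y1), (x2, y2), (255, 0, 0), 5)
```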
With this, we have come to the end of this article. Thanks for reading and following along. Hope you loved it!
My LinkedIn :)