Image Segmentation with Edge Detection

Vidush Vishwanath
5 min read · Sep 20, 2020

Image segmentation is the process of partitioning an image into sets of pixels. Pixels within the same set or “label” share certain characteristics such as color, brightness, intensity, or texture. Many of the common applications of segmentation center on object detection and recognition. There are several techniques for performing image segmentation. Among the most common are clustering, which places pixels into the same category based on color and position; histogram methods, which label pixels according to intensity; and edge detection, which relies on edges to differentiate one object from another.

Using edges

Edges within an image always make for a great distinguishing feature because of their resilience to changes in lighting, color, and brightness. Furthermore, knowing the edges of an object reveals its shape, which is essential for object recognition.

So edges make for great features and are useful for segmentation, but how do we find them? Edges in images can be defined simply as rapid changes in intensity. Therefore, an edge is wherever the derivative of intensity with respect to space is above a certain predefined threshold.

Image edges and corresponding intensity function <source>
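To make this definition concrete, here is a minimal sketch in Python with NumPy (the intensity values and the threshold are made up just for illustration): one row of pixel intensities is treated as a 1D signal, and any point where the discrete derivative exceeds the threshold is marked as an edge.

```python
import numpy as np

# One row of pixel intensities with a sharp rise and a sharp fall.
intensity = np.array([10, 11, 10, 12, 90, 92, 91, 90, 15, 12], dtype=float)

# Discrete derivative: change between consecutive pixels.
derivative = np.abs(np.diff(intensity))

threshold = 30.0  # assumed value, tuned per image in practice
edge_positions = np.where(derivative > threshold)[0]

print(edge_positions)  # [3 7]: the two sharp intensity jumps
```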

Blurring the image

Often, edges within images are not as well defined as in the picture above. Noise in the intensity function can wreak havoc on a detector that marks every pixel where the change in intensity exceeds the threshold. Therefore, an essential step before running an edge detector is smoothing. One of the most common smoothing techniques is to convolve the image with a Gaussian function.

Gaussian function visualized (left) and values (right)
Example of Convolution <source>

Let’s recall how convolution works: the kernel (here, the Gaussian matrix) slides over the image, and at each position every element of the kernel is multiplied by the pixel it covers; the sum of these products becomes the new value of the center pixel. Note that the largest value in a Gaussian kernel is always the center one, and that all values should add to 1 to preserve the overall brightness of the image. Essentially, the operation brings part of each surrounding pixel’s intensity into the center pixel (a blur). The blur reduces high-frequency components within the image (like a low-pass filter), and what’s left after the operation is a smoothed image.

Example of image smoothing <source>
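As a rough sketch of this step in Python with NumPy and SciPy (the kernel size and sigma are assumptions to be tuned per image), a normalized Gaussian kernel can be built and convolved with the image:

```python
import numpy as np
from scipy.signal import convolve2d

def gaussian_kernel(size=5, sigma=1.0):
    """Build a 2D Gaussian kernel whose values sum to 1."""
    ax = np.arange(size) - size // 2
    xx, yy = np.meshgrid(ax, ax)
    kernel = np.exp(-(xx**2 + yy**2) / (2 * sigma**2))
    return kernel / kernel.sum()  # normalize to preserve brightness

image = np.random.rand(64, 64)  # stand-in for a grayscale image
kernel = gaussian_kernel(size=5, sigma=1.4)

# mode="same" keeps the output the same size as the input;
# boundary="symm" mirrors border pixels so the frame doesn't darken.
smoothed = convolve2d(image, kernel, mode="same", boundary="symm")
```

The kernel size and sigma trade off noise removal against detail: a larger sigma suppresses more noise but also washes out weaker edges.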

Taking the Gradient

In order to detect edges, we want to know the change in intensity with respect to both the x and y directions. The gradient of an image is defined as

$$\nabla I = \left(\frac{\partial I}{\partial x},\ \frac{\partial I}{\partial y}\right)$$

where the change in x direction can be found by

$$\frac{\partial I}{\partial x} \approx I(x+1,\,y) - I(x,\,y)$$

and the change in the y direction can be found similarly, by applying the same equation to consecutive points in the y direction.

The final magnitude of the gradient (the number we will use in our algorithm) takes into account the change in both the x and y directions:

$$\|\nabla I\| = \sqrt{\left(\frac{\partial I}{\partial x}\right)^2 + \left(\frac{\partial I}{\partial y}\right)^2}$$

What we are left with is a matrix of magnitudes. This magnitude captures the change in intensity in both the vertical and horizontal directions. Elements in the matrix that appear to be local maxima are likely to correspond to edges within the image.
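A minimal sketch of this step in Python with NumPy (np.gradient uses central differences, a common variant of the finite differences above) might look like:

```python
import numpy as np

def gradient_magnitude(image):
    """Per-pixel gradient magnitude sqrt(Ix^2 + Iy^2)."""
    # np.gradient returns changes along axis 0 (rows, y) and axis 1 (columns, x).
    dy, dx = np.gradient(image.astype(float))
    return np.sqrt(dx**2 + dy**2)

smoothed = np.random.rand(64, 64)  # stand-in for the smoothed image
magnitude = gradient_magnitude(smoothed)
```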

Non max suppression

In practice, simply performing edge detection results in thick edges because the intensity likely exhibits a high rate of change over several pixels instead of one. In order to output thin edges, we perform non-maximum suppression. Essentially, for each pixel that is designated as an edge, we look at each surrounding pixel to note the gradient value. We keep a pixel marked as an edge only if it exhibits the highest rate of change in intensity in the surrounding area.

Before (left) and after non max suppression (right) <source>
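Here is a rough sketch of that idea in Python with NumPy and SciPy: a pixel is kept only if its gradient magnitude is above the threshold and is the largest in its 3x3 neighborhood. Classical Canny-style detectors compare only along the gradient direction, so this whole-neighborhood check is a simplified variant; the threshold is an assumption.

```python
import numpy as np
from scipy.ndimage import maximum_filter

def non_max_suppression(magnitude, threshold):
    """Boolean mask of thin edges: local maxima of the gradient magnitude."""
    is_edge = magnitude > threshold                        # candidate edge pixels
    local_max = magnitude == maximum_filter(magnitude, size=3)
    return is_edge & local_max

magnitude = np.random.rand(64, 64)  # stand-in for the gradient magnitude
thin_edges = non_max_suppression(magnitude, threshold=0.9)
```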

Linking

As can be seen in the image above, non max suppression thinned out the edges, but resulted in a rather disconnected figure. The final step to achieving edge detection with thin edges and a connected figure is linking. This involves finding edges with a lower threshold for the gradient. Edges found by this method are called weak edges, as opposed to the strong edges that are part of the image after non maximum suppression.

However, since weak edges are defined by a lower threshold, they contain noise and may not correspond to actual edges in the image. The final step of the process is to keep weak edges only if they are connected to strong edges. What’s left is an image with a connected figure, where most edges are marked and there is minimal noise.
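A sketch of this linking step in Python with NumPy and SciPy (both thresholds are assumptions to be tuned per image): weak and strong masks are built from the two thresholds, and a connected-component pass keeps only the weak regions that touch at least one strong edge.

```python
import numpy as np
from scipy.ndimage import label

def link_edges(magnitude, low, high):
    """Keep strong edges plus any weak edges connected to them."""
    strong = magnitude > high
    weak_or_strong = magnitude > low       # strong pixels are included here too
    labels, _ = label(weak_or_strong)      # connected components of candidate edges
    # Keep components whose label appears under at least one strong pixel.
    strong_labels = np.unique(labels[strong])
    return np.isin(labels, strong_labels) & weak_or_strong

magnitude = np.random.rand(64, 64)         # stand-in for the gradient magnitude
edges = link_edges(magnitude, low=0.5, high=0.9)
```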

Shortcomings

While the edge detector algorithm provides a simple and effective method for image segmentation, it has its weaknesses, namely with textured objects and with edges in low-contrast environments.

Example of textured objects <source>

Consider how poorly the algorithm would perform on the trees in the image above where it will likely mark a single tree as several different objects only because of the texture on its surface. Next, consider the polar bear below.

Low contrast edges below polar bear <source>

The bottom edge of the polar bear occurs in extremely low contrast. An algorithm such as this edge detector, which relies on changes in intensity, would have a difficult time detecting this edge.

In order to address these challenges, we may look beyond detectors that depend on changes in contrast and instead turn to other segmentation techniques, such as clustering pixels into categories by position, color, or texture.
