10.7 Local Binary Patterns

Local Binary Patterns, or LBPs for short, are a texture descriptor made popular by Ojala et al. in their 2002 paper, Multiresolution Gray-Scale and Rotation Invariant Texture Classification with Local Binary Patterns (the operator itself first appeared in their earlier work in the mid-1990s).

Unlike Haralick texture features that compute a global representation of texture based on the Gray Level Co-occurrence matrix, LBPs instead compute a local representation of texture. This local representation is performed by comparing each pixel with its surrounding neighborhood of pixel values.

Perhaps most notably, LBPs have been successfully used to perform robust face recognition, as demonstrated by Ahonen et al., in their paper Face recognition with local binary patterns.

LBPs are implemented in both mahotas and scikit-image. Both implementations work well; however, I prefer the scikit-image implementation, which is (1) easier to use and (2) implements recent extensions to LBPs that further improve rotation invariance, leading to higher accuracy and smaller feature vector sizes.

What are Local Binary Patterns used to describe?

Local Binary Patterns are used to characterize the texture and pattern of an image/object in an image. However, unlike Haralick texture features, LBPs process pixels locally which leads to a more robust, powerful texture descriptor.

How do Local Binary Patterns work?

LBPs compute a local representation of texture by comparing each pixel with its surrounding neighborhood.

The first step in constructing the LBP texture descriptor is to convert the image to grayscale. For each pixel in the grayscale image, we select a neighborhood of size r surrounding the center pixel. A LBP value is then calculated for this center pixel and stored in an output 2D array with the same width and height as our input image.

For example, let’s take a look at the original LBP descriptor, which operates on a fixed 3 x 3 neighborhood of pixels, just like this:

In the above figure we take the center pixel and threshold it against its neighborhood of 8 pixels. If the intensity of the center pixel is greater than or equal to the intensity of its neighbor, then we set the value to 1; otherwise, we set it to 0.

From there, we need to calculate the LBP value for the center pixel. We can start from any neighboring pixel and work our way clockwise or counter-clockwise, but our ordering must be consistent for all pixels in our image and all images in our dataset.

Given a 3 x 3 neighborhood, we thus have 8 neighbors that we must perform a binary test on. The results of this binary test are stored in an 8-bit binary array, which we then convert to decimal, like this:

In this example we start at the top-right point and work our way clockwise, accumulating the binary string as we go along. We can then convert this binary string to decimal, yielding a value of 23.
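The per-pixel computation described above can be sketched in a few lines of NumPy. This is my own illustrative helper (`lbp_3x3` is a hypothetical name, not from the text), using the thresholding convention of this section (bit = 1 when the center is greater than or equal to the neighbor; some implementations use the reverse). The starting neighbor here is the top-left rather than the top-right, which is fine as long as the ordering is consistent:

```python
import numpy as np

def lbp_3x3(patch):
    # threshold the 8 neighbors against the center pixel: a neighbor
    # contributes a 1 bit when center >= neighbor, 0 otherwise
    center = patch[1, 1]
    # clockwise ordering starting from the top-left neighbor; the start
    # point is arbitrary, but it must be the same for every pixel in
    # every image in the dataset
    neighbors = [patch[0, 0], patch[0, 1], patch[0, 2], patch[1, 2],
                 patch[2, 2], patch[2, 1], patch[2, 0], patch[1, 0]]
    bits = [1 if center >= n else 0 for n in neighbors]
    # interpret the 8-bit binary string as a decimal LBP code
    return int("".join(str(b) for b in bits), 2)

# an example 3 x 3 grayscale patch (made-up values)
patch = np.array([[5, 4, 3],
                  [9, 4, 1],
                  [8, 7, 2]])
print(lbp_3x3(patch))  # 120
```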

This value is stored in the output LBP 2D array, which we can then visualize below:

Here’s an example of computing and visualizing a full LBP 2D array:

The last step is to compute a histogram over the output LBP array. Since a 3 x 3 neighborhood has 2^8 = 256 possible binary patterns, our LBP 2D array thus has a minimum value of 0 and a maximum value of 255, allowing us to construct a 256-bin histogram of LBP codes as our feature vector:
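The whole original-LBP pipeline up to this point can be sketched in pure NumPy. This is a naive, unoptimized illustration (in practice you would use scikit-image or mahotas, covered below); border pixels are simply left at 0 for brevity:

```python
import numpy as np

def lbp_image(gray):
    # compute the original 3x3 LBP code for every interior pixel
    # (border pixels are left at code 0 for simplicity)
    h, w = gray.shape
    out = np.zeros((h, w), dtype="uint8")
    # clockwise neighbor offsets starting at the top-left
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            code = 0
            for (dy, dx) in offsets:
                code = (code << 1) | int(gray[y, x] >= gray[y + dy, x + dx])
            out[y, x] = code
    return out

# a small random array stands in for a real grayscale image
rng = np.random.default_rng(7)
gray = rng.integers(0, 256, size=(32, 32)).astype("uint8")
lbp = lbp_image(gray)

# 2^8 = 256 possible codes -> a 256-bin histogram feature vector
(hist, _) = np.histogram(lbp.ravel(), bins=256, range=(0, 256))
print(hist.shape)  # (256,)
```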

There are 2 primary benefits of this original LBP algorithm proposed by Ojala et al.: it is very fast to compute, and it captures extremely fine-grained, small-scale details in the image.

Problem: however, being able to capture details at such a small scale is also the biggest drawback of the algorithm. We cannot capture details at varying scales, only the fixed 3 x 3 scale!

To handle this, an extension to the original LBP implementation was proposed to handle variable neighborhood sizes. To account for variable neighborhood sizes, 2 parameters were introduced:

  1. The number of points p in a circularly symmetric neighborhood to consider (thus removing the reliance on a square neighborhood).
  2. The radius of the circle r, which allows us to account for different scales.

Using this approach, the LBP descriptor now has (theoretically) no limitations on the size of the neighborhood r or the number of points p. However, there is a substantial computational cost as the size of both r and p increases.
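To make the circular neighborhood concrete, here is a small sketch (my own illustration, with a hypothetical helper name) of where the p sample points land for a given radius r. Points that do not fall exactly on a pixel center are bilinearly interpolated by real LBP implementations:

```python
import numpy as np

def sample_coordinates(p, r):
    # (x, y) positions of the p circularly symmetric sample points at
    # radius r around a center pixel at the origin
    angles = 2.0 * np.pi * np.arange(p) / p
    return np.column_stack((r * np.cos(angles), -r * np.sin(angles)))

# p=8, r=1.0 approximately recovers the original 3 x 3 neighborhood
print(np.round(sample_coordinates(8, 1.0), 2))
```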

Below we can see the 36 unique rotation invariant binary patterns that can occur in a neighborhood of p=8 points, where black circles have a bit value of 0 and white circles have a bit value of 1:

The LBP pattern marked as 0 detects bright regions in an image (since the center is surrounded by pixels with intensities smaller than its own). The LBP pattern marked 8 detects dark spots in the image, since all pixels surrounding the center are larger. Finally, the pattern marked 4 detects edge regions in the image where there is a transition from dark to light.
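We can verify the count of 36 with a quick sketch (my own sanity check, not code from the text): map each of the 2^8 = 256 possible codes to the minimum value over all of its circular bit rotations, then count the distinct results.

```python
def rotation_invariant(code, p=8):
    # the canonical rotation-invariant form of a p-bit LBP code is the
    # minimum over all of its circular bit rotations
    return min(((code >> i) | (code << (p - i))) & ((1 << p) - 1)
               for i in range(p))

# count the distinct rotation-invariant patterns for p = 8
unique = {rotation_invariant(code) for code in range(256)}
print(len(unique))  # 36
```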

Furthermore, the top row of above figure demonstrates another important extension to Local Binary Patterns: uniformity.

A LBP is considered to be uniform if it has at most two 0-1 or 1-0 transitions. For example, the patterns 00001000 (2 transitions) and 10000000 (1 transition) are both considered uniform since they contain at most two such transitions. The pattern 01010010, on the other hand, is not considered uniform since it has six 0-1 or 1-0 transitions.
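The uniformity test is easy to express in code. One note on my sketch below (a hypothetical helper, not from the text): implementations typically count transitions circularly, including the wrap-around from the last bit back to the first, whereas the counts quoted above are over the linear string. Under either definition, the example patterns are classified the same way:

```python
def is_uniform(pattern):
    # count circular 0-1 and 1-0 transitions in the bit string
    # (the wrap-around from the last bit back to the first counts);
    # a pattern is uniform if it has at most two such transitions
    transitions = sum(pattern[i] != pattern[(i + 1) % len(pattern)]
                      for i in range(len(pattern)))
    return transitions <= 2

print(is_uniform("00001000"))  # True
print(is_uniform("10000000"))  # True
print(is_uniform("01010010"))  # False
```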

The number of uniform prototypes in a Local Binary Pattern is completely dependent on the number of points p. As the value of p increases, so will the dimensionality of your resulting histogram.

It’s also important to keep in mind the effect of both the radius r and the number of points p.
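Assuming scikit-image's "uniform" method (used throughout this chapter), the histogram has p + 2 bins: p + 1 rotation invariant uniform prototypes plus one catch-all bin for every non-uniform pattern. A quick illustration of how the dimensionality grows with p:

```python
# feature vector size of the "uniform" LBP histogram for several
# common (p, r) choices: p + 2 bins per histogram
for (p, r) in [(8, 1.0), (16, 2.0), (24, 3.0)]:
    print("p=%d, r=%.1f -> %d-d histogram" % (p, r, p + 2))
```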

Finally, it’s important to consider the spatial information of the LBP. If we took all LBP codes and constructed a histogram of them we would lose all spatial information, similar to constructing a color histogram:

On the other hand, if we divide our image into blocks, extract LBPs for each block, and concatenate them together, we are able to create a descriptor that encodes spatial information:

The spatial encoding step is certainly not necessary, but for tasks such as face recognition it’s crucial. We’ll utilize this spatial encoding more when we explore face recognition with LBPs.
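As a sketch of this block-based scheme (the 4 x 4 grid and the p=8, r=1 parameters are my own illustrative choices, not the chapter's settings), assuming scikit-image is installed:

```python
import numpy as np
from skimage import feature

def spatial_lbp(gray, num_points=8, radius=1, grid=(4, 4)):
    # compute uniform LBPs once over the whole image
    lbp = feature.local_binary_pattern(gray, num_points, radius,
                                       method="uniform")
    (h, w) = lbp.shape
    (gy, gx) = grid
    hists = []
    # histogram each block separately, then concatenate, so the final
    # descriptor encodes *where* each texture pattern occurred
    for i in range(gy):
        for j in range(gx):
            block = lbp[i * h // gy:(i + 1) * h // gy,
                        j * w // gx:(j + 1) * w // gx]
            (hist, _) = np.histogram(block.ravel(),
                                     bins=range(0, num_points + 3))
            hists.append(hist.astype("float") / (hist.sum() + 1e-7))
    return np.concatenate(hists)

# a random array stands in for a real grayscale image
rng = np.random.default_rng(0)
gray = rng.integers(0, 256, size=(64, 64)).astype("uint8")
print(spatial_lbp(gray).shape)  # 4 * 4 blocks * 10 bins = (160,)
```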

How do I use Local Binary Patterns?

You can use Local Binary Patterns by using either the scikit-image or mahotas packages.

Below follows an example using scikit-image:

# import the necessary packages
from skimage import feature
import numpy as np

# define the parameters of the Local Binary Patterns
numPoints = 24
radius = 3

# extract the histogram of Local Binary Patterns
lbp = feature.local_binary_pattern(gray, numPoints, radius,
    method="uniform")
(hist, _) = np.histogram(lbp.ravel(),
    bins=range(0, numPoints + 3),
    range=(0, numPoints + 2))

# optionally normalize the histogram
eps = 1e-7
hist = hist.astype("float")
hist /= (hist.sum() + eps)

And this example details how to use mahotas to extract LBPs:

import mahotas
hist = mahotas.features.lbp(gray, radius, points)

In general, I recommend using the scikit-image implementation of LBPs, as it offers more control over the types of LBP histograms you want to generate. Furthermore, the scikit-image implementation also includes variants of LBPs that improve rotation and grayscale invariance.

Building a mini fashion search engine using texture

So imagine this:

  1. You see a piece of clothing you like, whether in a department store, in a magazine, or even a person walking down the street.
  2. You pull out your smartphone and snap a photo of the piece of clothing.
  3. Your phone takes this image, analyzes it, and finds similar items of clothing (at a cheaper price) online.

We are going to tackle a small part of this problem — how to rank pieces of clothing for similarity based on texture.

The first thing we’ll need for this demo is a dataset of clothing. To gather this dataset, I went to Nordstrom.com and gathered 10 images of men’s dress shirts, which you can see below:

In this dataset we have 4 plain, uniformly textured shirts, followed by 6 striped/checkerboard-patterned shirts. We’ll be using LBPs to discriminate between these two types of shirts.

I also snapped a few photos of my own clothing that we can use to query our mini fashion search engine:

The goal here is to submit each of the query images to our fashion search engine, have our image search engine rank the images using Local Binary Patterns, and then return the most similar images based on the texture/pattern like this:

Let’s go ahead and get this demo started by defining the directory structure for our project:

We’ll be creating a pyimagesearch module to help keep our code organized. And within the pyimagesearch module we’ll create a descriptors sub-module where our Local Binary Patterns implementation will be stored.

We start by importing the feature sub-module of scikit-image which contains the implementation of the Local Binary Patterns descriptor.

Our constructor of the LBP descriptor takes 2 parameters: the number of points along the outer radius, along with the radius of the pattern surrounding the central pixel.

From there, we define our describe method, which accepts a single required argument — the image we want to extract LBPs from.

The actual LBP computation is handled by the call to feature.local_binary_pattern, using our supplied number of points and radius. The uniform method indicates that we are computing the rotation invariant, uniform form of LBPs.

# import the necessary packages
from skimage import feature
import numpy as np

class LocalBinaryPatterns:
    def __init__(self, numPoints, radius):
        # store the number of points and radius
        self.numPoints = numPoints
        self.radius = radius

    def describe(self, image, eps=1e-7):
        # compute the Local Binary Pattern representation of the
        # image, and then use the LBP representation to build the
        # histogram of patterns
        lbp = feature.local_binary_pattern(image, self.numPoints,
            self.radius, method="uniform")
        (hist, _) = np.histogram(lbp.ravel(),
            bins=range(0, self.numPoints + 3),
            range=(0, self.numPoints + 2))

        # normalize the histogram
        hist = hist.astype("float")
        hist /= (hist.sum() + eps)

        # return the histogram of Local Binary Patterns
        return hist

However, the lbp variable returned by the local_binary_pattern function is not directly usable as a feature vector. Instead, lbp is a 2D array with the same width and height as our input image. For numPoints=24, each of the values inside lbp falls in the range [0, 25]: one value for each of the 25 possible rotation invariant uniform prototypes, plus an extra value for all patterns that are not uniform, yielding a total of 26 unique possible values.

Thus, to construct the actual feature vector, we need to make a call to np.histogram, which counts the number of times each of the LBP prototypes appears. The returned histogram is 26-d, an integer count for each of the prototypes. We then take this histogram, normalize it such that it sums to 1, and return it to the calling function.

Now that our LocalBinaryPatterns descriptor is defined, let’s see how we can use it to build a mini fashion search engine: [search_shirts.py]

We start off by importing our packages and parsing our command line arguments. We require two switches: --dataset, the path to our directory of shirt images, and --query, the query image we are submitting to the mini fashion search engine.

We’ll then initialize our LocalBinaryPatterns descriptor with numPoints=24 and radius=8.

# import the necessary packages
from __future__ import print_function
from pyimagesearch import LocalBinaryPatterns
from imutils import paths
import numpy as np
import argparse
import cv2

# construct the argument parser and parse the arguments
ap = argparse.ArgumentParser()
ap.add_argument("-d", "--dataset", required=True,
    help="path to the dataset of shirt images")
ap.add_argument("-q", "--query", required=True,
    help="path to the query image")
args = vars(ap.parse_args())

# initialize the local binary patterns descriptor and initialize the
# index dictionary where the image filename is the key and the
# features are the value
desc = LocalBinaryPatterns(24, 8)
index = {}

# loop over the shirt images
for imagePath in paths.list_images(args["dataset"]):
    # load the image, convert it to grayscale, and describe it
    image = cv2.imread(imagePath)
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    hist = desc.describe(gray)

    # update the index dictionary
    filename = imagePath[imagePath.rfind("/") + 1:]
    index[filename] = hist

# load the query image and extract Local Binary Patterns from it
query = cv2.imread(args["query"])
queryFeatures = desc.describe(cv2.cvtColor(query, cv2.COLOR_BGR2GRAY))

# show the query image and initialize the results dictionary
cv2.imshow("Query", query)
results = {}

# loop over the index
for (k, features) in index.items():
    # compute the chi-squared distance between the current features
    # and the query features, then update the dictionary of results
    d = 0.5 * np.sum(((features - queryFeatures) ** 2) /
        (features + queryFeatures + 1e-10))
    results[k] = d

# sort the results, keeping the 3 most similar
results = sorted([(v, k) for (k, v) in results.items()])[:3]

# loop over the results
for (i, (score, filename)) in enumerate(results):
    # show the result image
    print("#%d. %s: %.4f" % (i + 1, filename, score))
    image = cv2.imread(args["dataset"] + "/" + filename)
    cv2.imshow("Result #{}".format(i + 1), image)
    cv2.waitKey(0)

The index dictionary is important, although very subtle in the grand scheme of the project. The key to the dictionary is the unique shirt image filename and the value is the extracted LBP histogram. We’ll use this dictionary to store our extracted features and to aid us in comparing the query image to our dataset.

From there, we extract LBPs from our 10-image shirt dataset: we simply loop over the images, extract the LBPs, and update the index dictionary.

Next, we load our query image (i.e., the image that will be submitted to our system) and extract a LBP histogram from it.

The actual “search” takes place in the loop over the index: we compare each feature vector in our dataset to the query features using the chi-squared distance, then update our results dictionary.

We then sort our results (where smaller distances indicate higher similarity), keeping the 3 most similar.

Finally, we display the results of our search on our screen.

Let’s go ahead and submit query_01.jpg to our mini fashion search engine using the following command:

$ python search_shirts.py --dataset shirts --query queries/query_01.jpg

Notice how the top 3 results all have the same texture/pattern as the query image. While their colors are different, the actual pattern is the same: simple and plain.

However, notice what happens when we submit query_02.jpg to the system:

$ python search_shirts.py --dataset shirts --query queries/query_02.jpg

Here we have a striped, checkerboard shirt submitted to our mini fashion search engine. Notice how our results are different this time: instead of containing the plain, simple shirts, our results now include shirts with a pattern similar to the input image! And again, all of this was accomplished by quantifying the images using LBPs and then comparing them using the chi-squared distance.

Admittedly, this example is quite small (only 10 images), but the same principles apply even as our image dataset increases to thousands of images — we can use LBPs to characterize the texture and patterns of an image, and subsequently rank them for similarity based on these feature vectors.

*** A really great extension to this demo would be to incorporate color into the mini fashion search engine as well. Right now we can only search based on texture, but we could extend the engine to work with color too. If you’re interested in extending this demo, combining the LBP histogram with a color histogram is a natural place to start.

Suggestions when using Local Binary Patterns

The main point to realize when utilizing Local Binary Patterns is that the radius and number of points have a dramatic effect on (1) the dimensionality of your feature vector and (2) computational efficiency. The rotation invariant uniform implementation keeps the feature vector small (p + 2 bins, e.g., 26-d for p=24) rather than growing exponentially with p, but your computation times can still increase with larger neighborhoods.

Furthermore, the larger your radius and number of points, the slower your extraction will be. At some point, this extraction becomes prohibitively slow, so take care when using LBPs.

*** Personally, I always start off with p=8 and r=1.0 and perhaps work my way up to p=24 and r=3.0, increasing the radius further to see if my accuracy improves. I also tend to use rotation invariant LBPs whenever possible, as they have substantially smaller feature vector sizes and are easier to compute histograms for.

Pros:

  1. Captures fine-grained, local texture detail that global descriptors such as Haralick texture features can miss.
  2. The extended operator supports variable neighborhood sizes (radius r and points p) and rotation invariance.
  3. Uniform patterns keep the feature vector compact (p + 2 bins rather than 2^p).

Cons:

  1. The original 3 x 3 operator can only capture detail at a single, fixed scale.
  2. Extraction becomes increasingly (and eventually prohibitively) slow as r and p grow.
  3. A single global histogram of LBP codes discards all spatial information unless the image is divided into blocks.