Feature Extraction #

Feature extraction identifies and represents key attributes of an image, such as shape, texture, intensity, and motion. Shape features might include area and perimeter, while texture features could involve patterns and smoothness within the image. By transforming visual data into numerical representations, feature extraction reduces an image to a form that is simpler to analyze, which is what makes tasks such as classification and recognition practical in computer vision and image processing.

Purpose and Techniques #

The goal is to transform segmented objects into compact representations that describe their primary characteristics. Features can be based on:

Intensity
Shape
Texture
Motion

Feature extraction often requires prior segmentation to isolate objects of interest.

Feature Vectors and Spaces #

A feature vector is an array encoding numerical or symbolic data about an object. For example, a symbolic feature could be a categorical label such as ‘circle’ or ‘square,’ representing the shape of an object. This representation resides in a multidimensional feature space, making it easier to analyze object relationships. For example:

A square can be described by its area and perimeter.
A feature space allows visualization and comparison between different objects.
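
As a minimal sketch of the square example, the snippet below builds two-dimensional feature vectors (area, perimeter) for a few rectangles and compares them by Euclidean distance in feature space. The function names and the choice of Euclidean distance are assumptions made for illustration; in practice features are usually rescaled before distances are compared.

```python
import math

def rectangle_features(width, height):
    """Return the feature vector (area, perimeter) of an axis-aligned rectangle."""
    return (width * height, 2 * (width + height))

def euclidean_distance(u, v):
    """Distance between two feature vectors in feature space."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

square = rectangle_features(4, 4)       # a 4x4 square
similar = rectangle_features(4, 5)      # a nearly square rectangle
elongated = rectangle_features(1, 16)   # same area as the square, very different shape

d_similar = euclidean_distance(square, similar)
d_elongated = euclidean_distance(square, elongated)
```

Here the elongated rectangle has the same area as the square but lies much farther away in feature space, because the perimeter component separates them.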

Invariance and Robustness #

Feature design often includes invariance to:

Rotation: Recognition regardless of orientation.
Scaling: Identifying objects at different sizes.
Translation: Detecting objects regardless of position.

Invariance ensures robust performance across diverse conditions.
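
To make the idea concrete, the following sketch applies a rotation, a scaling, and a translation to a polygon and compares two features before and after. It assumes the shape is given as a list of vertices; the helper names and the ratio \(4\pi A / P^2\) used as the invariant feature are illustrative choices, not a prescribed method.

```python
import math

def polygon_area(pts):
    """Area of a simple polygon via the shoelace formula."""
    n = len(pts)
    total = sum(pts[i][0] * pts[(i + 1) % n][1] - pts[(i + 1) % n][0] * pts[i][1]
                for i in range(n))
    return abs(total) / 2

def polygon_perimeter(pts):
    """Sum of the edge lengths of the polygon."""
    n = len(pts)
    return sum(math.dist(pts[i], pts[(i + 1) % n]) for i in range(n))

def transform(pts, angle=0.0, scale=1.0, shift=(0.0, 0.0)):
    """Rotate by `angle` (radians), scale uniformly, then translate."""
    c, s = math.cos(angle), math.sin(angle)
    return [(scale * (c * x - s * y) + shift[0],
             scale * (s * x + c * y) + shift[1]) for x, y in pts]

def roundness(pts):
    """4*pi*A / P^2: unchanged by rotation, uniform scaling, and translation."""
    return 4 * math.pi * polygon_area(pts) / polygon_perimeter(pts) ** 2

rect = [(0, 0), (4, 0), (4, 2), (0, 2)]
moved = transform(rect, angle=math.radians(30), scale=3.0, shift=(10.0, -5.0))

area_before, area_after = polygon_area(rect), polygon_area(moved)
round_before, round_after = roundness(rect), roundness(moved)
```

The area changes with the square of the scale factor, so it is invariant to rotation and translation only, whereas the ratio stays the same under all three transformations (up to floating-point rounding).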

Binary Object Features #

In binary images, features include:

Area: The total number of pixels in an object.
Centroid: The object’s center of mass, calculated as: \( \bar{x} = \frac{1}{A} \sum_{x,y} x \cdot O(x, y), \quad \bar{y} = \frac{1}{A} \sum_{x,y} y \cdot O(x, y) \) where \(A\) is the object’s area, and \(O(x, y)\) indicates whether a pixel belongs to the object.
Moments: Quantify spatial distribution of pixel intensities. Moments are defined as: \( m_{pq} = \sum_{x=0}^{M-1} \sum_{y=0}^{N-1} x^p y^q f(x, y) \) where \(f(x, y)\) is the pixel value (0 or 1 in a binary image), so \(m_{00}\) equals the area.
Central Moments: Translation-invariant moments defined as: \( \mu_{pq} = \sum_{x=0}^{M-1} \sum_{y=0}^{N-1} (x - \bar{x})^p (y - \bar{y})^q f(x, y) \)
Normalized Central Moments: Scale-invariant moments, calculated as: \( n_{pq} = \frac{\mu_{pq}}{\mu_{00}^\gamma}, \quad \gamma = \frac{p + q}{2} + 1 \)
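Axis of Least Second Moment: Determines the orientation of the object as the axis through the centroid about which the pixels are least spread out. Its angle can be computed from the second-order central moments: \( \theta = \frac{1}{2} \arctan\left(\frac{2\mu_{11}}{\mu_{20} - \mu_{02}}\right) \)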
Projections: Summarize distribution of pixels along specific axes.
Euler Number: Measures topology of the object: \( E = \text{Number of Objects} - \text{Number of Holes} \)
Perimeter: Total length of the object’s boundary.
Thinness Ratio: Indicates roundness: \( T_i = \frac{4\pi A_i}{P_i^2} \)
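Eccentricity: Quantifies elongation: \( e = \frac{a}{b} \) where \(a\) is the length of the object’s major axis and \(b\) the length of its minor axis. This ratio form differs from the conic-section eccentricity \( \sqrt{1 - b^2/a^2} \).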
Aspect Ratio: Ratio of width to height: \( \text{Aspect Ratio} = \frac{\text{Width}}{\text{Height}} \)
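Elongatedness: Measures how stretched an object is: \( \text{Elongatedness} = \frac{\text{Perimeter}^2}{4 \pi \cdot \text{Area}} \) As defined here it is the reciprocal of the thinness ratio: it equals 1 for a circle and grows as the object becomes more stretched or irregular.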
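
Several of these definitions can be checked on a small example. The sketch below is a plain-Python illustration, assuming the binary image is a list of rows with \(x\) as the column index and \(y\) as the row index; the mask and all function names are hypothetical. The perimeter is passed in as a number rather than estimated from pixels, since boundary-length estimation on a grid depends on the convention used.

```python
import math

# Hypothetical binary mask: a rectangle of object pixels, 5 wide and 3 high.
MASK = [
    [0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 1, 1, 0],
    [0, 1, 1, 1, 1, 1, 0],
    [0, 1, 1, 1, 1, 1, 0],
    [0, 0, 0, 0, 0, 0, 0],
]

def raw_moment(mask, p, q):
    """m_pq = sum of x^p * y^q * f(x, y); x is the column, y is the row."""
    return sum((x ** p) * (y ** q) * v
               for y, row in enumerate(mask)
               for x, v in enumerate(row))

def centroid(mask):
    """Centre of mass (x_bar, y_bar) = (m10 / m00, m01 / m00)."""
    area = raw_moment(mask, 0, 0)
    return raw_moment(mask, 1, 0) / area, raw_moment(mask, 0, 1) / area

def central_moment(mask, p, q):
    """mu_pq: the moment taken about the centroid, hence translation-invariant."""
    cx, cy = centroid(mask)
    return sum(((x - cx) ** p) * ((y - cy) ** q) * v
               for y, row in enumerate(mask)
               for x, v in enumerate(row))

def normalized_central_moment(mask, p, q):
    """mu_pq / mu_00^gamma with gamma = (p + q) / 2 + 1."""
    gamma = (p + q) / 2 + 1
    return central_moment(mask, p, q) / central_moment(mask, 0, 0) ** gamma

def orientation(mask):
    """Angle (radians) of the axis of least second moment, in image coordinates."""
    return 0.5 * math.atan2(2 * central_moment(mask, 1, 1),
                            central_moment(mask, 2, 0) - central_moment(mask, 0, 2))

def thinness_ratio(area, perimeter):
    """4*pi*A / P^2; equals 1 for an ideal circle."""
    return 4 * math.pi * area / perimeter ** 2

area = raw_moment(MASK, 0, 0)   # number of object pixels
center = centroid(MASK)         # centre of the rectangle
angle = orientation(MASK)       # the rectangle's long side runs along x
```

Because image rows increase downward, a positive angle here corresponds to a clockwise rotation on screen. On a pixel grid, the scale invariance of the normalized central moments holds only approximately, since resampling changes the discretization.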

Histogram-Based Features #

Histograms provide statistical summaries of pixel intensities within an image or region. These features are simple yet powerful for analyzing intensity distributions. Common histogram-based features include:

Mean Intensity: The average pixel value: \( \text{Mean} = \frac{1}{N} \sum_{i=1}^{N} I_i \)
Standard Deviation: Measures the spread of pixel intensities: \( \sigma = \sqrt{\frac{1}{N} \sum_{i=1}^{N} (I_i - \text{Mean})^2} \)
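Skewness: Indicates the asymmetry of the histogram about its mean: \( \text{Skewness} = \frac{1}{N \sigma^3} \sum_{i=1}^{N} (I_i - \text{Mean})^3 \)
Kurtosis: Describes how peaked or heavy-tailed the histogram is: \( \text{Kurtosis} = \frac{1}{N \sigma^4} \sum_{i=1}^{N} (I_i - \text{Mean})^4 \)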

These features are particularly useful for tasks like contrast analysis, thresholding, and classification.
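
A minimal sketch of these statistics, computed directly from a flat list of pixel values, is shown below. It uses the population (divide by \(N\)) form of the standard deviation given above and the standard third and fourth standardized moments for skewness and kurtosis. Returning 0.0 for a constant region, where those two quantities are undefined, is an assumption made here for convenience.

```python
import math

def histogram_features(pixels):
    """Mean, standard deviation, skewness and kurtosis of a list of intensities."""
    n = len(pixels)
    mean = sum(pixels) / n
    variance = sum((p - mean) ** 2 for p in pixels) / n
    std = math.sqrt(variance)
    if std == 0:
        # Constant region: skewness and kurtosis are undefined; 0.0 is a placeholder.
        skewness = kurtosis = 0.0
    else:
        skewness = sum((p - mean) ** 3 for p in pixels) / (n * std ** 3)
        kurtosis = sum((p - mean) ** 4 for p in pixels) / (n * std ** 4)
    return {"mean": mean, "std": std, "skewness": skewness, "kurtosis": kurtosis}

# Hypothetical region with a symmetric intensity distribution.
features = histogram_features([10, 10, 20, 30, 30])
```

For this symmetric sample the skewness is zero; adding a few very bright pixels would make it positive.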

Texture Features #

Texture features describe the spatial arrangement of pixel intensities, capturing patterns that represent surface properties. Common methods include:

Gray-Level Co-Occurrence Matrix (GLCM) #

GLCM captures how often pairs of pixel intensities occur at a specific distance and angle. For instance, in medical image analysis, GLCM is used to analyze tissue textures, aiding in the identification of abnormalities like tumors or fibrosis. Features derived from GLCM include:

Contrast: Measures the intensity difference between pixels: \( \text{Contrast} = \sum_{i,j} |i - j|^2 \cdot P(i,j) \)
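Correlation: Indicates how correlated pixel intensities are: \( \text{Correlation} = \frac{\sum_{i,j} (i \cdot j \cdot P(i,j)) - \mu_i \mu_j}{\sigma_i \sigma_j} \) where \(\mu_i, \mu_j\) and \(\sigma_i, \sigma_j\) are the means and standard deviations of the row and column marginals of \(P\).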
Energy: Measures uniformity: \( \text{Energy} = \sum_{i,j} P(i,j)^2 \)
Homogeneity: Captures the closeness of pixel values: \( \text{Homogeneity} = \sum_{i,j} \frac{P(i,j)}{1 + |i - j|} \)
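
The sketch below builds a normalized GLCM for a single pixel offset and evaluates the four features using the formulas above. It is a plain-Python illustration under stated assumptions: the image is a list of rows of integer gray levels in `0..levels-1`, the matrix is not symmetrized, and the function names are hypothetical. Library implementations may differ in details such as symmetrization or the exact homogeneity weighting.

```python
import math

def glcm(image, dx, dy, levels):
    """Normalized co-occurrence matrix P(i, j) for the pixel offset (dx, dy)."""
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for y in range(rows):
        for x in range(cols):
            ny, nx = y + dy, x + dx
            if 0 <= ny < rows and 0 <= nx < cols:
                counts[image[y][x]][image[ny][nx]] += 1
    total = sum(map(sum, counts))
    return [[c / total for c in row] for row in counts]

def glcm_features(P):
    """Contrast, correlation, energy and homogeneity of a normalized GLCM."""
    idx = [(i, j) for i in range(len(P)) for j in range(len(P))]
    contrast = sum((i - j) ** 2 * P[i][j] for i, j in idx)
    energy = sum(P[i][j] ** 2 for i, j in idx)
    homogeneity = sum(P[i][j] / (1 + abs(i - j)) for i, j in idx)
    mu_i = sum(i * P[i][j] for i, j in idx)
    mu_j = sum(j * P[i][j] for i, j in idx)
    var_i = sum((i - mu_i) ** 2 * P[i][j] for i, j in idx)
    var_j = sum((j - mu_j) ** 2 * P[i][j] for i, j in idx)
    if var_i == 0 or var_j == 0:
        correlation = float("nan")  # undefined when one marginal is constant
    else:
        correlation = (sum(i * j * P[i][j] for i, j in idx) - mu_i * mu_j) \
            / math.sqrt(var_i * var_j)
    return {"contrast": contrast, "correlation": correlation,
            "energy": energy, "homogeneity": homogeneity}

# Hypothetical 4x4 image with four gray levels.
IMAGE = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]
P = glcm(IMAGE, dx=1, dy=0, levels=4)  # each pixel paired with its right neighbor
texture = glcm_features(P)
```

Changing `dx` and `dy` selects the distance and angle of the pixel pairs, so the same functions can be run for several offsets and the results combined into one feature vector.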

Local Binary Patterns (LBP) #

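LBP encodes texture by thresholding a pixel’s neighborhood against the value of the center pixel: each neighbor contributes one bit, set when the neighbor is at least as bright as the center, and the bits are read off as a binary pattern. Because only the ordering of intensities matters, LBP is robust to monotonic illumination changes. The basic operator is not rotation-invariant, since rotating the image shifts the bit pattern circularly; rotation-invariant variants map each pattern to a canonical rotation.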
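
A minimal sketch of the basic 3x3 operator follows. The neighbor ordering (clockwise from the top-left pixel) and the `>=` comparison are common conventions assumed here, and the function names are hypothetical. The second function shows one way to obtain a rotation-invariant code, by taking the smallest value over all circular bit rotations.

```python
# Clockwise neighbor offsets (dy, dx), starting at the top-left pixel.
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]

def lbp_code(image, y, x):
    """8-bit LBP code of the pixel at (y, x); the pixel must not lie on the border."""
    center = image[y][x]
    code = 0
    for bit, (dy, dx) in enumerate(OFFSETS):
        if image[y + dy][x + dx] >= center:
            code |= 1 << bit
    return code

def rotation_invariant(code, bits=8):
    """Smallest value among all circular rotations of the bit pattern."""
    mask = (1 << bits) - 1
    return min(((code >> k) | (code << (bits - k))) & mask for k in range(bits))

# Hypothetical 3x3 patch and the same patch rotated by 90 degrees clockwise.
PATCH = [[6, 5, 2],
         [7, 6, 1],
         [9, 8, 7]]
ROTATED = [[9, 7, 6],
           [8, 6, 5],
           [7, 1, 2]]

code = lbp_code(PATCH, 1, 1)
brighter = lbp_code([[v + 50 for v in row] for row in PATCH], 1, 1)
rotated_code = lbp_code(ROTATED, 1, 1)
```

Adding a constant to every pixel leaves the code unchanged, while rotating the patch changes the raw code but not its rotation-invariant form. A texture descriptor is then typically the histogram of codes over a region.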

Gabor Filters #

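Gabor filters extract texture by analyzing frequency and orientation. A Gabor filter is defined as: \(G(x, y; \lambda, \theta, \psi, \sigma, \gamma) = \exp\left(-\frac{x'^2 + \gamma^2 y'^2}{2 \sigma^2}\right) \cos\left(2 \pi \frac{x'}{\lambda} + \psi\right)\) where \(x' = x \cos\theta + y \sin\theta\) and \(y' = -x \sin\theta + y \cos\theta\) are the rotated coordinates, \(\lambda\) is the wavelength of the sinusoidal carrier, \(\theta\) is the orientation, \(\psi\) is the phase offset, \(\sigma\) is the standard deviation of the Gaussian envelope (which sets the scale), and \(\gamma\) is the spatial aspect ratio, which controls how elongated the envelope is.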
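
The formula can be sampled on a pixel grid to obtain a convolution kernel. The sketch below is a direct, unoptimized transcription, assuming an odd kernel size centered on the origin; the function name and the parameter values in the example are hypothetical.

```python
import math

def gabor_kernel(size, lam, theta, psi, sigma, gamma):
    """Sample the real Gabor function on a size x size grid centered at (0, 0)."""
    if size % 2 == 0:
        raise ValueError("size must be odd so the kernel has a center pixel")
    half = size // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            # Rotate the coordinates by theta.
            xr = x * math.cos(theta) + y * math.sin(theta)
            yr = -x * math.sin(theta) + y * math.cos(theta)
            # Gaussian envelope; gamma is the spatial aspect ratio.
            envelope = math.exp(-(xr ** 2 + gamma ** 2 * yr ** 2) / (2 * sigma ** 2))
            # Sinusoidal carrier with wavelength lam and phase offset psi.
            carrier = math.cos(2 * math.pi * xr / lam + psi)
            row.append(envelope * carrier)
        kernel.append(row)
    return kernel

# Example: a 9x9 kernel tuned to vertical stripes with a 4-pixel wavelength.
kernel = gabor_kernel(size=9, lam=4.0, theta=0.0, psi=0.0, sigma=2.0, gamma=0.5)
```

Texture features are usually obtained by convolving the image with a bank of such kernels at several orientations and wavelengths and summarizing each response, for example by its mean and variance.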

Boundary Descriptors (Contour-Based) #

Boundary descriptors focus on the outline of an object, providing insights into its shape. For example, in character recognition, boundary descriptors can help identify letters by analyzing the unique contours and edges of each character. These include:

Perimeter: Total length of the boundary.
Curvature: How sharply the boundary changes direction.
Signatures: One-dimensional representations of the boundary, such as radial distance from the centroid.

Signatures #

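A signature is a one-dimensional representation of the boundary, commonly plotting radial distance from the centroid as a function of angle: \(R(\theta) = \sqrt{(x - \bar{x})^2 + (y - \bar{y})^2}\) where \((x, y)\) is the boundary point lying at angle \(\theta\) from the centroid \((\bar{x}, \bar{y})\), that is, \(\theta = \operatorname{atan2}(y - \bar{y}, x - \bar{x})\).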
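
A minimal sketch of the centroid-distance signature is given below. It assumes the boundary is already available as a list of \((x, y)\) points and, for simplicity, takes the centroid as the mean of those points rather than of the filled region; the function name is hypothetical.

```python
import math

def centroid_distance_signature(boundary):
    """Return (theta, R) pairs for each boundary point, sorted by angle."""
    n = len(boundary)
    cx = sum(x for x, _ in boundary) / n
    cy = sum(y for _, y in boundary) / n
    signature = []
    for x, y in boundary:
        theta = math.atan2(y - cy, x - cx)   # angle in (-pi, pi]
        radius = math.hypot(x - cx, y - cy)  # radial distance from the centroid
        signature.append((theta, radius))
    return sorted(signature)

# Hypothetical boundaries: a circle of radius 5 and the outline of a 4x4 square.
circle = [(10 + 5 * math.cos(2 * math.pi * k / 36),
           20 + 5 * math.sin(2 * math.pi * k / 36)) for k in range(36)]
square = [(-2, -2), (0, -2), (2, -2), (2, 0), (2, 2), (0, 2), (-2, 2), (-2, 0)]

circle_sig = centroid_distance_signature(circle)
square_sig = centroid_distance_signature(square)
```

A circle produces a constant signature, while a square oscillates between half its side length (at edge midpoints) and half its diagonal (at corners). The signature is translation-invariant by construction; dividing by its maximum gives scale invariance, and a rotation of the object appears as a circular shift in \(\theta\).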