You are given 16 distinct color images, some containing receipts, and others not. The goal is to accurately classify these images into two categories: "receipt" or "non-receipt".
Receipt Identification Workflow
Segmenting Text
original receipt image next to segmented version of the same image
The main feature of receipt images is dark text on a bright background. To leverage this feature, we will separate the text from the background. Separating an image into distinct parts is called segmenting an image.
A receipt image has a pattern of horizontal "stripes" of dark text separated by "stripes" of bright paper.
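One simple way to segment dark text from bright paper is a global threshold. The sketch below uses NumPy and assumes a grayscale image stored as a 2-D array. The function name `binarize` and the midpoint threshold rule are illustrative assumptions, not the method used to produce the figures here.

```python
import numpy as np

def binarize(gray, threshold=None):
    """Return a binary image: 1 for bright pixels (paper), 0 for dark pixels (text)."""
    gray = np.asarray(gray, dtype=float)
    if threshold is None:
        # Assumption: use the midpoint between the darkest and brightest pixel.
        threshold = (gray.min() + gray.max()) / 2.0
    return (gray > threshold).astype(np.uint8)

# Toy 4x6 "receipt": bright paper (200) with one stripe of dark text (30).
toy = np.full((4, 6), 200, dtype=np.uint8)
toy[1, 1:5] = 30

binary = binarize(toy)
print(binary)
```

In practice, a data-driven threshold (for example, one chosen from the image histogram) tends to be more robust than a fixed midpoint.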
Identifying Text Patterns
binarized receipt next to a plot of its row sum. dips in the row sum align with text in the receipt.
Inspecting the pattern of text in the binarized image, we find that the rows of pixels that contain text have more 0 values, and the rows between lines of text have more 1s. If you sum the values across each row, rows with text have smaller sums than rows without text.
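The row sum described above can be sketched in a few lines of NumPy. The array `binary` is a made-up example (1 for paper, 0 for text), and the rule used to flag text rows is an assumption chosen only for illustration.

```python
import numpy as np

def row_sum(binary):
    """Sum the pixel values across each row of a binary image."""
    return np.asarray(binary).sum(axis=1)

# Hypothetical binarized receipt: rows 1 and 3 contain text (0s).
binary = np.array([
    [1, 1, 1, 1, 1, 1],
    [1, 0, 0, 0, 0, 1],
    [1, 1, 1, 1, 1, 1],
    [1, 0, 0, 1, 0, 1],
    [1, 1, 1, 1, 1, 1],
], dtype=np.uint8)

sums = row_sum(binary)

# Assumption: any row whose sum dips below the maximum contains some text.
text_rows = np.flatnonzero(sums < sums.max())
print(sums, text_rows)
```

The alternating high and low values in `sums` are the "stripes" that distinguish a receipt from other images.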
Preprocessing and Postprocessing
Segmentation can be improved in two ways: by preprocessing the image before binarizing and by postprocessing the binary image itself.
Converting to grayscale
Color image of a receipt with produce in the background, next to the same image in grayscale.
Retain essential characteristics with little information lost.
When loaded into memory, a grayscale image occupies a third of the space required for an RGB image. Because it has a third of the data, it requires less computational power to process, which can reduce computation time.
Developing an image processing algorithm can be more straightforward when working with grayscale.
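As a rough illustration, a color image can be converted to grayscale with a weighted sum of its red, green, and blue channels. The sketch below assumes an 8-bit RGB array of shape (rows, columns, 3) and uses the common luminance weights 0.299, 0.587, and 0.114. The function name `to_grayscale` is hypothetical.

```python
import numpy as np

def to_grayscale(rgb):
    """Convert an 8-bit RGB image to 8-bit grayscale with luminance weights."""
    weights = np.array([0.299, 0.587, 0.114])
    gray = np.asarray(rgb, dtype=float) @ weights  # weighted sum over the channel axis
    return np.clip(np.rint(gray), 0, 255).astype(np.uint8)

# Tiny 2x2 example: one white pixel, one red pixel, two black pixels.
rgb = np.zeros((2, 2, 3), dtype=np.uint8)
rgb[0, 0] = [255, 255, 255]
rgb[0, 1] = [255, 0, 0]

gray = to_grayscale(rgb)

# The grayscale array holds one value per pixel instead of three.
print(gray.nbytes, rgb.nbytes)
```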
Adjusting the contrast
A grayscale image of a receipt next to a darker image of a receipt with low contrast
Expand the dynamic range to better differentiate text from the background.
Even though these two receipt images are grayscale, the contrast is different.
If you are analyzing a set of images, normalizing the brightness can be an important preprocessing step, especially for identifying the black-and-white patterns of text in receipt images.
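One basic way to normalize brightness is to stretch each image linearly so that its darkest pixel maps to 0 and its brightest to 255. This is a minimal sketch under that assumption. The function name `stretch_contrast` is hypothetical, and other contrast adjustments may suit a given image set better.

```python
import numpy as np

def stretch_contrast(gray):
    """Linearly rescale pixel intensities to span the full 0-255 range."""
    gray = np.asarray(gray, dtype=float)
    lo, hi = gray.min(), gray.max()
    if hi == lo:
        # A perfectly flat image has no contrast to stretch; return it unchanged.
        return gray.astype(np.uint8)
    return np.rint((gray - lo) / (hi - lo) * 255).astype(np.uint8)

# A dim, low-contrast patch whose values span only 60 to 110.
dim = np.array([[60, 80], [100, 110]], dtype=np.uint8)
stretched = stretch_contrast(dim)
print(stretched)
```

After stretching, images taken under different lighting share the same intensity range, which makes a single binarization rule easier to apply across the set.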
Noise Removal
Smooth pixel intensity values to reduce the impact of variation on binarization.
You can increase the light sensitivity of a digital camera sensor to improve the brightness of a picture taken in low light.
Many modern digital cameras automatically increase the ISO in dim light. However, this increase in sensitivity amplifies noise picked up by the sensor, leaving the image grainy.
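A simple way to smooth grainy pixel values is an averaging (mean) filter, which replaces each pixel with the average of its neighborhood. The sketch below is one possible implementation in plain NumPy. The function name `mean_filter`, the 3-by-3 default window, and the edge-replication padding are all assumptions made for illustration.

```python
import numpy as np

def mean_filter(gray, size=3):
    """Replace each pixel with the mean of its size-by-size neighborhood (size must be odd)."""
    gray = np.asarray(gray, dtype=float)
    pad = size // 2
    # Repeat the edge pixels so the output keeps the input's shape.
    padded = np.pad(gray, pad, mode="edge")
    rows, cols = gray.shape
    total = np.zeros_like(gray)
    for dy in range(size):
        for dx in range(size):
            total += padded[dy:dy + rows, dx:dx + cols]
    return total / (size * size)

# Flat gray patch with a single noisy pixel in the center.
noisy = np.full((5, 5), 100.0)
noisy[2, 2] = 190.0

smoothed = mean_filter(noisy)
print(smoothed[2, 2])
```

Averaging spreads the noisy value over its neighbors, so the outlier moves much closer to the surrounding intensity. A larger window smooths more but also blurs the text.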
Background Isolation and Subtraction
A morphological closing operation emphasizes the bright paper and removes the dark text.
Isolate and remove the background of an image before binarizing.
Receipt images that have a busy background are more difficult to classify because artifacts pollute the row sum. To mitigate this issue, you can isolate the background and then remove it by subtraction.
In a receipt image, the background is anything that is not text, so isolating the background can be interpreted as removing the text. One way to remove text from an image is to use morphological operations.
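To make the idea concrete, the sketch below estimates the background with a grayscale closing (a maximum filter followed by a minimum filter) and then subtracts the original image from it. The helper names, the square 3-by-3 structuring element, and the toy image are assumptions. The element only needs to be larger than the text strokes.

```python
import numpy as np

def _windows(img, size, pad_value):
    """Stack every shifted view of img covered by a size-by-size square window."""
    pad = size // 2
    padded = np.pad(img, pad, mode="constant", constant_values=pad_value)
    rows, cols = img.shape
    return np.stack([padded[dy:dy + rows, dx:dx + cols]
                     for dy in range(size) for dx in range(size)])

def close_gray(gray, size=3):
    """Grayscale closing: dilation (local max) followed by erosion (local min)."""
    gray = np.asarray(gray, dtype=float)
    dilated = _windows(gray, size, -np.inf).max(axis=0)
    return _windows(dilated, size, np.inf).min(axis=0)

# Toy image: uneven background (200 on the left, 120 on the right)
# crossed by a one-pixel-tall stroke of dark text (30).
gray = np.full((5, 7), 200.0)
gray[:, 4:] = 120.0
gray[2, 1:6] = 30.0

background = close_gray(gray)    # text removed, background kept
text_only = background - gray    # nonzero only where text was
print(text_only)
```

Note that after this subtraction the text is bright on a dark background, so the result would need to be inverted (or the threshold flipped) before applying a row sum that expects dark text.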
Binary Morphology
Emphasize particular patterns or shapes in a binary image.
Morphological operations are useful not only for removing features from an image but also for augmenting features. You can use morphology to enhance the text in the binary image and improve the row sum signal.
Morphological opening expands the dark text regions, while closing diminishes them. Increasing the size of the structuring element increases these effects.
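The effect of opening and closing on the row sum can be seen on a small example. This is a minimal sketch that assumes a binary image with 1 for paper and 0 for text and a square structuring element. The helper names are hypothetical.

```python
import numpy as np

def _windows(binary, size, pad_value):
    """Stack every shifted view of the image covered by a size-by-size square window."""
    pad = size // 2
    padded = np.pad(binary, pad, mode="constant", constant_values=pad_value)
    rows, cols = binary.shape
    return np.stack([padded[dy:dy + rows, dx:dx + cols]
                     for dy in range(size) for dx in range(size)])

def erode(binary, size=3):
    return _windows(binary, size, 1).min(axis=0)   # shrinks bright (1) regions

def dilate(binary, size=3):
    return _windows(binary, size, 0).max(axis=0)   # grows bright (1) regions

def opening(binary, size=3):
    return dilate(erode(binary, size), size)

def closing(binary, size=3):
    return erode(dilate(binary, size), size)

# One line of "text" (0s) with bright gaps between the characters.
binary = np.ones((5, 7), dtype=np.uint8)
binary[2] = [1, 0, 0, 1, 0, 0, 1]

opened = opening(binary)
closed = closing(binary)
print(binary.sum(axis=1), opened.sum(axis=1), closed.sum(axis=1))
```

Here opening fills the bright gaps in the text row, deepening the dip in the row sum, while closing erases the thin text entirely. Passing a larger `size` strengthens both effects.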