Document extraction from PDFs and images with OpenCV.
MIT License
grayscale contrast
enhance color contrast
enhance color contrast and black & white
The enhance_color_contrast function preprocesses an image by boosting its contrast and color saturation so that text stands out and bleed-through is less visible. After these enhancements and a mild denoising pass, the image is converted to grayscale and brightened to give a clearer, more readable result.
import cv2
import numpy as np
from PIL import Image, ImageEnhance

def enhance_color_contrast(uploaded_image):
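uploaded_image: the uploaded image file object; it must support .read() and return the encoded image bytes.
brightened_image: the returned PIL image, converted to grayscale with its brightness enhanced.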
Read the Image
file_bytes = np.asarray(bytearray(uploaded_image.read()), dtype=np.uint8)
image = cv2.imdecode(file_bytes, cv2.IMREAD_COLOR)
Check Image Validity
if image is None:
    raise ValueError("Error: Unable to read the image. Please upload a valid image file.")
Convert Image to PIL Format
pil_image = Image.fromarray(cv2.cvtColor(image, cv2.COLOR_BGR2RGB))
Enhance Contrast
contrast_enhancer = ImageEnhance.Contrast(pil_image)
pil_image = contrast_enhancer.enhance(1.5)
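To see what a contrast factor of 1.5 does, the sketch below applies ImageEnhance.Contrast to a tiny synthetic grayscale image (the pixel values are made up for illustration). Pillow blends the image with a uniform gray at its mean brightness, so a factor above 1.0 pushes pixel values away from that mean.

```python
import numpy as np
from PIL import Image, ImageEnhance

# Two gray levels, 100 and 150, around a mean of 125
pixels = np.array([[100, 150], [100, 150]], dtype=np.uint8)
image = Image.fromarray(pixels)  # 2-D uint8 array becomes a mode "L" image

enhanced = ImageEnhance.Contrast(image).enhance(1.5)
result = np.array(enhanced)

# The dark pixels get darker and the bright pixels get brighter
spread_before = int(pixels.max()) - int(pixels.min())
spread_after = int(result.max()) - int(result.min())
print(spread_before, spread_after)
```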
Enhance Color Saturation
color_enhancer = ImageEnhance.Color(pil_image)
pil_image = color_enhancer.enhance(1.5)
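The saturation factor behaves similarly: ImageEnhance.Color blends the image with its own grayscale version, so 0.0 gives a gray image, 1.0 leaves it unchanged, and 1.5 exaggerates the differences between channels. A small sketch on a single made-up pixel:

```python
import numpy as np
from PIL import Image, ImageEnhance

# A single muted red pixel in RGB order (illustrative value)
pixel = np.array([[[200, 100, 100]]], dtype=np.uint8)
image = Image.fromarray(pixel)  # 3-channel uint8 array becomes mode "RGB"

color = ImageEnhance.Color(image)
gray = np.array(color.enhance(0.0))[0, 0]   # fully desaturated
same = np.array(color.enhance(1.0))[0, 0]   # unchanged
vivid = np.array(color.enhance(1.5))[0, 0]  # the factor used in the function

print(gray, same, vivid)
```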
Convert Back to OpenCV Format
enhanced_image = cv2.cvtColor(np.array(pil_image), cv2.COLOR_RGB2BGR)
Optional: Apply Mild Denoising
denoised_image = cv2.fastNlMeansDenoisingColored(enhanced_image, None, 10, 10, 7, 21)
Convert Back to PIL Format
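# OpenCV arrays are in BGR order; convert to RGB so PIL reads the channels correctly
final_pil_image = Image.fromarray(cv2.cvtColor(denoised_image, cv2.COLOR_BGR2RGB))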
Convert to Grayscale
grayscale_image = final_pil_image.convert('L')
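Channel order matters at this point, because PIL's 'L' conversion weights red, green and blue differently and assumes the array is in RGB order, while OpenCV stores pixels as BGR. The following sketch, using a single made-up pixel, shows how the grayscale value changes when a BGR array is handed to PIL without swapping the channels.

```python
import numpy as np
from PIL import Image

# One pure-red pixel as OpenCV would store it: (B, G, R) = (0, 0, 255)
bgr = np.array([[[0, 0, 255]]], dtype=np.uint8)

# Reversing the last axis swaps BGR to RGB, the same effect as
# cv2.cvtColor(bgr, cv2.COLOR_BGR2RGB)
rgb = np.ascontiguousarray(bgr[..., ::-1])

# PIL interprets the first channel as red in both cases
gray_wrong = np.array(Image.fromarray(bgr).convert("L"))[0, 0]  # red read as blue
gray_right = np.array(Image.fromarray(rgb).convert("L"))[0, 0]  # red read as red

print(gray_wrong, gray_right)
```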
Adjust Brightness
brightness_enhancer = ImageEnhance.Brightness(grayscale_image)
brightened_image = brightness_enhancer.enhance(1.2)
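A brightness factor of 1.2 scales each pixel value by roughly 1.2 and clips the result at 255, so already-bright paper turns pure white while black stays black. A minimal sketch on three illustrative gray levels:

```python
import numpy as np
from PIL import Image, ImageEnhance

# Black, mid-gray and near-white pixels (illustrative values)
pixels = np.array([[0, 100, 250]], dtype=np.uint8)
image = Image.fromarray(pixels)  # mode "L"

brightened = np.array(ImageEnhance.Brightness(image).enhance(1.2))

# Black is unchanged, mid-gray rises, near-white saturates at 255
print(brightened)
```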
Return the Final Image
return brightened_image
# Assuming 'uploaded_file' is an image file object obtained from a file upload
processed_image = enhance_color_contrast(uploaded_file)
processed_image.show() # Displays the processed grayscale image with enhanced brightness
The enhancement factors are 1.5 for contrast and saturation, and 1.2 for brightness.
This function is particularly useful for preprocessing scanned documents: it makes text clearer and reduces the visibility of bleed-through from the reverse side of the page.
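Since different scans may need different settings, one option is to expose the factors as parameters. The sketch below is a hypothetical variant, not the function described above: enhance_pil and its parameter names are assumptions, it takes a PIL image directly, and it omits the denoising step so that it depends only on Pillow and NumPy.

```python
import numpy as np
from PIL import Image, ImageEnhance


def enhance_pil(image, contrast=1.5, saturation=1.5, brightness=1.2):
    """Hypothetical variant: same PIL steps, tunable factors, no denoising."""
    image = ImageEnhance.Contrast(image).enhance(contrast)
    image = ImageEnhance.Color(image).enhance(saturation)
    gray = image.convert("L")
    return ImageEnhance.Brightness(gray).enhance(brightness)


# Synthetic uniform RGB image standing in for a scanned page
page = Image.fromarray(np.full((8, 8, 3), 180, dtype=np.uint8))

default_result = enhance_pil(page)
unchanged = enhance_pil(page, contrast=1.0, saturation=1.0, brightness=1.0)

print(default_result.mode, default_result.size)
```

With all factors set to 1.0 the enhancement steps leave the pixels untouched, which makes it easy to compare settings against a neutral baseline.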