⚡ Extracting the Machine Readable Zone (MRZ) from passport and other document images
AGPL-3.0 License
This repository extracts the Machine Readable Zone (MRZ) from document images. The MRZ typically contains important information such as the document holder's name, nationality, document number, date of birth, etc.
Features:
Install the Tesseract OCR engine and add its executable to the PATH variable.
Install fastmrz
pip install fastmrz
If you prefer conda, you can create an environment that includes Tesseract:
conda create -n fastmrz tesseract -c conda-forge
conda activate fastmrz
Copy the mrz.traineddata file from the tessdata folder of this repository to the tessdata folder of your Tesseract installation.
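Before running fastmrz, it may help to confirm that the Tesseract executable is actually reachable. The snippet below is only a sketch: it relies on the standard-library `shutil.which`, and the helper name `find_tesseract` is hypothetical, not part of fastmrz.

```python
import shutil
from typing import Optional


def find_tesseract(executable: str = "tesseract") -> Optional[str]:
    """Return the full path of the Tesseract executable if it is on PATH, else None."""
    return shutil.which(executable)


if __name__ == "__main__":
    path = find_tesseract()
    if path is None:
        # Not on PATH: the tesseract_path argument of FastMRZ can be used instead.
        print("Tesseract was not found on PATH.")
    else:
        print(f"Tesseract found at: {path}")
```

If the helper returns None, passing the executable location through the tesseract_path argument is the alternative.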
from fastmrz import FastMRZ
import json
fast_mrz = FastMRZ()
# Pass the path of the installed Tesseract OCR executable if it is not on the PATH variable
# fast_mrz = FastMRZ(tesseract_path=r'/opt/homebrew/Cellar/tesseract/5.3.4_1/bin/tesseract') # Default path on Mac
# fast_mrz = FastMRZ(tesseract_path=r'C:\Program Files\Tesseract-OCR\tesseract.exe') # Default path on Windows
passport_mrz = fast_mrz.get_mrz("../data/passport_uk.jpg")
print("JSON:")
print(json.dumps(passport_mrz, indent=4))
print("\n")
passport_mrz = fast_mrz.get_mrz("../data/passport_uk.jpg", raw=True)
print("TEXT:")
print(passport_mrz)
OUTPUT:
JSON:
{
    "mrz_type": "TD3",
    "document_type": "P",
    "country_code": "GBR",
    "surname": "PUDARSAN",
    "given_name": "HENERT",
    "document_number": "707797979",
    "nationality": "GBR",
    "date_of_birth": "1995-05-20",
    "sex": "M",
    "date_of_expiry": "2017-04-22",
    "status": "SUCCESS"
}
TEXT:
P<GBRPUDARSAN<<HENERT<<<<<<<<<<<<<<<<<<<<<<<
7077979792GBR9505209M1704224<<<<<<<<<<<<<<00
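Since the parsed result carries a status field, callers may want to check it before trusting the other fields. The helper below is a hypothetical sketch and not part of fastmrz. It assumes that any status other than "SUCCESS" means the MRZ could not be read, which the example output does not confirm.

```python
from typing import Optional


def holder_summary(mrz: dict) -> Optional[str]:
    """Build a one-line summary from a parsed MRZ dict, or None if parsing failed."""
    # Assumption: only "SUCCESS" indicates a usable result.
    if mrz.get("status") != "SUCCESS":
        return None
    return (
        f"{mrz['given_name']} {mrz['surname']} "
        f"({mrz['nationality']}), document {mrz['document_number']}, "
        f"expires {mrz['date_of_expiry']}"
    )


# Dictionary copied from the JSON output shown in this README.
sample = {
    "mrz_type": "TD3",
    "document_type": "P",
    "country_code": "GBR",
    "surname": "PUDARSAN",
    "given_name": "HENERT",
    "document_number": "707797979",
    "nationality": "GBR",
    "date_of_birth": "1995-05-20",
    "sex": "M",
    "date_of_expiry": "2017-04-22",
    "status": "SUCCESS",
}

print(holder_summary(sample))
```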
The MRZ format is strictly regulated and has to comply with Doc 9303, Machine Readable Travel Documents, published by the International Civil Aviation Organization (ICAO).
There are currently several types of ICAO-standard machine-readable zones, which vary in the number of lines and the number of characters per line.
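Doc 9303 defines TD1 (3 lines of 30 characters), TD2 (2 lines of 36) and TD3 (2 lines of 44), plus the visa formats MRV-A (2 lines of 44) and MRV-B (2 lines of 36). A minimal sketch of telling them apart by shape could look like the following. The function name `classify_mrz` is hypothetical, and this is not necessarily how fastmrz determines its mrz_type field.

```python
from typing import List, Optional

# (number of lines, characters per line) -> format name for non-visa documents
DOCUMENT_SHAPES = {
    (3, 30): "TD1",
    (2, 36): "TD2",
    (2, 44): "TD3",
}

# Visas share their shapes with TD3 and TD2 but start with the letter "V".
VISA_SHAPES = {
    (2, 44): "MRV-A",
    (2, 36): "MRV-B",
}


def classify_mrz(lines: List[str]) -> Optional[str]:
    """Guess the MRZ format from line count and line length, or return None."""
    # All lines of a valid MRZ have the same length.
    if not lines or len({len(line) for line in lines}) != 1:
        return None
    shape = (len(lines), len(lines[0]))
    if lines[0].startswith("V"):
        return VISA_SHAPES.get(shape)
    return DOCUMENT_SHAPES.get(shape)


if __name__ == "__main__":
    passport = ["P<GBRPUDARSAN<<HENERT".ljust(44, "<"), "7077979792GBR9505209M1704224" + "<" * 14 + "00"]
    print(classify_mrz(passport))
```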
Now, based on the example of a national passport, let us take a closer look at the MRZ composition.
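For a TD3 passport, the first line holds the document type, issuing country and names, while the second line holds the document number, nationality, date of birth, sex and expiry date, several of them followed by a check digit. The sketch below slices those fixed positions and verifies the check digits with the Doc 9303 weighting (7, 3, 1 repeated, with letters valued A=10 to Z=35 and the filler `<` valued 0). The names `parse_td3` and `check_digit` are hypothetical and this is not fastmrz's own implementation. Dates are left in their raw YYMMDD form because the MRZ does not encode the century.

```python
def char_value(ch: str) -> int:
    """Numeric value of one MRZ character for check-digit purposes."""
    if "0" <= ch <= "9":
        return int(ch)
    if ch == "<":
        return 0
    if "A" <= ch <= "Z":
        return ord(ch) - ord("A") + 10
    raise ValueError(f"invalid MRZ character: {ch!r}")


def check_digit(field: str) -> int:
    """Weighted sum of the field's characters (weights 7, 3, 1 repeating) modulo 10."""
    weights = (7, 3, 1)
    return sum(char_value(c) * weights[i % 3] for i, c in enumerate(field)) % 10


def parse_td3(line1: str, line2: str) -> dict:
    """Slice the fixed-position fields of a two-line, 44-character TD3 MRZ."""
    if len(line1) != 44 or len(line2) != 44:
        raise ValueError("TD3 lines must be exactly 44 characters long")
    # Surname and given names are separated by a double filler.
    surname, _, given = line1[5:].partition("<<")
    # Each checked field is paired with the check character that follows it.
    checked = [(line2[0:9], line2[9]), (line2[13:19], line2[19]), (line2[21:27], line2[27])]
    return {
        "document_type": line1[0:2].replace("<", ""),
        "country_code": line1[2:5],
        "surname": surname.replace("<", " ").strip(),
        "given_name": given.replace("<", " ").strip(),
        "document_number": line2[0:9].replace("<", ""),
        "nationality": line2[10:13],
        "date_of_birth": line2[13:19],  # raw YYMMDD
        "sex": line2[20],
        "date_of_expiry": line2[21:27],  # raw YYMMDD
        "checks_ok": all(str(check_digit(f)) == c for f, c in checked),
    }


if __name__ == "__main__":
    # The sample MRZ from this README, padded to 44 characters per line.
    l1 = "P<GBRPUDARSAN<<HENERT".ljust(44, "<")
    l2 = "7077979792GBR9505209M1704224" + "<" * 14 + "00"
    print(parse_td3(l1, l2))
```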
Distributed under the AGPL-3.0 License. See LICENSE for more information.
Give a ⭐️ if this project helped you!