
Python image text recognition

高洛峰 (original), 2016-10-19 17:09:29

Recently I wondered whether there is a tool for recognizing text in images. OCR came to mind, for example Hanwang OCR, a relatively powerful product in China. Could the same thing be done with Python? I searched for Python material on the topic and found a fun module called PyTesser, which I am sharing here for discussion:

PyTesser is an optical character recognition module for Python. It is used in conjunction with the Tesseract OCR engine to extract and output a string from a picture or image file.

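To use PyTesser, you do not need to install the Tesseract OCR engine separately, because the download includes a Windows executable, but you must first install the PIL module (Python Imaging Library, Python's image-processing library). On other operating systems you will likely need to install Tesseract yourself.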

Official introduction:

PyTesser is an Optical Character Recognition module for Python. It takes as input an image or image file and outputs a string.

PyTesser uses the Tesseract OCR engine, converting images to an accepted format and calling the Tesseract executable as an external script. A Windows executable is provided along with the Python scripts. The scripts should work in other operating systems as well.
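The approach described above, calling the Tesseract executable as an external program and reading back the text file it writes, can be sketched in a few lines. This is a minimal sketch and not PyTesser's actual code: the function names `build_tesseract_command` and `image_file_to_text` are hypothetical, the sketch assumes a `tesseract` executable is reachable on the PATH, and it skips the image-conversion step that PyTesser performs with PIL.

```python
import os
import subprocess
import tempfile


def build_tesseract_command(image_path, output_base, tesseract_exe="tesseract"):
    """Return the argument list for: tesseract <image> <output_base>."""
    # Tesseract appends ".txt" to output_base on its own.
    return [tesseract_exe, image_path, output_base]


def image_file_to_text(image_path, tesseract_exe="tesseract"):
    """Run Tesseract on an image file and return the recognized text.

    Hypothetical helper for illustration; assumes Tesseract is installed.
    """
    with tempfile.TemporaryDirectory() as scratch_dir:
        output_base = os.path.join(scratch_dir, "ocr_result")
        command = build_tesseract_command(image_path, output_base, tesseract_exe)
        # Raises CalledProcessError if Tesseract exits with a non-zero status.
        subprocess.run(command, check=True, capture_output=True)
        # Read back the text file that Tesseract wrote.
        with open(output_base + ".txt", encoding="utf-8") as result_file:
            return result_file.read()
```

With Tesseract installed and an image on disk (the file name `sample.png` here is only a placeholder), `print(image_file_to_text("sample.png"))` would print whatever text the engine recognizes.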

PyTesser official download address: http://code.google.com/p/pytesser/downloads/list

PIL library resource address: http://www.pythonware.com/products/pil/

However, in my tests it recognized English text well but could not recognize Chinese text.
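One thing that may be worth trying for Chinese: the Tesseract command line selects its recognition language with the `-l` option, and `chi_sim` is the name of its Simplified Chinese language data. The sketch below only builds such a command. It assumes a separately installed Tesseract that has the `chi_sim` data available, it has not been verified against the engine bundled with PyTesser, and the function name `tesseract_command_with_language` is hypothetical.

```python
def tesseract_command_with_language(image_path, output_base, lang="chi_sim"):
    """Build a Tesseract command line that requests a specific language.

    Hypothetical helper; assumes the language data for `lang` is installed.
    """
    # "-l" is Tesseract's language option.
    return ["tesseract", image_path, output_base, "-l", lang]
```

The resulting list can be passed to `subprocess.run` in the same way as any other external command.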

Interested readers can give it a try.

