Pytesseract Error Windows Error [error 2]
Solution 1:
I had the same trouble and quickly found the solution after reading this post:
OSError: [Errno 2] No such file or directory using pytesser
You just need to adapt it for Windows. Replace the following line:
tesseract_cmd = 'tesseract'
with:
tesseract_cmd = 'C:\\Program Files (x86)\\Tesseract-OCR\\tesseract'
(You need a double \\ because a single \ starts an escape sequence in the string.)
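To illustrate the escaping point, here is a small sketch (not from the original answer) comparing three ways of writing the same Windows path in Python. The variable names are just for illustration, and the install directory is the one assumed above; yours may differ.

```python
from pathlib import PureWindowsPath

# Doubled backslashes: each \\ in the source is one backslash in the string.
escaped = 'C:\\Program Files (x86)\\Tesseract-OCR\\tesseract'
# Raw string: backslashes are taken literally, so no doubling is needed.
raw = r'C:\Program Files (x86)\Tesseract-OCR\tesseract'
# Forward slashes: Windows accepts these as path separators too.
forward = 'C:/Program Files (x86)/Tesseract-OCR/tesseract'

# The first two spellings produce the exact same string.
print(escaped == raw)
# The forward-slash form names the same location once parsed as a Windows path.
print(PureWindowsPath(forward) == PureWindowsPath(raw))

# Without escaping, "\t" is silently read as a tab character.
unescaped = 'Tesseract-OCR\tesseract'
print('\t' in unescaped)
```

All three checks should print True, which is why an unescaped path can end up pointing at a file that does not exist.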
Solution 2:
You're getting the exception because subprocess isn't able to find the binary (the tesseract executable).
The installation is a 3-step process:
1. Download/install the system-level libs/binaries:
Installation help is available for the various operating systems. On macOS you can install it directly using brew.
Install Google Tesseract OCR (additional info how to install the engine on Linux, Mac OSX and Windows). You must be able to invoke the tesseract command as tesseract. If this isn’t the case, for example because tesseract isn’t in your PATH, you will have to change the “tesseract_cmd” variable at the top of tesseract.py. Under Debian/Ubuntu you can use the package tesseract-ocr. Mac OS users, please install the homebrew package tesseract.
For Windows:
An installer for the old version 3.02 is available for Windows from our download page. This includes the English training data. If you want to use another language, download the appropriate training data, unpack it using 7-zip, and copy the .traineddata file into the 'tessdata' directory, probably C:\Program Files\Tesseract-OCR\tessdata.
To access tesseract-OCR from any location you may have to add the directory where the tesseract-OCR binaries are located to the Path variables, probably C:\Program Files\Tesseract-OCR.
You can download the .exe from here.
2. Install the Python package:
pip install pytesseract
3. Finally, you need to have the tesseract binary in your PATH.
Or, you can set it at run-time:
import pytesseract
pytesseract.pytesseract.tesseract_cmd = '<path-to-tesseract-bin>'
For Windows:
pytesseract.pytesseract.tesseract_cmd = 'C:/Program Files (x86)/Tesseract-OCR/tesseract'
The line above only fixes it for the script that sets it. For a permanent solution, add the directory containing tesseract.exe to the PATH, for example:
PATH=%PATH%;"C:\Program Files (x86)\Tesseract-OCR"
Besides that, make sure the TESSDATA_PREFIX Windows environment variable is set to the directory that contains the tessdata directory. For example:
TESSDATA_PREFIX=C:\Program Files (x86)\Tesseract-OCR
i.e. the tessdata location is: C:\Program Files (x86)\Tesseract-OCR\tessdata
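A rough sketch of how a script could look for the executable itself instead of hard-coding a single path: it checks PATH first and then a list of fallback directories. The helper name find_executable and the fallback directories are assumptions made for illustration; only shutil.which comes from the standard library.

```python
import shutil

def find_executable(name, fallback_dirs=()):
    """Return the full path of `name`, or None if it cannot be found."""
    # Look on PATH first, which is where a subprocess launch would search.
    found = shutil.which(name)
    if found:
        return found
    # Then try each fallback install directory in turn.
    for directory in fallback_dirs:
        found = shutil.which(name, path=directory)
        if found:
            return found
    return None

# Hypothetical install locations; adjust to wherever Tesseract actually lives.
FALLBACK_DIRS = [
    r'C:\Program Files (x86)\Tesseract-OCR',
    r'C:\Program Files\Tesseract-OCR',
]

tesseract_path = find_executable('tesseract', FALLBACK_DIRS)
if tesseract_path is None:
    print('tesseract not found; install it or fix PATH')
else:
    print('using', tesseract_path)
```

If a path comes back, it can be assigned to pytesseract.pytesseract.tesseract_cmd in the same way as the run-time setting in step 3.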
Your example:
from PIL import Image
import pytesseract
pytesseract.pytesseract.tesseract_cmd = 'C:/Program Files (x86)/Tesseract-OCR/tesseract'
print(pytesseract.image_to_string(Image.open(r'D:\new_folder\img.png')))
Solution 3:
You need the Tesseract OCR engine ("tesseract.exe") installed on your machine. If its path is not configured on your machine, provide the complete path in pytesseract.py (tesseract.py).
Install Google Tesseract OCR (additional info how to install the engine on Linux, Mac OSX and Windows). You must be able to invoke the tesseract command as tesseract. If this isn't the case, for example because tesseract isn't in your PATH, you will have to change the "tesseract_cmd" variable at the top of tesseract.py. Under Debian/Ubuntu you can use the package tesseract-ocr. Mac OS users, please install the homebrew package tesseract.
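One way to check the "must be able to invoke the tesseract command as tesseract" condition from a script is to try running it and catch the failure. This is only a sketch: the function name tesseract_version is made up for illustration, and it assumes the installed build accepts the --version flag, which may not hold for very old releases.

```python
import subprocess

def tesseract_version(cmd='tesseract'):
    """Return the first line printed by `cmd --version`, or None if it cannot be run."""
    try:
        result = subprocess.run(
            [cmd, '--version'],
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,  # some releases print the version to stderr
            universal_newlines=True,
        )
    except OSError:
        # Raised when the executable is missing or not runnable;
        # "[Error 2]" is the file-not-found case of this error.
        return None
    lines = result.stdout.splitlines()
    return lines[0] if lines else ''

version = tesseract_version()
if version is None:
    print('tesseract could not be started; check PATH or tesseract_cmd')
else:
    print('found:', version)
```

Passing a full path as cmd (for example the value you would give tesseract_cmd) lets you test a specific install before pointing pytesseract at it.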
Solution 4:
I also faced the same problem with pytesseract. I would suggest working in a Linux environment to avoid such errors. Run the following commands on Linux:
pip install pytesseract
sudo apt-get update
sudo apt-get install tesseract-ocr
Hope this does the job.