
sudo apt install tesseract-ocr

Or, with the development library and the English language data named explicitly (although I don't think this is necessary):

sudo apt install tesseract-ocr libtesseract-dev tesseract-ocr-eng

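Example: name your image existingimage.png, open a terminal in that folder, and run the two commands below. Tesseract adds .txt to the output name, so the text ends up in output_from_ocr.txt.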

tesseract -l eng existingimage.png output_from_ocr

cat output_from_ocr.txt

(Here -l specifies the language. To see the language codes, run man tesseract; to see which languages are installed, run tesseract --list-langs.)
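If you want to check for a language before running the OCR, something like this rough sketch might help. The function names has_lang and ocr_to_text are made up for this example; the only Tesseract options used are -l and --list-langs.

```shell
#!/usr/bin/env bash
# Sketch: check that a language is installed before running OCR.
# has_lang and ocr_to_text are hypothetical helper names, not part of Tesseract.

# has_lang LANG: succeeds if LANG is one of the lines printed by --list-langs.
has_lang() {
    # -x matches whole lines and -F treats the code as a fixed string,
    # so the header line of the listing never matches.
    tesseract --list-langs 2>&1 | grep -qxF -- "$1"
}

# ocr_to_text IMAGE [LANG]: run OCR and print the name of the text file.
# The output base is the image name without its extension; Tesseract adds .txt.
ocr_to_text() {
    local image="$1"
    local lang="${2:-eng}"
    local base="${image%.*}"
    if ! has_lang "$lang"; then
        echo "language '$lang' is not installed" >&2
        return 1
    fi
    tesseract -l "$lang" "$image" "$base" && echo "$base.txt"
}
```

After sourcing it, ocr_to_text existingimage.png eng should leave the text in existingimage.txt, assuming the eng data is installed.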

OCR stands for Optical Character Recognition.

To convert the image to a PDF instead of a text file, add pdf at the end:

tesseract -l eng input_for_ocr.png output_from_ocr pdf
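For a folder full of images, the same command can go in a loop. This is only a sketch: images_to_pdf is a made-up name, and it assumes the images are .png files in the current folder.

```shell
#!/usr/bin/env bash
# Sketch: convert every .png in the current folder to a PDF with Tesseract.
# images_to_pdf is a hypothetical helper name.

# images_to_pdf [LANG]: for each ./name.png, create ./name.pdf
images_to_pdf() {
    local lang="${1:-eng}"
    local image
    for image in ./*.png; do
        # If there are no .png files the pattern stays unexpanded, so skip it.
        [ -e "$image" ] || continue
        # The output base is the image name without .png; Tesseract adds .pdf.
        tesseract -l "$lang" "$image" "${image%.*}" pdf \
            || echo "failed: $image" >&2
    done
}
```

Run images_to_pdf for English, or images_to_pdf spa for another installed language.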

If you get the error 'Tesseract couldn't load any languages!', see https://github.com/tesseract-ocr/tesseract/issues/1309

Spanish: download spa.traineddata from here https://github.com/tesseract-ocr/tessdata/blob/c2b2e0df86272ce11be323f23f96cf656565ed41/spa.traineddata

Put it in /usr/share/tesseract-ocr/4.00/tessdata/, the folder that already contains eng.traineddata. (You will have to open that folder as root.)

Or maybe you can just use sudo apt-get install tesseract-ocr-spa (although it might not put the file in the location you want).
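To see where a language file actually ended up, you can search for it. A small sketch, where find_traineddata is a made-up name and /usr/share as the default place to look is my assumption:

```shell
#!/usr/bin/env bash
# Sketch: locate a Tesseract language file on disk.
# find_traineddata is a hypothetical helper name.

# find_traineddata LANG [ROOT]: print every LANG.traineddata found under ROOT.
# ROOT defaults to /usr/share (an assumption; your files may live elsewhere).
find_traineddata() {
    find "${2:-/usr/share}" -name "$1.traineddata" 2>/dev/null
}
```

For example, find_traineddata spa prints the path if the Spanish data is somewhere under /usr/share. For the apt package, dpkg -L tesseract-ocr-spa should also list the files it installed.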

NOTE: After you install a language (or even if you don't), you might overwrite the same output file and see an error message, but it still works.


Convert image to text (tesseract-ocr)
