This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License. Please attribute this work to the NYU Libraries Scholarly Communications and Information Policy Department.
Tesseract is an open-source optical character recognition (OCR) engine. OCR extracts text from images and from documents that lack a text layer, then writes the result to a new file: plain text, a searchable PDF, or one of several other common formats. Tesseract is highly customizable and supports most languages, including multilingual documents and vertical text. Although the software also runs on Windows and Linux, this guide covers macOS, where Tesseract is run from the Terminal application.
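As a rough illustration of what running Tesseract from the terminal looks like, the sketch below wraps a single OCR call in a small helper. The function name `run_ocr` and the file names `scan.png` and `scan_out` are hypothetical placeholders, and the sketch assumes Tesseract and its English language data are already installed (for example through a package manager such as Homebrew). If Tesseract or the input file is missing, the helper only prints the command it would have run.

```shell
# Hypothetical helper: OCR one image into a searchable PDF.
#   $1 = input image, $2 = output base name (no extension), $3 = language code(s)
run_ocr() {
  if command -v tesseract >/dev/null 2>&1 && [ -f "$1" ]; then
    # -l selects the language model; the trailing "pdf" requests searchable PDF output
    tesseract "$1" "$2" -l "$3" pdf
  else
    # Tesseract or the input file is not available: show the command instead
    echo "would run: tesseract $1 $2 -l $3 pdf"
  fi
}

# Placeholder file names; replace them with your own scan and output name.
run_ocr scan.png scan_out eng
```

Multiple languages can be combined with a plus sign, for example `eng+fra`, and leaving off the trailing `pdf` makes Tesseract write a plain text file instead.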