ABBYY FineReader Tutorial

A guide to using ABBYY FineReader for text extraction from documents using OCR (Optical Character Recognition).

Creating and Training a User Pattern

In training mode, you create a user pattern that FineReader can then apply when performing OCR on the entire text. A user pattern is useful when parts of the text are unclear, when the fonts differ from those ABBYY recognizes by default, or when the text contains special characters.

Note: "Pattern training is not supported for Asian languages" as per the ABBYY website.

Step 1. Entering the Training Menu

To access options for training, navigate from the topmost main menu.

  1. Click Tools > Options and click on the OCR tab.
  2. Under the Use of patterns and training in OCR Editor section, choose the Use training to recognize new characters and ligatures option.
  3. Then click the Pattern Editor button.
  4. In the Pattern Editor dialog box select the NEW button to name your user pattern.
  5. Click OK in the Create Pattern dialog box, then in the Pattern Editor dialog box, and finally in the Options dialog box to return to the OCR Editor.

Here is a screenshot of the OCR Options Tab:

Screen capture of the ABBYY main Options menu, showing how to select the OCR tab and, from there, configure options for user training on new fonts and language patterns.

Note: If you select the Also use built-in patterns option under the Use training to recognize new characters and ligatures option, ABBYY will use its built-in patterns along with the user pattern you created, which reduces the time you will need to spend on training.

Step 2. Initiate the Training Process

  1. In the toolbar above the image pane, select Recognize Page (the icon is a white sheet with a red letter A in a magnifying glass).
  2. During the recognition process, the Pattern Training dialog will open and ask you to input a character that matches the one shown in the dialog box.

Here is a screenshot of the Pattern Training pop-up dialog box:

Screen capture of ABBYY popup window showing the training box. The trainer displays a single character and its bounding box, and asks the user to type in the computer character for the original character highlighted.

Adjust the bounding area as needed, and select effects if you wish to introduce such text features into your output. Once the boundary is set and you have entered the correct corresponding character or letter, select Train and proceed to the next character.

Note: You don't need to train on the entire document, but you will need to keep going until you have provided enough exemplars of each character or letter in your document. OCR makers often cite 15 to 25 instances per character.
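
One way to judge whether you have trained enough is to keep a plain-text record of the characters you have typed into the Pattern Training dialog and count them. The Python sketch below is only an illustration of that bookkeeping. It is not part of FineReader, the function name undertrained_characters is hypothetical, and it assumes you maintain the record yourself. It uses the lower end of the 15 to 25 range as its threshold.

```python
from collections import Counter

# Lower end of the 15-25 exemplars-per-character range mentioned above.
MIN_EXEMPLARS = 15


def undertrained_characters(trained_text, minimum=MIN_EXEMPLARS):
    """Return {character: count} for characters trained fewer than `minimum` times.

    `trained_text` is assumed to be a string you kept yourself, containing
    every character you entered in the Pattern Training dialog.
    Whitespace is ignored.
    """
    counts = Counter(ch for ch in trained_text if not ch.isspace())
    return {ch: n for ch, n in sorted(counts.items()) if n < minimum}


if __name__ == "__main__":
    # Hypothetical record: 20 x "a", 15 x "b", and only 3 x long s.
    record = "a" * 20 + "b" * 15 + "ſ" * 3
    print(undertrained_characters(record))
```

A character that never appears in your record will not be reported, so compare the result against the full set of characters you expect in the document.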