FIRe-text is not OCR software. It was not developed to perform layout analysis or extract text from digital images. It is a utility for manually editing text that has already been processed by OCR software. As many of you have probably discovered, OCR is not an exact science: the extracted text frequently contains formatting and spelling errors that need to be corrected by hand. After processing an image with OCR software, I used to work with two windows open, one displaying the image and the other displaying the text. This made it incredibly difficult to keep track of which line I was working on, since my eyes were constantly switching from one window to the other. I knew there had to be a better way to edit these text files, so I developed this extension.

The FIRe-text (pronounced “fire text”) extension was developed to make the process of converting text in digital images into plain text somewhat simpler. It does so by focusing on one line of text at a time. One of the main difficulties of editing an e-text is that you often lose track of which line you're on in the image. To help you keep your place, the extension displays a floating textbox over the image, directly below the selected line of text. The textbox can also be manually positioned anywhere over the image.

While the goal of this project was to eventually use the extension to transform entire books into e-texts, it is not limited to that purpose. Even if you only have a single image and text file to edit, you can still use this extension for the task. However, if you do have multiple images and text files, the extension will walk you through them one file at a time, so you can simply click a next button when the current page is completed. For example, if you have scanned a number of pages from a book and numbered them sequentially (001.png, 002.png, 003.png) and already processed them using OCR (001.txt, 002.txt, 003.txt), then you only need to load the folder containing those files into the extension. It will take care of the rest, letting you edit them easily.
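
The naming convention described above (each image sharing a base name with its OCR text file) can be illustrated with a short Python sketch. This is not the extension's own code. It only shows one way such a folder could be paired up and ordered, and the function name `pair_pages` and the list of image suffixes are assumptions made for the example.

```python
from pathlib import Path

# Assumed set of image types; the extension's actual supported formats may differ.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".gif"}


def pair_pages(folder):
    """Return (image, text) path pairs that share a base name, in sorted order.

    Hypothetical helper: files without a matching partner are skipped.
    """
    folder = Path(folder)
    images = {p.stem: p for p in folder.iterdir()
              if p.suffix.lower() in IMAGE_SUFFIXES}
    texts = {p.stem: p for p in folder.iterdir()
             if p.suffix.lower() == ".txt"}
    # Keep only base names that have both an image and a text file.
    shared = sorted(images.keys() & texts.keys())
    return [(images[stem], texts[stem]) for stem in shared]
```

Zero-padded names such as 001, 002, 003 sort correctly as plain strings, which is why sequential numbering with leading zeros is a convenient way to keep pages in order.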

FIRe-text is a multipurpose utility. Its uses range from editing OCR'd text to creating e-texts from scratch (simply create and save a blank text file, then load the folder containing that file into the extension). The extension can even be used as an image viewer: load a folder containing a series of images and it will display them one at a time in the browser.

For more information and detailed instructions on how to use this extension, please see the User Guide, which is accessible from the toolbar (Alt + Ctrl + E). The extension also includes a detailed guide on how to process images using free and open source OCR software.

This page is part of the LegacyCollector website.