An approach to a comprehensive test framework for analysis and evaluation of text line segmentation algorithms

Sensors (Basel). 2011;11(9):8782-812. doi: 10.3390/s110908782. Epub 2011 Sep 13.

Abstract

The paper introduces a testing framework for the evaluation and validation of text line segmentation algorithms. Text line segmentation represents the key action for correct optical character recognition. Many of the tests for the evaluation of text line segmentation algorithms deal with text databases as reference templates. Because of the mismatch, the reliable testing framework is required. Hence, a new approach to a comprehensive experimental framework for the evaluation of text line segmentation algorithms is proposed. It consists of synthetic multi-like text samples and real handwritten text as well. Although the tests are mutually independent, the results are cross-linked. The proposed method can be used for different types of scripts and languages. Furthermore, two different procedures for the evaluation of algorithm efficiency based on the obtained error type classification are proposed. The first is based on the segmentation line error description, while the second one incorporates well-known signal detection theory. Each of them has different capabilities and convenience, but they can be used as supplements to make the evaluation process efficient. Overall the proposed procedure based on the segmentation line error description has some advantages, characterized by five measures that describe measurement procedures.

Keywords: algorithms; document image processing; experiments framework; signal detection theory; testing; text line segmentation.

Publication types

  • Evaluation Study

MeSH terms

  • Algorithms*