temporarily removed abstract

sim
2024-02-06 21:40:02 +01:00
parent 267bb6aee8
commit c96fd9117e
@@ -1,12 +1,4 @@
\chapter{Abstract}
\begin{english}
Optical text recognition is becoming increasingly important and is used in many industries to extract textual information efficiently from photos and digital images. This bachelor's thesis addresses one application area of optical text recognition, the recognition of text in user-interface screenshots, and aims to maximize the quantity and quality of the extracted data. To this end, different procedures for image preprocessing and for post-processing of the recognized text are compared and analyzed against defined quality criteria.
The central question of the thesis is which methodology yields the best text recognition results and how these results can be optimized. The findings are intended to simplify the management of COPA-DATA's product documentation and, at the same time, to contribute to research on text recognition in graphical user interfaces.
To answer the central question, a selection of image and text processing algorithms is made. The basic operation of these algorithms is explained, and the text recognition results are examined using a sample. Using common metrics for speech and text recognition, the algorithms are compared objectively, and the results are recorded in an automatically generated report. This report contains a detailed overview of all text recognition results and forms the basis for the evaluation.
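As an illustration, and assuming that these metrics include the word error rate (WER) and the character error rate (CER), which are commonly used for this purpose, both follow the same edit-distance form:
\[
  \mathrm{WER} = \frac{S + D + I}{N},
  \qquad
  \mathrm{CER} = \frac{S_c + D_c + I_c}{N_c},
\]
where $S$, $D$, and $I$ count the substituted, deleted, and inserted words relative to a reference text of $N$ words, and the subscript $c$ denotes the same counts at character level. Lower values indicate better recognition; a value of $0$ means the recognized text matches the reference exactly.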
The analysis of the result data in the report shows which algorithms deliver the best results in which scenarios. The choice of thresholding (binarization) method has the greatest impact on the results: if unsuitable parameters or methods are used, only a fraction of the available text is recognized. If, on the other hand, a suitable method is selected, the majority of the text is recognized correctly.
The prototypical implementation and its components can be reused for further research or adapted to specific requirements. Thanks to the modular structure of the automatic comparison system, the most suitable text recognition procedure can be determined with little effort, even after the display language changes or the color scheme of the graphical user interface is redesigned.
\end{english}