Layout Analysis in textual information with NLP

Abstract:

As the number of scientific journals increases, analyzing trends and the latest technologies in a particular scientific field becomes a very time-consuming and tedious task. In response to the urgent need for information, which the existing systematic review model does not address well, several review types have emerged, namely the quick review and the investigation of the limits. In this paper, we propose an NLP-enabled tool that automates most of the text document review process through automated analysis. The two main purposes of OCR are to recognize text in images and to transform images into text. Currently, one of the tasks performed by OCR is layout analysis, which classifies the regions of a text image: the different parts of the image, including tables, headings, paragraphs, etc., are put into separate classes. There are two general methods for this purpose: (1) the computer vision method and (2) the natural language processing method. In this study, we use the second method and examine it in detail. The natural language processing method applied in this paper achieves an accuracy of 0.74 on textual information in the evaluation section, which is significant and can be relied on as a result.
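To make the idea of text-based layout analysis concrete, the following is a minimal sketch of how lines of already-extracted text could be sorted into classes such as heading, table, and paragraph using only textual cues. It is not the method evaluated in the paper: the function names `line_features` and `classify_line`, the chosen features, and all thresholds are hypothetical, and the hand-written rules merely stand in for whatever trained NLP classifier one might use.

```python
import re

# Illustrative only: hypothetical features and thresholds, not the paper's model.

def line_features(line: str) -> dict:
    """Compute simple textual cues for one line of extracted text."""
    tokens = line.split()
    n = len(tokens)
    return {
        "n_tokens": n,
        "ends_with_period": line.rstrip().endswith("."),
        # Share of tokens that start with an uppercase letter.
        "title_case_ratio": sum(t[0].isupper() for t in tokens) / n if n else 0.0,
        # Share of tokens made up only of digits and numeric punctuation.
        "numeric_ratio": (
            sum(bool(re.fullmatch(r"[\d.,%-]+", t)) for t in tokens) / n if n else 0.0
        ),
        # A tab or a run of two or more spaces between words suggests columns.
        "has_column_gaps": bool(re.search(r"\S(\t| {2,})\S", line)),
    }


def classify_line(line: str) -> str:
    """Assign a layout class to a line using hand-written rules (assumed)."""
    f = line_features(line)
    if f["n_tokens"] == 0:
        return "blank"
    if f["has_column_gaps"] or f["numeric_ratio"] >= 0.5:
        return "table"
    if (
        f["n_tokens"] <= 8
        and not f["ends_with_period"]
        and f["title_case_ratio"] >= 0.5
    ):
        return "heading"
    return "paragraph"


if __name__ == "__main__":
    sample = [
        "Layout Analysis with NLP",
        "Model    Accuracy    Recall",
        "In this study, we have used the second method.",
    ]
    for text in sample:
        print(f"{classify_line(text):10s} | {text}")
```

In practice the rule set would likely be replaced by a classifier trained on labeled lines, with the same kind of features (or learned text embeddings) as input; the sketch only illustrates that class decisions can be driven by the text itself rather than by image pixels.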

Mohammadreza Faraji

B.Sc. in Computer Engineering, Fouman Faculty of Engineering, College of Engineering, University of Tehran, Iran

Atefeh Hasan-Zadeh

Fouman Faculty of Engineering, College of Engineering, University of Tehran, Iran, P.O. Box: 43581-39115