mirror of
https://github.com/UglyToad/PdfPig.git
synced 2026-03-10 00:23:29 +08:00
Updated Document Layout Analysis (markdown)
@@ -558,7 +558,9 @@ Exporters deal with the task of transforming a pdf page into a text representati
### Description
The PAGE (Page Analysis and Ground-Truth Elements) Format is an _XML-based page image representation framework that records information on image characteristics (image borders, geometric distortions and corresponding corrections, binarisation etc.) in addition to layout structure and page content._ – [PRImA Research Lab](https://www.primaresearch.org/publications/ICPR2010_Pletschacher_PAGE)
You can use PRImA Research Lab's [LayoutEvalGUI](https://www.primaresearch.org/tools/PerformanceEvaluation) software to open the exported file.
You can use one of the following PRImA Research Lab tools to open the exported file:
- [LayoutEvalGUI](https://www.primaresearch.org/tools/PerformanceEvaluation)
- [PAGEViewer](https://www.primaresearch.org/tools/PAGEViewer) (this may be easier for beginners)
- __The PAGE XML exporter is the only exporter that supports reading order (as of Jan. 2020).__ See [here](https://github.com/UglyToad/PdfPig/wiki/Document-Layout-Analysis/#result-2) for an example.