Updated Document Layout Analysis (markdown)

changetocoding
2023-10-19 12:49:44 +01:00
parent 198f1f6131
commit 5fce77cff0

@@ -5,7 +5,7 @@ In our case, we are using the pdf document itself instead of image representatio
- [__Page segmenters__](https://github.com/UglyToad/PdfPig/wiki/Document-Layout-Analysis#page-segmenters)
- [__Reading order detectors__](https://github.com/UglyToad/PdfPig/wiki/Document-Layout-Analysis#reading-order-detectors)
- [__Other layout tools__](https://github.com/UglyToad/PdfPig/wiki/Document-Layout-Analysis#other-layout-tools)
- [__Export__](https://github.com/UglyToad/PdfPig/wiki/Document-Layout-Analysis#export) Tools to export and view the result of document layout analysis
- [__Export__](https://github.com/UglyToad/PdfPig/wiki/Document-Layout-Analysis#export) Viewing/exporting the results of document layout analysis
# Word extractors
Word extractors deal with the task of building words from the letters on a page. Two different methods are currently available:
@@ -565,6 +565,7 @@ Exporters deal with the task of transforming a pdf page into a text representati
- [__PAGE__](https://github.com/UglyToad/PdfPig/wiki/Document-Layout-Analysis#page) (xml format)
- [__ALTO__](https://github.com/UglyToad/PdfPig/wiki/Document-Layout-Analysis#alto) (xml format)
- [__hOcr__](https://github.com/UglyToad/PdfPig/wiki/Document-Layout-Analysis#hocr) (html format)
The exporters allow tools to display a visual representation of the layout; the images shown on this page are examples of this.
## [PAGE](https://github.com/UglyToad/PdfPig/blob/master/src/UglyToad.PdfPig.DocumentLayoutAnalysis/Export/PageXmlTextExporter.cs)
### Description