mirror of
https://github.com/UglyToad/PdfPig.git
synced 2025-11-24 08:47:01 +08:00
Updated Document Layout Analysis (markdown)
@@ -5,7 +5,7 @@ In our case, we are using the pdf document itself instead of image representatio
 - [__Page segmenters__](https://github.com/UglyToad/PdfPig/wiki/Document-Layout-Analysis#page-segmenters)
 - [__Reading order detectors__](https://github.com/UglyToad/PdfPig/wiki/Document-Layout-Analysis#reading-order-detectors)
 - [__Other layout tools__](https://github.com/UglyToad/PdfPig/wiki/Document-Layout-Analysis#other-layout-tools)
-- [__Export__](https://github.com/UglyToad/PdfPig/wiki/Document-Layout-Analysis#export) – Tools to export and view the result of document layout analysis
+- [__Export__](https://github.com/UglyToad/PdfPig/wiki/Document-Layout-Analysis#export) – Viewing/exporting the results of document layout analysis
 
 # Word extractors
 Word extractors deal with the task of building words using letters in a page. 2 different methods are currently available:
@@ -565,6 +565,7 @@ Exporters deal with the task of transforming a pdf page into a text representati
 - [__PAGE__](https://github.com/UglyToad/PdfPig/wiki/Document-Layout-Analysis#page) (xml format)
 - [__ALTO__](https://github.com/UglyToad/PdfPig/wiki/Document-Layout-Analysis#alto) (xml format)
 - [__hOcr__](https://github.com/UglyToad/PdfPig/wiki/Document-Layout-Analysis#hocr) (html format)
+The exporters allow tools to display a visual representation of the layout; the images shown on this page are examples of that.
 
 ## [PAGE](https://github.com/UglyToad/PdfPig/blob/master/src/UglyToad.PdfPig.DocumentLayoutAnalysis/Export/PageXmlTextExporter.cs)
 ### Description