Updated Document Layout Analysis (markdown)

BobLd 2020-06-22 11:35:44 +01:00
parent 3974295f5b
commit 87e0c7463a

@@ -37,8 +37,7 @@ In order to decide whether two glyphs are _close enough_ to each other, the alg
If the measured distance between the two glyphs is below this threshold, they are deemed to be connected.
Once glyphs are connected, they are then grouped to form words via a [depth first search algorithm](https://en.wikipedia.org/wiki/Depth-first_search).
It seems that both [left-to-right and right-to-left](https://en.wikipedia.org/wiki/Right-to-left) scripts have their glyph `StartBaseLine` on the left and `EndBaseLine` on the right.
Once glyphs are connected, they are then grouped to form words via a [depth first search algorithm](https://en.wikipedia.org/wiki/Depth-first_search). The extractor should work for both [left-to-right and right-to-left](https://en.wikipedia.org/wiki/Right-to-left) scripts, as it seems that both have their glyphs' `StartBaseLine` on the left and `EndBaseLine` on the right of the bounding box.
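
The connect-then-group step can be illustrated with a minimal sketch. It is not the library's implementation: each glyph is reduced to a single `(x, y)` point, the distance is plain Euclidean, and one fixed `threshold` is used for every pair. All of these are simplifying assumptions, and the function name `group_connected` is hypothetical.

```python
from math import hypot


def group_connected(points, threshold):
    """Group point indices into connected components with an iterative DFS.

    Assumption: two glyphs are connected when the Euclidean distance
    between their (x, y) points is strictly below a single threshold.
    """
    n = len(points)

    # Build the adjacency list from pairwise distances.
    neighbours = [[] for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            (x1, y1), (x2, y2) = points[i], points[j]
            if hypot(x2 - x1, y2 - y1) < threshold:
                neighbours[i].append(j)
                neighbours[j].append(i)

    visited = [False] * n
    groups = []
    for start in range(n):
        if visited[start]:
            continue
        # Depth-first search from an unvisited glyph collects one group.
        stack = [start]
        visited[start] = True
        group = []
        while stack:
            node = stack.pop()
            group.append(node)
            for nxt in neighbours[node]:
                if not visited[nxt]:
                    visited[nxt] = True
                    stack.append(nxt)
        groups.append(sorted(group))
    return groups


# Five glyphs on one line, with a wide gap between the third and fourth.
glyphs = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (6.0, 0.0), (7.0, 0.0)]
words = group_connected(glyphs, threshold=1.5)
```

With these made-up coordinates the first three glyphs end up in one group and the last two in another, since only the gap between the third and fourth glyph exceeds the threshold.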
### Usage
#### Simple case