Commit Graph

9 Commits

Author SHA1 Message Date
Zhiguan Hu
9baa8c3ca0 Fix format as suggested. 2019-12-11 10:05:17 -06:00
Zhiguan Hu
30247ba774 Fix the bug that happens when the words in the current leaf for VerticalCut/HorizontalCut are all white spaces. 2019-12-10 15:03:05 -06:00
BobLd
d36dee0e25 Adding handling when pageWords count = 0 for IPageSegmenters 2019-09-04 22:14:08 +01:00
BobLd
eb9a9fd00e Document Layout Analysis - IPageSegmenter, Docstrum
- Create a TextBlock class
- Creates IPageSegmenter
- Add other useful distances: angle, etc.
- Update RecursiveXYCut
 - With IPageSegmenter and TextBlock
 - Make XYNode and XYLeaf internal
- Optimise (faster) NearestNeighbourWordExtractor and isolate the clustering algorithms for use outside of this class
- Implement a Docstrum inspired page segmentation algorithm
2019-08-10 16:01:27 +01:00
BobLd
5399456919 Making the RecursiveXYCut class static. 2019-08-09 18:50:20 +01:00
BobLd
00233fa5d0 Update with corrections - 2 2019-06-20 22:10:05 +01:00
BobLd
f8d0883da5 Update with corrections 2019-06-18 20:48:49 +01:00
BobLd
2525cd243f Typo correction 2019-06-16 14:03:12 +01:00
BobLd
a0c864e8af Adding Document Layout Analysis:
- Nearest Neighbour Word Extractor
- Recursive X-Y Cut algorithm, useful for multi-column pdf documents
2019-06-16 13:57:30 +01:00