Document Layout Analysis - IPageSegmenter, Docstrum

- Create a TextBlock class
- Create IPageSegmenter
- Add other useful distances: angle, etc.
- Update RecursiveXYCut
 - Use IPageSegmenter and TextBlock
 - Make XYNode and XYLeaf internal
- Optimise NearestNeighbourWordExtractor (faster) and isolate the clustering algorithms for use outside of this class
- Implement a Docstrum-inspired page segmentation algorithm
BobLd
2019-08-10 16:01:27 +01:00
parent 2d6e49426a
commit eb9a9fd00e
10 changed files with 544 additions and 119 deletions


@@ -8,7 +8,7 @@ namespace UglyToad.PdfPig.DocumentLayoutAnalysis
/// <summary>
/// A Node used in the <see cref="RecursiveXYCut"/> algorithm.
/// </summary>
-public class XYNode
+internal class XYNode
{
/// <summary>
/// Returns true if this node is a leaf, false otherwise.