70 Commits

Author SHA1 Message Date
BobLd
3ac26bb1bc fix bbox for TextLine and TextBlock 2020-05-30 13:03:59 +01:00
BobLd
f883b56e72 completely rework DocstrumBoundingBoxes, now handle rotated text 2020-05-30 13:03:59 +01:00
BobLd
a16f377d5a update DefaultPageSegmenter to use DlaOptions 2020-05-30 13:03:59 +01:00
BobLd
1438fec741 update RecursiveXYCut to use DlaOptions 2020-05-30 13:03:59 +01:00
BobLd
5362a335f5 update XYLeaf with word separator 2020-05-30 13:03:59 +01:00
BobLd
79b78f486a add ReadingOrderHelper 2020-05-30 13:03:59 +01:00
BobLd
8f1ab2022f update NearestNeighbourWordExtractor to use DlaOptions, stop ordering words 2020-05-30 13:03:59 +01:00
BobLd
43a68693ba allow oriented bounding box for TextBlock 2020-05-30 13:03:59 +01:00
BobLd
5b0b0a6db3 allow oriented bounding box for TextLine 2020-05-30 13:03:59 +01:00
BobLd
bb94348127 add text Separator in TextBlock and TextLine 2020-05-30 13:03:59 +01:00
BobLd
5f75205e41 rename TextDirection into TextOrientation 2020-05-30 13:03:59 +01:00
BobLd
33ee66af42 add PageSegmenterOptions abstract class 2020-05-30 13:03:59 +01:00
BobLd
dd546dcfc8 update IPageSegmenter with DlaOptions 2020-05-30 13:03:59 +01:00
BobLd
3cf7c45994 add DlaOptions abstract class 2020-05-30 13:03:59 +01:00
BobLd
a00660cd6e update nn word extractor for new clustering type 2020-05-30 13:03:59 +01:00
BobLd
78da925263 add AlmostEqualsToZero() and AlmostEquals(). 2020-05-30 13:03:59 +01:00
BobLd
08300f6a3a use IReadOnlyList<PdfPoint> i/o PdfPoint[] in KdTree 2020-05-30 13:03:59 +01:00
BobLd
ca4111ec1b better parameters for FindIndexNearest 2020-05-30 13:03:59 +01:00
BobLd
d2c2a2f592 add angle bounding functions 2020-05-30 13:03:59 +01:00
BobLd
404d6621de return grouped elements i/o grouped indexes in Clustering.NearestNeighbours 2020-05-30 13:03:59 +01:00
Eliot Jones
256c2833ab 0.1.2-alpha002 2020-05-10 16:36:14 +01:00
BobLd
bb33741552 Fix KdTree.FindNearestNeighbours(k) returning the pivot itself 2020-05-10 15:49:00 +01:00
Eliot Jones
98dd736f94 0.1.2-alpha001 2020-04-25 15:20:07 +01:00
BobLd
ae82c30a31 Merge branch 'master' of https://github.com/UglyToad/PdfPig 2020-04-25 10:10:08 +01:00
BobLd
c2de52423e Make NearestNeighbours public 2020-04-25 08:40:43 +01:00
BobLd
d4210cd5d1 Make clustering algos public and use shorter names 2020-04-25 08:40:43 +01:00
BobLd
0a6ec3946b NearestNeighbourWordExtractor:
- Improve results by using PointSize
- Make 'filterFunction' public for ad hoc GetWords()
- Allow text in different direction

Make Letter.PointSize public and add warning (needed for NNWordExtractor)
Remove Page.GetPointSize(Letter letter)
2020-04-20 13:09:35 +01:00
Eliot Jones
407ee5ca51 add content order text extractor and example of use 2020-04-19 17:06:34 +01:00
BobLd
ab94746252 Improve default RecursiveXYCut dominant font width and height functions 2020-04-13 10:25:34 +01:00
Eliot Jones
75c04eb81c fix namespace using order 2020-04-05 17:58:57 +01:00
Eliot Jones
2a0a3fae69 add test for svg exporter and escape xml characters 2020-04-05 17:58:57 +01:00
Eliot Jones
2998b1b2eb tidy up clipper related code 2020-04-05 17:58:57 +01:00
BobLd
c14646fcc0 Fix Mode() 2020-04-05 17:58:57 +01:00
BobLd
a7fe39fc32 Add SvgTextExporter
Modify WriteSvg to account for height
2020-04-05 17:58:57 +01:00
BobLd
43b40da5d5 Change Subpath to path where necessary 2020-04-05 17:58:57 +01:00
BobLd
ab6a0f11fc Change name from PdfPath to PdfSubpath 2020-04-05 17:58:57 +01:00
Eliot Jones
9abe9f4b2f #158 add strong naming to the solution 2020-04-04 16:59:51 +01:00
BobLd
7d52bc8be4 make Distances.FindIndexNearest public
add tests for Distances.FindIndexNearest
2020-03-24 19:52:01 +00:00
BobLd
0d786a1265 Add tests for KdTree, MathExtensions and Distances
Add reference to DLA project
Make KdTree public
Fix mode computation for multimodal
2020-03-24 19:52:01 +00:00
Eliot Jones
4ed1600cab version 0.1.1 2020-03-18 20:10:51 +00:00
Eliot Jones
8ac4195b83 0.1.1-alpha001 2020-03-15 16:52:28 +00:00
BobLd
9366aa2b37 Tidy up code 2020-03-15 15:39:19 +01:00
BobLd
5b8a2f2e38 handle k-nearest neighbours search
update DocstrumBB with kd-tree
2020-03-15 15:39:19 +01:00
BobLd
8cafda3577 handle nearest neighbour not found 2020-03-15 15:39:19 +01:00
Eliot Jones
8df2f9cf6b generate all xml docs and pack them #148
after we split the solution into multiple projects the xml doc comments were no longer packed in the generated nuget package. in addition they were only generated for the net standard 2.0 target framework.

this change generates comments for all target frameworks and makes sure they're included in the generated package. it also adds missing doc comments where they weren't included on the public api and clears up a couple of minor formatting issues in the affected files.
2020-03-08 13:44:09 +00:00
BobLd
439dad9f35 fix depth for leaves 2020-03-06 16:53:18 +00:00
BobLd
dbdef7f195 kd-tree 2020-03-06 16:53:18 +00:00
BobLd
c4309ef31e fix kd-tree 2020-03-06 16:53:18 +00:00
BobLd
1e1a33d46e fix typo in kd-tree
replace Count() by Count or Length
2020-03-06 16:53:18 +00:00
BobLd
e477bc8d6d Tidy up and optimise kd-tree 2020-03-06 16:53:18 +00:00