96 Commits

Author SHA1 Message Date
BobLd
c2de52423e Make NearestNeighbours public 2020-04-25 08:40:43 +01:00
BobLd
d4210cd5d1 Make clustering algos public and use shorter names 2020-04-25 08:40:43 +01:00
BobLd
0a6ec3946b NearestNeighbourWordExtractor:
- Improve results by using PointSize
- Make 'filterFunction' public for ad hoc GetWords()
- Allow text in different direction

Make Letter.PointSize public and add warning (needed for NNWordExtractor)
Remove Page.GetPointSize(Letter letter)
2020-04-20 13:09:35 +01:00
Eliot Jones
407ee5ca51 add content order text extractor and example of use 2020-04-19 17:06:34 +01:00
BobLd
ab94746252 Improve default RecursiveXYCut dominant font width and height functions 2020-04-13 10:25:34 +01:00
Eliot Jones
75c04eb81c fix namespace using order 2020-04-05 17:58:57 +01:00
Eliot Jones
2a0a3fae69 add test for svg exporter and escape xml characters 2020-04-05 17:58:57 +01:00
Eliot Jones
2998b1b2eb tidy up clipper related code 2020-04-05 17:58:57 +01:00
BobLd
c14646fcc0 Fix Mode() 2020-04-05 17:58:57 +01:00
BobLd
a7fe39fc32 Add SvgTextExporter
Modify WriteSvg to account for height
2020-04-05 17:58:57 +01:00
BobLd
43b40da5d5 Change Subpath to path where necessary 2020-04-05 17:58:57 +01:00
BobLd
ab6a0f11fc Change name from PdfPath to PdfSubpath 2020-04-05 17:58:57 +01:00
Eliot Jones
9abe9f4b2f #158 add strong naming to the solution 2020-04-04 16:59:51 +01:00
BobLd
7d52bc8be4 make Distances.FindIndexNearest public
add tests for Distances.FindIndexNearest
2020-03-24 19:52:01 +00:00
BobLd
0d786a1265 Add tests for KdTree, MathExtensions and Distances
Add reference to DLA project
Make KdTree public
Fix mode computation for multimodal
2020-03-24 19:52:01 +00:00
Eliot Jones
4ed1600cab version 0.1.1 2020-03-18 20:10:51 +00:00
Eliot Jones
8ac4195b83 0.1.1-alpha001 2020-03-15 16:52:28 +00:00
BobLd
9366aa2b37 Tidy up code 2020-03-15 15:39:19 +01:00
BobLd
5b8a2f2e38 handle k-nearest neighbours search
update DocstrumBB with kd-tree
2020-03-15 15:39:19 +01:00
BobLd
8cafda3577 handle nearest neighbour not found 2020-03-15 15:39:19 +01:00
Eliot Jones
8df2f9cf6b generate all xml docs and pack them #148
after we split the solution into multiple projects the xml doc comments were no longer packed in the generated nuget package. in addition they were only generated for the net standard 2.0 target framework.

this change generates comments for all target frameworks and makes sure they're included in the generated package. it also adds missing doc comments where they weren't included on the public api and clears up a couple of minor formatting issues in the affected files.
2020-03-08 13:44:09 +00:00
BobLd
439dad9f35 fix depth for leaves 2020-03-06 16:53:18 +00:00
BobLd
dbdef7f195 kd-tree 2020-03-06 16:53:18 +00:00
BobLd
c4309ef31e fix kd-tree 2020-03-06 16:53:18 +00:00
BobLd
1e1a33d46e fix typo in kd-tree
replace Count() by Count or Length
2020-03-06 16:53:18 +00:00
BobLd
e477bc8d6d Tidy up and optimise kd-tree 2020-03-06 16:53:18 +00:00
BobLd
5ae38f1bad Improve default max function 2020-02-28 14:43:18 +00:00
Eliot Jones
0fcc4e54c8 add istestproject setting to all projects
indicates which projects are test projects to the test runner.
2020-02-27 12:35:40 +00:00
BobLd
848d687910 Add kd tree and improve clustering 2020-02-19 11:12:54 +00:00
BobLd
3229574345 Correcting typo 2020-02-12 12:00:14 +00:00
BobLd
f0be3e99ee Add Projection class 2020-02-11 10:04:04 +00:00
BobLd
dd8baa0d2f Fix stack overflow error in RecursiveXYCut
Was linked to the new rotated rectangle methods
2020-02-10 13:53:59 +00:00
BobLd
0dad611cb1 Implement minimum bounding box algorithm 2020-01-31 16:24:59 +00:00
BobLd
380c36918b Remove unnecessary code 2020-01-31 16:24:59 +00:00
BobLd
0cbf3434bc Remove 'orderFunc' from 'NearestNeighbourWordExtractor' to use the order found by clustering algo 2020-01-31 16:24:59 +00:00
BobLd
bc69376743 Increase max distance for TextDirection.Other in NearestNeighbourWordExtractor 2020-01-31 16:24:59 +00:00
BobLd
a326d7e9d9 TextDirection.Unknown -> TextDirection.Other
Improve NearestNeighbourWordExtractor for TextDirection.Other
2020-01-31 16:24:59 +00:00
BobLd
75821919a7 Fix NearestNeighbourWordExtractor for rotated text 2020-01-31 16:24:59 +00:00
BobLd
47672d3f90 Make TextBlock.SetReadingOrder(int) public 2020-01-13 09:25:57 +00:00
BobLd
e8216b29c5 Add reading order in PageXml export 2020-01-12 11:15:58 +00:00
BobLd
e7417be75a ReadingOrderDetector and tidying DLA project 2020-01-11 11:18:11 +00:00
Eliot Jones
e064d39671 remove unused project references from document layout analysis 2020-01-05 15:44:02 +00:00
Eliot Jones
e0a45e3774 include dependencies as dlls in the published nuget
by default nuget pack does not include project dependencies. this is suboptimal since it would require managing at least 5 nuget packages. this uses a workaround detailed here https://github.com/nuget/home/issues/3891 to copy the dependent dlls to the generated nuget package. this doesn't resolve the issue of how we publish the documentlayoutanalysis project, since it is the top of the dependency tree and we publish its parent, rather than it.
2020-01-05 13:56:14 +00:00
Eliot Jones
b29354e3e6 move compact font format fonts to fonts project 2020-01-05 12:08:01 +00:00
Eliot Jones
bbde38f656 move tokenizers to their own project
since both pdfs and Adobe Type1 fonts use postscript type objects, tokenization is needed by the main project and the fonts project
2020-01-05 10:40:44 +00:00
Eliot Jones
15525acbaa move document layout analysis and export to new project 2020-01-05 09:19:58 +00:00