BobLd
c2de52423e
Make NearestNeighbours public
2020-04-25 08:40:43 +01:00
BobLd
d4210cd5d1
Make clustering algos public and use shorter names
2020-04-25 08:40:43 +01:00
BobLd
0a6ec3946b
NearestNeighbourWordExtractor:
...
- Improve results by using PointSize
- Make 'filterFunction' public for ad hoc GetWords()
- Allow text in different directions
Make Letter.PointSize public and add warning (needed for NNWordExtractor)
Remove Page.GetPointSize(Letter letter)
2020-04-20 13:09:35 +01:00
Eliot Jones
407ee5ca51
add content order text extractor and example of use
2020-04-19 17:06:34 +01:00
BobLd
ab94746252
Improve default RecursiveXYCut dominant font width and height functions
2020-04-13 10:25:34 +01:00
Eliot Jones
75c04eb81c
fix namespace using order
2020-04-05 17:58:57 +01:00
Eliot Jones
2a0a3fae69
add test for svg exporter and escape xml characters
2020-04-05 17:58:57 +01:00
Eliot Jones
2998b1b2eb
tidy up clipper related code
2020-04-05 17:58:57 +01:00
BobLd
c14646fcc0
Fix Mode()
2020-04-05 17:58:57 +01:00
BobLd
a7fe39fc32
Add SvgTextExporter
...
Modify WriteSvg to account for height
2020-04-05 17:58:57 +01:00
BobLd
43b40da5d5
Change Subpath to path where necessary
2020-04-05 17:58:57 +01:00
BobLd
ab6a0f11fc
Change name from PdfPath to PdfSubpath
2020-04-05 17:58:57 +01:00
Eliot Jones
9abe9f4b2f
#158 add strong naming to the solution
2020-04-04 16:59:51 +01:00
BobLd
7d52bc8be4
make Distances.FindIndexNearest public
...
add tests for Distances.FindIndexNearest
2020-03-24 19:52:01 +00:00
BobLd
0d786a1265
Add tests for KdTree, MathExtensions and Distances
...
Add reference to DLA project
Make KdTree public
Fix mode computation for multimodal
2020-03-24 19:52:01 +00:00
Eliot Jones
4ed1600cab
version 0.1.1
2020-03-18 20:10:51 +00:00
Eliot Jones
8ac4195b83
0.1.1-alpha001
2020-03-15 16:52:28 +00:00
BobLd
9366aa2b37
Tidy up code
2020-03-15 15:39:19 +01:00
BobLd
5b8a2f2e38
handle k-nearest neighbours search
...
update DocstrumBB with kd-tree
2020-03-15 15:39:19 +01:00
BobLd
8cafda3577
handle nearest neighbour not found
2020-03-15 15:39:19 +01:00
Eliot Jones
8df2f9cf6b
generate all xml docs and pack them #148
...
after we split the solution into multiple projects the xml doc comments were no longer packed in the generated nuget package. in addition they were only generated for the net standard 2.0 target framework.
this change generates comments for all target frameworks and makes sure they're included in the generated package. it also adds missing doc comments where they weren't included on the public api and clears up a couple of minor formatting issues in the affected files.
2020-03-08 13:44:09 +00:00
BobLd
439dad9f35
fix depth for leaves
2020-03-06 16:53:18 +00:00
BobLd
dbdef7f195
kd-tree
2020-03-06 16:53:18 +00:00
BobLd
c4309ef31e
fix kd-tree
2020-03-06 16:53:18 +00:00
BobLd
1e1a33d46e
fix typo in kd-tree
...
replace Count() by Count or Length
2020-03-06 16:53:18 +00:00
BobLd
e477bc8d6d
Tidy up and optimise kd-tree
2020-03-06 16:53:18 +00:00
BobLd
5ae38f1bad
Improve default max function
2020-02-28 14:43:18 +00:00
Eliot Jones
0fcc4e54c8
add istestproject setting to all projects
...
indicates which projects are test projects to the test runner.
2020-02-27 12:35:40 +00:00
BobLd
848d687910
Add kd tree and improve clustering
2020-02-19 11:12:54 +00:00
BobLd
3229574345
Correcting typo
2020-02-12 12:00:14 +00:00
BobLd
f0be3e99ee
Add Projection class
2020-02-11 10:04:04 +00:00
BobLd
dd8baa0d2f
Fix stack overflow error in RecursiveXYCut
...
Was linked to the new rotated rectangle methods
2020-02-10 13:53:59 +00:00
BobLd
0dad611cb1
Implement minimum bounding box algorithm
2020-01-31 16:24:59 +00:00
BobLd
380c36918b
Remove unnecessary code
2020-01-31 16:24:59 +00:00
BobLd
0cbf3434bc
Remove 'orderFunc' from 'NearestNeighbourWordExtractor' to use the order found by clustering algo
2020-01-31 16:24:59 +00:00
BobLd
bc69376743
Increase max distance for TextDirection.Other in NearestNeighbourWordExtractor
2020-01-31 16:24:59 +00:00
BobLd
a326d7e9d9
TextDirection.Unknown -> TextDirection.Other
...
Improve NearestNeighbourWordExtractor for TextDirection.Other
2020-01-31 16:24:59 +00:00
BobLd
75821919a7
Fix NearestNeighbourWordExtractor for rotated text
2020-01-31 16:24:59 +00:00
BobLd
47672d3f90
Make TextBlock.SetReadingOrder(int) public
2020-01-13 09:25:57 +00:00
BobLd
e8216b29c5
Add reading order in PageXml export
2020-01-12 11:15:58 +00:00
BobLd
e7417be75a
ReadingOrderDetector and tidying DLA project
2020-01-11 11:18:11 +00:00
Eliot Jones
e064d39671
remove unused project references from document layout analysis
2020-01-05 15:44:02 +00:00
Eliot Jones
e0a45e3774
include dependencies as dlls in the published nuget
...
by default nuget pack does not include project dependencies. this is suboptimal since it would require managing at least 5 nuget packages. this uses a workaround detailed here https://github.com/nuget/home/issues/3891 to copy the dependent dlls to the generated nuget package. this doesn't resolve the issue of how we publish the documentlayoutanalysis project, since it is the top of the dependency tree and we publish its parent, rather than it.
2020-01-05 13:56:14 +00:00
Eliot Jones
b29354e3e6
move compact font format fonts to fonts project
2020-01-05 12:08:01 +00:00
Eliot Jones
bbde38f656
move tokenizers to their own project
...
since both pdfs and Adobe Type1 fonts use postscript type objects, tokenization is needed by the main project and the fonts project
2020-01-05 10:40:44 +00:00
Eliot Jones
15525acbaa
move document layout analysis and export to new project
2020-01-05 09:19:58 +00:00