BobLd
848d687910
Add kd tree and improve clustering
2020-02-19 11:12:54 +00:00
Eliot Jones
ed894baffc
Merge pull request #135 from BobLd/rxycut-stackoverflow
...
Fix RecursiveXYCut
2020-02-15 10:52:20 +00:00
BobLd
3229574345
Correcting typo
2020-02-12 12:00:14 +00:00
BobLd
6a72ce655c
Merge branch 'master' of https://github.com/UglyToad/PdfPig into rxycut-stackoverflow
2020-02-12 11:40:51 +00:00
BobLd
f0be3e99ee
Add Projection class
2020-02-11 10:04:04 +00:00
BobLd
dd8baa0d2f
Fix stack overflow error in RecursiveXYCut
...
Was linked to the new rotated rectangle methods
2020-02-10 13:53:59 +00:00
BobLd
b49740d093
add test for index increment when children are present
2020-02-10 11:23:19 +00:00
BobLd
ec68231ab3
remove useless using
2020-02-10 11:23:19 +00:00
BobLd
ac1e2c49ba
Fix bounding box for artifact
...
Add tests
2020-02-10 11:23:19 +00:00
BobLd
588648d30b
Fix #133 Marked content extraction issue
2020-02-10 11:23:19 +00:00
BobLd
6dfc7aea30
add test for index increment when children are present
2020-02-09 18:01:18 +00:00
BobLd
f66c25103f
remove useless using
2020-02-09 17:54:20 +00:00
BobLd
905559e282
Fix bounding box for artifact
...
Add tests
2020-02-09 17:46:35 +00:00
BobLd
635693f032
Fix #133 Marked content extraction issue
2020-02-09 15:23:55 +00:00
BobLd
f05101ad07
TransformationMatrix: add comment and improve multiplication
2020-02-09 14:20:47 +00:00
BobLd
1a11e49124
Add TransformationMatrix.Inverse() tests
2020-02-09 14:20:47 +00:00
BobLd
df73206788
try rerun failed tests
2020-02-09 14:20:47 +00:00
BobLd
6a86cdab73
make GeometryExtensions.OrientedBoundingBox() internal
2020-02-09 14:20:47 +00:00
BobLd
288beab39d
limit access to array and add comments
2020-02-09 14:20:47 +00:00
BobLd
7364e53bb9
Improve bounding box for word
2020-02-09 14:20:47 +00:00
Eliot Jones
f3fcd1b3a1
ignore form dictionaries that do not contain fields #131
...
though a form dictionary should always contain fields (as required by the spec), it is possible for this entry to be missing. in this case we return false for trygetform.
2020-02-05 10:56:01 +00:00
Eliot Jones
40dc80c281
handle type 1 font with no descriptor information #132
...
though required by the spec, an adobe type 1 font may be missing all width data. in this case we default to empty values and treat it like a normal adobe type 1 font.
2020-02-05 10:46:39 +00:00
Eliot Jones
10ca77a034
move values back to computed properties
...
the additional stored fields made the struct slower to copy and had an impact on the performance. this moves non-essential fields back to computed properties.
2020-01-31 16:24:59 +00:00
BobLd
6dfbd45eb3
Add image example with new bounding box method
2020-01-31 16:24:59 +00:00
BobLd
6d0b14d2a7
Delete nearest neighbour example.png
2020-01-31 16:24:59 +00:00
BobLd
6dce4b1f8d
Use double.NaN instead of double.MinValue
...
Only compute t, cos and sin once
2020-01-31 16:24:59 +00:00
Eliot Jones
b585fe9581
make width, height, area, rotation and centroid lazy
...
since the rectangle constructor is a hot path any calculations slow the library down considerably. for this reason we move calculations for the following properties into the property getter:
* width (cached)
* height (cached)
* rotation
* area
* centroid
where values are cached they set their backing field once calculated. this won't be thread safe if the same rectangle is accessed on multiple threads.
2020-01-31 16:24:59 +00:00
Eliot Jones
0e84fa34a8
fix usages of rectangle constructor.
...
now that the rectangle constructor uses the order [ llx, lly, urx, ury ] and does not apply correction for points, constructor parameters must be passed in the correct order. this change fixes the hyperlink factory, which was passing them in the wrong order.
in addition, the pdfpath bounding box was using left, right, top and bottom to calculate the minimum bounding box. this produced incorrect values now that individual path operator bounding boxes are rotated, since for a rotated rectangle top may be less than bottom.
the performance seems to have taken a hit due to these changes however.
2020-01-31 16:24:59 +00:00
BobLd
0e613fb526
Handle cases with not enough points in minimal bounding rectangle
2020-01-31 16:24:59 +00:00
BobLd
4c65cbc139
Improve minimum bounding box orientation v2
2020-01-31 16:24:59 +00:00
BobLd
bff18d81ca
Improve minimum bounding box orientation
2020-01-31 16:24:59 +00:00
BobLd
483b30f44d
Remove rounding
2020-01-31 16:24:59 +00:00
BobLd
253ae32193
Remove ordering from minimal bounding rectangle
2020-01-31 16:24:59 +00:00
BobLd
0dad611cb1
Implement minimum bounding box algorithm
2020-01-31 16:24:59 +00:00
BobLd
36c03459a7
first and last letter
2020-01-31 16:24:59 +00:00
BobLd
f221b58936
Remove useless code
2020-01-31 16:24:59 +00:00
BobLd
ea27820ca4
Improve Word bounding box TextDirection.Other case
2020-01-31 16:24:59 +00:00
BobLd
2e5fdb5867
Fix PdfRectangle's Centroid and Translate()
2020-01-31 16:24:59 +00:00
BobLd
adaccf97b3
Add files via upload
2020-01-31 16:24:59 +00:00
BobLd
380c36918b
Remove unnecessary code
2020-01-31 16:24:59 +00:00
BobLd
0cbf3434bc
Remove 'orderFunc' from 'NearestNeighbourWordExtractor' to use the order found by clustering algo
2020-01-31 16:24:59 +00:00
BobLd
3b90370f28
Using Math.Min(letter.Width, letter.GlyphRectangle.Width) for rotated 180 word bounding box
2020-01-31 16:24:59 +00:00
BobLd
c4b6bbc8e5
Using Math.Max(letter.Width, letter.GlyphRectangle.Width) for word bounding box
2020-01-31 16:24:59 +00:00
BobLd
6d8744e722
More decimals to Width and Height
...
+ handle the case where both bottom points are identical
2020-01-31 16:24:59 +00:00
BobLd
bc69376743
Increase max distance for TextDirection.Other in NearestNeighbourWordExtractor
2020-01-31 16:24:59 +00:00
BobLd
a326d7e9d9
TextDirection.Unknown -> TextDirection.Other
...
Improve NearestNeighbourWordExtractor for TextDirection.Other
2020-01-31 16:24:59 +00:00
BobLd
9bcafdaa98
Update word bounding box computation
2020-01-31 16:24:59 +00:00
BobLd
27edf6cf77
Handle Width and Height for rotated rectangles
2020-01-31 16:24:59 +00:00
BobLd
75bd94e538
Better handling of TextDirection.Unknown word bounding box
2020-01-31 16:24:59 +00:00
BobLd
75821919a7
Fix NearestNeighbourWordExtractor for rotated text
2020-01-31 16:24:59 +00:00