Commit Graph

903 Commits

Author SHA1 Message Date
BobLd
a53d87a6f5 Fix KdTree.FindNearestNeighbours(k) returning the pivot itself 2020-04-28 22:01:50 +01:00
BobLd
1a3edabd1f Set ClipPaths to false if no ParsingOptions given (consistent behaviour) 2020-04-27 12:51:02 +01:00
Eliot Jones
09b951f667 expose font details on individual letters
also fixes a regression for image extraction
2020-04-25 17:15:26 +01:00
Eliot Jones
98dd736f94 0.1.2-alpha001 2020-04-25 15:20:07 +01:00
Eliot Jones
ae62197178 merge pull request #167 from bobld/master
Improve NearestNeighbourWordExtractor
2020-04-25 15:07:18 +01:00
Eliot Jones
e264583c21 add merging to the readme 2020-04-25 12:06:11 +01:00
Eliot Jones
19047f62ae fix name output for merged documents 2020-04-25 11:23:37 +01:00
Eliot Jones
391b650e3c add more examples to the examples solution 2020-04-25 10:14:05 +01:00
BobLd
ae82c30a31 Merge branch 'master' of https://github.com/UglyToad/PdfPig 2020-04-25 10:10:08 +01:00
Eliot Jones
27e251f921 make filter provider and filter public and use tryget for image bytes 2020-04-25 09:42:24 +01:00
Eliot Jones
635c4b4c5e formatting tidy-up 2020-04-25 09:11:16 +01:00
BobLd
c2de52423e Make NearestNeighbours public 2020-04-25 08:40:43 +01:00
BobLd
d4210cd5d1 Make clustering algos public and use shorter names 2020-04-25 08:40:43 +01:00
Adam Busbin
00b9d416df added check for bad fonts see 61ceca8376/fontbox/src/main/java/org/apache/fontbox/ttf/HorizontalMetricsTable.java line 67 for matching code. 2020-04-25 08:40:12 +01:00
BobLd
0a6ec3946b NearestNeighbourWordExtractor:
- Improve results by using PointSize
- Make 'filterFunction' public for ad hoc GetWords()
- Allow text in different direction

Make Letter.PointSize public and add warning (needed for NNWordExtractor)
Remove Page.GetPointSize(Letter letter)
2020-04-20 13:09:35 +01:00
BobLd
8eb50517dd Merge branch 'master' of https://github.com/UglyToad/PdfPig 2020-04-20 00:08:09 +01:00
Eliot Jones
48aa9169c5 add whitespace size statistics to api scanner test 2020-04-19 17:10:46 +01:00
Eliot Jones
407ee5ca51 add content order text extractor and example of use 2020-04-19 17:06:34 +01:00
BobLd
25c8560a96 Merge branch 'master' of https://github.com/UglyToad/PdfPig 2020-04-19 10:51:00 +01:00
Eliot Jones
f18bc0766a #161 handle zero point size by using rotated matrix 2020-04-19 10:28:11 +01:00
BobLd
9d4bd69d79 Merge branch 'master' of https://github.com/UglyToad/PdfPig 2020-04-18 20:17:20 +01:00
Eliot Jones
23087fe38c add extract images example 2020-04-18 18:55:44 +01:00
Eliot Jones
563985e2b4 add examples directory and first example 2020-04-18 18:46:26 +01:00
Eliot Jones
25314cc79d #161 change rotation to fix values and page size
this doesn't account for images and pdf paths yet.
2020-04-18 18:04:41 +01:00
Eliot Jones
b122bf0ca6 inline transformation code and cache afm strings 2020-04-18 13:56:39 +01:00
Eliot Jones
db442194c3 use a mutable struct 2020-04-18 12:10:17 +01:00
BobLd
93c19d72ec improve mbr by removing PdfPoints creation 2020-04-18 11:09:49 +01:00
Eliot Jones
e382e581ba add merge test for document with object stream 2020-04-16 20:57:57 +01:00
Eliot Jones
635ae13a77 add pdf/a2-a support 2020-04-16 20:50:21 +01:00
BobLd
ab94746252 Improve default RecursiveXYCut dominant font width and height functions 2020-04-13 10:25:34 +01:00
BobLd
395c5a7fd9 Improve default RecursiveXYCut dominant font width and height functions 2020-04-12 19:49:30 +01:00
BobLd
3833fff28f Merge branch 'master' of https://github.com/UglyToad/PdfPig 2020-04-12 16:13:00 +01:00
BobLd
b462c3bea4 update PublicApiScannerTests 2020-04-12 10:15:12 +01:00
BobLd
88c5aeebd5 Make DateFormatHelper public 2020-04-12 10:15:12 +01:00
BobLd
416d980a5c update PublicApiScannerTests 2020-04-10 15:33:37 +01:00
BobLd
48ad309ede Make DateFormatHelper public 2020-04-10 14:38:59 +01:00
Eliot Jones
75c04eb81c fix namespace using order 2020-04-05 17:58:57 +01:00
Eliot Jones
2a0a3fae69 add test for svg exporter and escape xml characters 2020-04-05 17:58:57 +01:00
Eliot Jones
45ac8c8a60 default clipping to false for performance 2020-04-05 17:58:57 +01:00
Eliot Jones
2998b1b2eb tidy up clipper related code 2020-04-05 17:58:57 +01:00
BobLd
60595e1341 Check if current path is null for path construction operators 2020-04-05 17:58:57 +01:00
BobLd
9a2a72ea2e remove clipping to fix CanGetArtifact() test 2020-04-05 17:58:57 +01:00
BobLd
270b1b5ef5 add comment about bezier curves/polylines when clipping 2020-04-05 17:58:57 +01:00
BobLd
e2b03c3075 implement clipping using Clipper (Boost Software License) 2020-04-05 17:58:57 +01:00
BobLd
a759a99389 Move ClipPaths option from GetPage() to ParsingOptions 2020-04-05 17:58:57 +01:00
BobLd
c14646fcc0 Fix Mode() 2020-04-05 17:58:57 +01:00
BobLd
ec2dcdc9f4 Check if CurrentSubpath is null in CloseSubpath() 2020-04-05 17:58:57 +01:00
BobLd
16a17f3b8d update IsClosed() with Move 2020-04-05 17:58:57 +01:00
BobLd
d804a63905 update comments 2020-04-05 17:58:57 +01:00
BobLd
b923a42f9e Check if CurrentSubpath null before CloseSubpath 2020-04-05 17:58:57 +01:00