BobLd
0dad611cb1
Implement minimum bounding box algorithm
2020-01-31 16:24:59 +00:00
Eliot Jones
b29354e3e6
move compact font format fonts to fonts project
2020-01-05 12:08:01 +00:00
Eliot Jones
1c38a2ae8a
move pdfline to the core project
2020-01-05 09:33:59 +00:00
Eliot Jones
7c0ef111ea
move classes to new projects
...
to make the project more useful and expose more usable classes we're rearchitecting in the following way. code used to read fonts from external file formats like truetype, adobe font metrics (afm) and adobe type 1 fonts is moving to a new project which doesn't reference most of the pdf logic. the shared logic is moving to a new flat-structured project called core. this is a sort-of onion type architecture, with core being the... core, fonts being the next layer of the onion, and pdfpig itself the next. this will then support additional libraries/projects as outer layers of the onion, as well as releasing a standalone version of the font library as pdfbox does with fontbox.
2020-01-04 16:38:18 +00:00
BobLd
d246bf5c74
- remove unnecessary casts
...
- make PageXmlTextExporter.Deserialize() public
2019-12-31 10:43:07 +00:00
Eliot Jones
935d182888
use doubles where calculations are being run
2019-12-24 12:22:17 +00:00
Eliot Jones
1fb416eee3
add convenience method to retrieve all hyperlinks and their text from annotations on a page
2019-12-18 11:41:02 +00:00
BobLd
5cf1f6c58c
Modifications and adding some tests
2019-12-16 14:36:52 +00:00
BobLd
1656411fcb
Improving Geometry classes with Tests
2019-12-14 11:41:11 +00:00
Eliot Jones
09b26c43e0
#74 add intersectswith method to rectangle
2019-10-17 11:21:49 +01:00
BobLd
fe1a3c4b8b
updated from comments
...
- still need to look at XmlWriter
2019-10-10 12:29:28 +01:00
Eliot Jones
81ab414c56
add is supported flag to filters and add missing doc comment
2019-10-08 15:53:42 +01:00
BobLd
bf09aee99c
Adding images regions
2019-10-08 15:29:18 +01:00
BobLd
93313118e9
Support for hOCR, AltoXml and PageXml output formats
...
Tested with:
- 'hocrjs' for hOCR (see https://unpkg.com/hocrjs )
- 'PAGE Viewer' for hOCR, AltoXml and PageXml (see http://www.primaresearch.org/tools/PAGEViewer )
2019-10-07 15:19:30 +01:00
BobLd
1c3519fd51
Update PdfPath.cs
...
Need to account for the case where a `Close` command is called but the first and last commands are not connected.
2019-10-06 12:47:12 +01:00
BobLd
1975db4752
correct typo
2019-10-04 14:50:22 +01:00
BobLd
5d3e4cd4e1
Improve PdfPath
...
- Determine if Closed path
- Determine if Clockwise or CounterClockwise
- Add Centroid
2019-10-04 14:37:41 +01:00
Eliot Jones
3fbfc1130e
lazily evaluate centroid of rectangle
2019-08-20 23:03:27 +01:00
Eliot Jones
8c100efe04
Merge pull request #60 from BobLd/master
...
Improve ClusteringAlgorithms.GroupIndexes() and add Equals() to PdfLine
2019-08-17 12:58:06 +01:00
BobLd
afa2b7baa1
Improve ClusteringAlgorithms.GroupIndexes()
...
Add Equals() to PdfLine
2019-08-14 19:58:31 +01:00
Eliot Jones
f5e025aa70
merge pull request #58 from uglytoad/colors
...
adds colors to letters and prepares code to add colors to paths.
2019-08-13 20:50:06 +01:00
BobLd
9b24223190
Removing ToDouble()
2019-08-10 13:52:01 +01:00
BobLd
bd58879e32
Update from comments
2019-08-10 13:05:25 +01:00
BobLd
474ce9a442
Improving PdfPoint
2019-08-09 19:58:48 +01:00
BobLd
ac065e988a
Adding Centroid to PdfRectangle.
2019-08-09 17:22:16 +01:00
Eliot Jones
c5d03bca97
move application of transformation matrix outside path
2019-08-08 21:19:18 +01:00
Eliot Jones
364bd25fa8
#48 add handling of inline image data to pdf content parsing
...
an inline image in a pdf content stream starts with the BI operator, then ID declares the start of image data and EI the end. attempting to parse the bytes after the ID operator as usual resulted in errors. this change adds special case handling for inline images.
2019-08-03 15:42:19 +01:00
vadimy
7d3a0929b6
Refactoring and fixing according to Eliot's comments
2019-07-24 00:00:00 -04:00
vadimy
b9d0cca2a6
Added "Paths" collection to Page object.
...
Added matrix transformation to path operators.
2019-07-16 00:35:29 -04:00
Eliot Jones
7a3b89ece1
tidy up some doc comments
2019-05-18 12:28:42 +01:00
BobLd
f4ec425bf0
- Correction of the PdfLine's length formula;
...
- Moving Line to TextLine
2019-05-15 19:44:47 +01:00
BobLd
97f0f6fe75
Minor modifications and updates
2019-05-14 20:56:34 +01:00
BobLd
2011d504a7
In Content:
...
- Adding a 'Line' of text object
- Adding a 'TextDirection' property in the 'Word' object
In Geometry:
- Adding a 'PdfLine' object
- Making the 'PdfRectangle' constructor public
2019-05-12 19:34:00 +01:00
BobLd
214ef8a958
Fix issue with Width, Height and Area when the text orientation is not horizontal.
2019-04-19 20:12:31 +01:00
Eliot Jones
5c8a77bf33
#26 make almost all operators public
2019-01-03 22:20:53 +00:00
Eliot Jones
9a1879829d
move path class and add doc comments to document creation api
2018-12-25 10:37:00 +00:00
Eliot Jones
d5a50f2236
#8 tidy up truetype font internally. some more work on a potential document creation api
2018-11-25 13:56:27 +00:00
Eliot Jones
3f3badb7b4
#12 performance optimizations for type 1 fonts and other tweaks
2018-11-25 11:37:00 +00:00
Eliot Jones
3a025052c9
add test for type 1 cff glyph locations and add missing doc comments
2018-11-19 21:43:22 +00:00
Eliot Jones
4d18a2478d
add charset interface, create class to store cff font data. add the command logic for type 2 charstrings #6
2018-11-17 14:59:58 +00:00
Eliot Jones
9700ee03b8
Merge pull request #3 from GowenGit/master
...
letter boundaries - todo: review changes for non latin characters, it seems like we need both the bounding box and the origin to be stored for each letter since the origin is on the baseline while the bounding box can extend below.
2018-04-15 11:41:30 +01:00
modest-as
4570427a6f
merge from upstream
2018-04-14 22:20:36 +01:00
Eliot Jones
8def7d7e0b
unify truetype glyphs to a single class. build composite glyphs from elements
2018-04-14 22:16:26 +01:00
Eliot Jones
983933b6e8
tidy up glyph reading code, simple glyphs now contain glyph points rather than 3 related arrays
2018-04-14 15:44:07 +01:00
Eliot Jones
ebdda46098
make rectangle a struct. add infrastructure for handling composite glyphs
2018-04-14 14:11:10 +01:00
modest-as
564e32e072
Return bounding boxes for letters
2018-03-30 23:16:54 +01:00
Eliot Jones
70025edd79
truetype glyphs now contain the bounds
2018-03-30 21:26:21 +01:00
Eliot Jones
ec62542b64
change the project name to something silly
2018-01-10 19:49:32 +00:00