172 Commits

Author SHA1 Message Date
Eliot Jones
4443cde229 add very hacky parsing for type 1 font files in order to read the encoding 2018-01-14 18:59:03 +00:00
Eliot Jones
615ee88a46 start passing the pdf scanner in to read the type 1 files 2018-01-14 15:33:22 +00:00
Eliot Jones
1fb6ec41d1 remove cos object key completely and test indirect reference 2018-01-14 15:01:18 +00:00
Eliot Jones
36c0eedd7c move the usages of cos object key to indirect reference 2018-01-14 14:48:54 +00:00
Eliot Jones
b19b96604d make the pdf object scanner work with streams 2018-01-14 10:53:01 +00:00
Eliot Jones
8dcea9b37f create a pdf object scanner which sits on top of the core token scanner to provide complete object parsing 2018-01-13 22:30:15 +00:00
Eliot Jones
595c137282 comment out local test 2018-01-13 20:19:34 +00:00
Eliot Jones
95ead2a3de use direct object finder when getting the name of the base font in type 1 and truetype fonts. 2018-01-13 20:06:58 +00:00
Eliot Jones
c88db947ad fix bug with hex names mapping to unicode in the glyph list. 2018-01-12 19:58:54 +00:00
Eliot Jones
02a1c1ed51 more tests for glyphlist 2018-01-11 20:52:13 +00:00
Eliot Jones
ca7cb11561 initial release: name package correctly 2018-01-10 22:35:09 +00:00
Eliot Jones
ba8d2f5b1d fix a bug with tokenization without spaces before string 2018-01-10 22:15:29 +00:00
Eliot Jones
be98b1f8e4 add missing doc comments 2018-01-10 20:31:38 +00:00
Eliot Jones
45d0a8f863 add icon in preparation for deployment 2018-01-10 19:54:31 +00:00
Eliot Jones
ec62542b64 change the project name to something silly 2018-01-10 19:49:32 +00:00
Eliot Jones
ab5a357665 fix bugs with reading documents from microsoft print to pdf 2018-01-10 19:23:10 +00:00
Eliot Jones
674945206c handle case where no unicode value is found for type 1 fonts. 2018-01-09 22:32:18 +00:00
Eliot Jones
b2936660d7 update the readme and document public properties 2018-01-08 23:19:51 +00:00
Eliot Jones
10df612a00 throw informative exception when the document is encrypted 2018-01-08 22:43:48 +00:00
Eliot Jones
bf7d3868da check public api through reflection 2018-01-08 22:08:58 +00:00
Eliot Jones
b1fbcd0ccd support type 1 normal fonts and fix bug with fetching resource dictionary 2018-01-08 21:58:07 +00:00
Eliot Jones
133ab43d45 fix a bug with font reading order in multi-page documents 2018-01-07 20:19:17 +00:00
Eliot Jones
59c36a7ddd fix a bug with t* operator due to incorrect specification. add generic name to type 3 fonts. 2018-01-07 15:38:43 +00:00
Eliot Jones
70361973b3 add support for reading encoding differences from font dictionary. add type 3 font support. 2018-01-07 14:49:17 +00:00
Eliot Jones
18eeb896e0 more documentation and remove unused code 2018-01-07 12:41:55 +00:00
Eliot Jones
bb93484909 encapsulation for internal classes, remove old code, document public api 2018-01-07 12:37:48 +00:00
Eliot Jones
ad1ef0e167 make package nuget publishable 2018-01-07 12:09:23 +00:00
Eliot Jones
a6c3dba25a bug fix for indirect page link, bug fix for array in base font range in cmap 2018-01-07 11:51:18 +00:00
Eliot Jones
c75b9d10bd add test for unusual latin characters and different document producer. 2018-01-06 21:06:26 +00:00
Eliot Jones
d1aa390f01 support reading text from type 1 fonts which use standard 14 fonts 2018-01-06 20:51:20 +00:00
Eliot Jones
96d787e498 finish adobe font metrics parser 2018-01-06 19:48:07 +00:00
Eliot Jones
02845e8ebb handle case where contents is an array of objects 2018-01-06 18:25:47 +00:00
Eliot Jones
eb66611e55 checkpoint check in for adobe font metrics parsing 2018-01-06 14:11:14 +00:00
Eliot Jones
03f31a84e5 tests for end of line tokenizer and branch coverage for string tokenizer 2018-01-06 12:08:52 +00:00
Eliot Jones
bbcb5af2be fix bugs revealed by mortality metadata document. get references using the direct object finder. fix a bug with string tokenizer 2018-01-05 23:08:20 +00:00
Eliot Jones
2e7f9b8d76 add a more complicated type 2 font pdf and tests 2018-01-04 21:25:49 +00:00
Eliot Jones
6b4bd8689f start migrating cross reference parsing process to token scanner 2018-01-04 21:09:47 +00:00
Eliot Jones
1c41618950 cover missing line in file trailer parsing 2018-01-03 22:51:44 +00:00
Eliot Jones
1aacb14285 add test for multiple page pdf from libre office 2018-01-03 22:46:26 +00:00
Eliot Jones
21be34a938 substitute the token scanner into the file trailer parsing and test 2018-01-03 22:29:09 +00:00
Eliot Jones
f09ef85e5a make tokenizer classes internal and change the file header to use a scanner rather than the pdfbox type reader 2018-01-03 20:15:25 +00:00
Eliot Jones
72ffa1f308 make byte array input bytes behave correctly when seeking rather than forcing the consumer to work around it. 2018-01-03 19:17:12 +00:00
Eliot Jones
bfdca3079f change the itext document test to reflect its text being form content. fix readme typo 2018-01-03 19:13:12 +00:00
Eliot Jones
0ef33f5215 move catalog parsing to its own factory. parse document information if present and expose publically. add test for itext generated document 2018-01-02 23:26:58 +00:00
Eliot Jones
8b8f2941a5 undo debugging code 2018-01-02 22:23:50 +00:00
Eliot Jones
5ab8d69ea5 fix bug with computing text positions 2018-01-02 22:23:08 +00:00
Eliot Jones
d03c04cca1 test coverage for parsing glyph lists and fix bug with octal conversion 2018-01-01 18:39:18 +00:00
Eliot Jones
d7b9a9d559 implement glyph list for mapping from character code to name to unicode 2018-01-01 17:23:32 +00:00
Eliot Jones
c34bdac92a add test for non latin characters and use normal ints rather than octal in the encoding classes. 2018-01-01 13:49:24 +00:00
Eliot Jones
874f713566 load resources dictionary from pages as well as page node and throw informative error when the font is not found. 2018-01-01 10:49:05 +00:00