PdfPig

lsm/PdfPig

mirror of https://github.com/UglyToad/PdfPig.git synced 2025-09-20 20:07:57 +08:00

Author	SHA1	Message	Date
Eliot Jones	4150881be9	recover from invalid acro-form references. we add a try/catch to the direct object finder's tryget method so it returns false rather than throwing. if we have an acro-form reference in the catalog but no corresponding object in the document we instead scan all objects in the document to find form fields and reconstruct the acro-form dictionary.	2020-02-27 12:08:40 +00:00
Eliot Jones	693a3d5958	use offset to file header to correct cross references. if the %pdf version header comment is offset from the start of the file the cross reference offsets will also be wrong by that amount. this change updates the cross reference location logic to use the offset from the located version header.	2020-01-26 15:30:20 +00:00
Eliot Jones	63b118b141	handle type1 fonts disguised as truetype. if the font descriptor uses the fromsubtype flag the actual type of the font can differ from that specified in the font dictionary. in this case a truetype font actually contains a type1c, compact font format, font. in this case we fall back to using the type1 parser. also handles a closesubpath command appearing without any path construction operators.	2020-01-07 16:49:21 +00:00
Eliot Jones	0b048fde57	handle eof further back in file. an %%eof for a pdf file may appear further back than the last 1024 bytes. this change doubles the search range. it also handles an empty differences array being defined for a font encoding. we also remove the old approach to dependency injection from the code since we are now favouring static classes where possible.	2020-01-07 11:48:09 +00:00
Eliot Jones	b29354e3e6	move compact font format fonts to fonts project	2020-01-05 12:08:01 +00:00
Eliot Jones	74774995d6	complete move of truetype, afm and standard14 fonts. the 3 font types mentioned are moved to the new fonts project, any referenced types are moved to the core project. most truetype classes are made public #8.	2020-01-04 22:39:13 +00:00
Eliot Jones	7c0ef111ea	move classes to new projects. to make the project more useful and expose more usable classes we're rearchitecting in the following way. code used to read fonts from external file formats like truetype, adobe font metrics (afm) and adobe type 1 fonts is moving to a new project which doesn't reference most of the pdf logic. the shared logic is moving to a new flat-structured project called core. this is a sort-of onion type architecture, with core being the... core, fonts being the next layer of the onion, pdfpig itself the next. this will then support additional libraries/projects as outer layers of the onion as well as releasing a standalone version of the font library as pdfbox does with fontbox.	2020-01-04 16:38:18 +00:00
Eliot Jones	4d697e3669	allow the user to supply multiple passwords for decryption. previously the only way to test if a password was correct was to supply a single password and throw if the value was incorrect. this was slow. now parsing options supports a list of passwords as well as a single password option (which is equivalent to a list with a single item). these passwords are all tested at the same time and an exception is only thrown once all passwords are tested.	2019-12-20 15:11:05 +00:00
Eliot Jones	c30cd1b96d	use cid font subroutines where applicable. add ucs 2 cmap support for type 1 fonts * cid cff fonts have multiple sub-fonts and multiple private dictionaries, in addition to a top level font and private dictionary. this fix uses the specific sub-dictionary when getting local subroutines on a per-glyph basis. * chinese, japanese or korean fonts can use a ucs-2 encoding cmap for retrieving unicode values. * add support for the additional glyph list for unicode values in true type fonts. adds nonmarkingreturn mapping to carriage return. * makes font parsing classes static where there's no reason for them to be per-instance.	2019-12-19 13:33:44 +00:00
Eliot Jones	ecf0b8743b	make bookmarknode immutable and use scanner when retrieving bookmarks	2019-12-05 12:03:30 +00:00
Eliot Jones	2ef45f71d5	make missing acroform types public and start improving data. also changes pages to use a proper tree structure since this will be required for resource inheritance and for acroform widget dictionaries.	2019-10-09 14:28:37 +01:00
Eliot Jones	68bcaf3901	#55 move support for images to page and add inline images. support both xobject and inline images. adds unsupported filters so that exceptions are only thrown when accessing lazily evaluated image.bytes property rather than when opening the page. treat all warnings as errors.	2019-10-08 14:04:36 +01:00
Eliot Jones	d98b8b43c1	small performance tweaks and remove package license expression. package license url is deprecated in favour of package license expression but nuget doesn't seem to support expressions properly for published packages yet so we'll keep the deprecated url for the time being. having both url and expression causes the build to fail. small obvious performance improvements for file header passing and getting the encoding information using the existing reverse name to code map.	2019-08-18 13:47:01 +01:00
Eliot Jones	0349bedd3e	#57 add access to document metadata and expose wrapper type	2019-08-11 12:42:30 +01:00
Eliot Jones	23c033c788	implement validation of owner password and throw more descriptive exception for encrypted documents	2019-05-09 19:02:39 +01:00
Eliot Jones	bad57763a1	finish initial support for rc4 encryption with blank user password	2019-05-06 15:41:29 +01:00
Eliot Jones	be394f5bba	start adding support for reading encrypted documents	2019-05-04 15:36:13 +01:00
Eliot Jones	245efae8ed	fixes various font handling issues for type 1 and truetype fonts. handle "unionsq" and other tricky glyph names. log missing glyphs. ignore flexpoints in type 1 subroutines. improve system font performance and substitution. handle truetype fonts using standard 14 fonts.	2019-01-12 13:54:16 +00:00
Eliot Jones	20e843f5ae	#24 start adding classes for the acroform api	2019-01-01 17:44:46 +00:00
Eliot Jones	47e49c4044	#9 fix bug with truetype fonts and start adding support for cid fonts using compact font format	2018-12-28 22:34:47 +00:00
Eliot Jones	ed3792c950	#20 support retrieval of named system fonts for truetype on windows	2018-12-22 18:28:49 +00:00
Eliot Jones	997979cc92	#11 early access to the raw xobjects for images. temporary 'safe' untested implementation of seac for type 1 charstrings. make structure public bump version of package and project to 0.0.3 (it had accidentally increased to 0.0.5)	2018-11-26 19:46:41 +00:00
Eliot Jones	2fa781b8e9	#10 make all token classes public and expose via a public structure member on pdf document	2018-11-24 19:02:06 +00:00
Eliot Jones	0f68dfeb19	#10 move tokens to the root namespace for discoverability. upgrade xunit versions. there is a bug with test discovery for stringtokenizertests	2018-11-16 20:00:12 +00:00
Eliot Jones	b51ebfd70c	fix a bug where the mediabox being defined on the pages node would throw. a lot more work on parsing compact font format	2018-04-29 14:42:54 +01:00
Eliot Jones	85d1f84965	more compact font format parsing	2018-04-28 19:33:50 +01:00
Eliot Jones	1deefdc987	begin implementing support for compact font format fonts in type 1 font handling	2018-04-28 13:00:43 +01:00
Eliot Jones	1fe54c5f49	add xobjects to pages, fix parsing truetype fonts where the glyphs use the repeat flag.	2018-04-26 22:22:29 +01:00
Eliot Jones	7af2b1bcb9	start adding code and tests for reading metrics of type 1 fonts	2018-04-11 22:51:31 +01:00
Eliot Jones	c64e54d6c0	support opening from stream and improve performance of brute force searching since the seek operation is now slower.	2018-01-21 19:34:21 +00:00
Eliot Jones	07161cef28	unify raw byte access method	2018-01-21 18:08:00 +00:00
Eliot Jones	3172596b7c	remove all old cos objects	2018-01-21 14:56:50 +00:00
Eliot Jones	e24a306c31	remove all old parsing logic	2018-01-21 14:48:49 +00:00
Eliot Jones	7d90f4858a	continue migrating code to tokenizer	2018-01-20 18:42:29 +00:00
Eliot Jones	c5e3ce7ec7	finish moving all parsing to token scanner	2018-01-20 00:49:53 +00:00
Eliot Jones	a0deab446b	switch classes still using the cos object approach to the tokenization approach initially used for parsing cmap files.	2018-01-19 00:35:04 +00:00
Eliot Jones	4443cde229	add very hacky parsing for type 1 font files in order to read the encoding	2018-01-14 18:59:03 +00:00
Eliot Jones	615ee88a46	start passing the pdf scanner in to read the type 1 files	2018-01-14 15:33:22 +00:00
Eliot Jones	36c0eedd7c	move the usages of cos object key to indirect reference	2018-01-14 14:48:54 +00:00
Eliot Jones	ec62542b64	change the project name to something silly	2018-01-10 19:49:32 +00:00

40 Commits
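
The cross-reference correction described in commit 693a3d5958 can be illustrated with a small sketch. This is not PdfPig's code (PdfPig is a C# library); it is a minimal Python illustration of the idea only. The function names and the 2048-byte tail window are assumptions made for this example, and it assumes the file uses a plain `startxref` keyword followed by a decimal offset.

```python
def find_header_offset(data: bytes) -> int:
    """Return the byte offset of the '%PDF-' version header, or -1 if absent."""
    return data.find(b"%PDF-")


def read_startxref(data: bytes, search_range: int = 2048) -> int:
    """Read the offset recorded after the last 'startxref' keyword.

    Only the final `search_range` bytes are searched (an assumed window).
    """
    tail_start = max(0, len(data) - search_range)
    idx = data.rfind(b"startxref", tail_start)
    if idx < 0:
        raise ValueError("startxref keyword not found")
    tokens = data[idx + len(b"startxref"):].split()
    if not tokens:
        raise ValueError("no offset after startxref")
    return int(tokens[0].decode("ascii"))


def corrected_xref_offset(data: bytes) -> int:
    """Shift the recorded offset by the position of the version header.

    Offsets in the file are relative to the header, so if junk bytes
    precede '%PDF-' every recorded offset is too small by that amount.
    """
    header = find_header_offset(data)
    if header < 0:
        raise ValueError("version header not found")
    return read_startxref(data) + header
```

For a file with no leading junk the header offset is 0 and the recorded value is returned unchanged, so the correction is harmless in the common case.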