Commit Graph

32 Commits

Author SHA1 Message Date
Eliot Jones
6b9c3be9f8 tidy up some small formatting issues 2023-05-21 12:20:57 +01:00
mvantzet
0e39bc0b76
Annotations named destinations (#579)
* Add Named Destinations to Catalog so that bookmarks and links can access
them.

The named destinations require access to page nodes, so created Pages object
that is made using PagesFactory (which contains the page-related code from
Catalog).

* Further implementation of destinations:
- Implement NamedDestinations in AnnotationProvider, so that we can look
  up named destinations for annotations and turn them into explicit destinations.
  Reused existing code inside BookmarksProvider to get destinations/actions.
- Added GoToE action
- According to the PDF reference, destinations are also required for
  external destinations and hence for ExternalBookmarkNode. This allows us
  to push up DocumentBookmarkNode.Destination to BookmarkNode.

* Implemented stateful appearance streams and integration test

* Added AppearanceStream to public API because it is used in the (public)
Annotation constructor

* After #552, must push down ExplicitDestination to DocumentBookmarkNode since it
does not apply to UriBookmarkNode.

* Added actions, which fits the PDF model better and works well with the
new bookmarks code (after PR #552)

* Rename Action to PdfAction + removed unused using in ActionProvider.cs

---------

Co-authored-by: mvantzet <mark@radialsg.com>
2023-04-10 17:14:14 +01:00
Eliot Jones
e2246a88bb #482 add skip missing fonts option and pass parsing options to content stream processor
this doesn't fix the reported issue since the pdf itself is corrupted on page 8; however, it will
allow recovery in some scenarios where text content isn't important.

also adds a more informative error when a stream is unintentionally passed with a non-zero offset
2022-10-09 13:44:05 -04:00
Eliot Jones
9ae0a5ec15 allow stream filters to contain indirect references to name tokens 2021-04-25 16:22:22 -04:00
BobLd
a759a99389 Move ClipPaths option from GetPage() to ParsingOptions 2020-04-05 17:58:57 +01:00
BobLd
51165dc11a Implement EndPath
Make path clipping optional
2020-04-05 17:58:57 +01:00
Eliot Jones
c864fa512c remove islenientparsing from page classes 2020-02-28 11:50:18 +00:00
Eliot Jones
7c0ef111ea move classes to new projects
to make the project more useful and expose more usable classes we're rearchitecting in the following way. code used to read fonts from external file formats like truetype, adobe font metrics (afm) and adobe type 1 fonts is moving to a new project which doesn't reference most of the pdf logic. the shared logic is moving to a new flat-structured project called core. this is a sort-of onion type architecture, with core being the... core, fonts being the next layer of the onion, and pdfpig itself the next. this will then support additional libraries/projects as outer layers of the onion as well as releasing a standalone version of the font library as pdfbox does with fontbox.
2020-01-04 16:38:18 +00:00
Eliot Jones
e984180b3d add method to retrieve any embedded files 2019-12-21 16:16:36 +00:00
Eliot Jones
ecf0b8743b make bookmarknode immutable and use scanner when retrieving bookmarks 2019-12-05 12:03:30 +00:00
Eliot Jones
928347bcce
merge pull request #84 from BobLd/master
add basic bookmarks extraction capabilities.
2019-12-04 14:24:10 +00:00
Eliot Jones
80f024dbed make form access public 2019-11-27 16:36:25 +00:00
BobLd
a8559c1167 Add basic bookmarks extraction capabilities. 2019-11-04 15:11:54 +00:00
Eliot Jones
2ef45f71d5 make missing acroform types public and start improving data
also changes pages to use a proper tree structure since this will be required for resource inheritance and for acroform widget dictionaries.
2019-10-09 14:28:37 +01:00
Eliot Jones
68bcaf3901 #55 move support for images to page and add inline images
support both xobject and inline images. adds unsupported filters so that exceptions are only thrown when accessing lazily evaluated image.bytes property rather than when opening the page.

treat all warnings as errors.
2019-10-08 14:04:36 +01:00
Eliot Jones
0cd7795bff add method to get all pages from document 2019-08-23 19:09:33 +01:00
Eliot Jones
0349bedd3e #57 add access to document metadata and expose wrapper type 2019-08-11 12:42:30 +01:00
Eliot Jones
23c033c788 implement validation of owner password and throw more descriptive exception for encrypted documents 2019-05-09 19:02:39 +01:00
Eliot Jones
be394f5bba start adding support for reading encrypted documents 2019-05-04 15:36:13 +01:00
Eliot Jones
20e843f5ae #24 start adding classes for the acroform api 2019-01-01 17:44:46 +00:00
Eliot Jones
997979cc92 #11 early access to the raw xobjects for images.
temporary 'safe' untested implementation of seac for type 1 charstrings.
make structure public
bump version of package and project to 0.0.3 (it had accidentally increased to 0.0.5)
2018-11-26 19:46:41 +00:00
Eliot Jones
2fa781b8e9 #10 make all token classes public and expose via a public structure member on pdf document 2018-11-24 19:02:06 +00:00
Eliot Jones
31ea1bbd33 #16 #19 tidy up letter api and check/fix type 1 cff positions 2018-11-24 14:38:44 +00:00
modest-as
6a9e466c8b Moved some internal properties to fields 2018-03-30 22:02:05 +01:00
Eliot Jones
c64e54d6c0 support opening from stream and improve performance of brute force searching since the seek operation is now slower. 2018-01-21 19:34:21 +00:00
Eliot Jones
07161cef28 unify raw byte access method 2018-01-21 18:08:00 +00:00
Eliot Jones
7d90f4858a continue migrating code to tokenizer 2018-01-20 18:42:29 +00:00
Eliot Jones
c5e3ce7ec7 finish moving all parsing to token scanner 2018-01-20 00:49:53 +00:00
Eliot Jones
615ee88a46 start passing the pdf scanner in to read the type 1 files 2018-01-14 15:33:22 +00:00
Eliot Jones
ba8d2f5b1d fix a bug with tokenization without spaces before string 2018-01-10 22:15:29 +00:00
Eliot Jones
be98b1f8e4 add missing doc comments 2018-01-10 20:31:38 +00:00
Eliot Jones
ec62542b64 change the project name to something silly 2018-01-10 19:49:32 +00:00