Commit Graph

15 Commits

Author SHA1 Message Date
Eliot Jones
9028f932b2 #83 decrypt dictionary hex values 2019-11-25 12:42:32 +00:00
Eliot Jones
52d2a90dfc finish revision 5 and 6 owner password handling #34
moves owner password check first. correctly calculates encryption key for owner password for revision 5 and 6.
2019-06-25 19:44:26 +01:00
Eliot Jones
76f8222f74 start adding support for undocumented revision 6 encryption
revision 6 was added in the pdf 2.0 specification, iso 32000-2:2017. because iso are rent-seeking they charge money to view this specification, so it is effectively undocumented. this site details some of the algorithm: https://web.archive.org/web/20180311160224/esec-lab.sogeti.com/posts/2011/09/14/the-undocumented-password-validation-algorithm-of-adobe-reader-x.html. the code in this change ports the pdfbox logic line by line. it doesn't implement the correct behaviour for the owner password yet.
2019-06-24 20:37:25 +01:00
Eliot Jones
f86c2545bd treat encryption entries as optional for revisions 5+ #34
the revision 5 and 6 encryption algorithms specify the presence of additional encryption material named 'oe' and 'ue'. it turns out these are not always required, so they now default to null if not present. this also adds support for those values being in hex rather than normal string format.

tidies up some commenting on the xynode class, moves public methods below constructors and adds xy to the resharper list of abbreviations for the solution.
2019-06-23 13:52:12 +01:00
Eliot Jones
ff9e2ad83f handle hex registry and ordering. decrypt hex tokens #34
cid fonts may contain a registry, ordering and supplement to identify the font. we were checking for string registry and ordering tokens but failing on hex tokens.

for encrypted documents we now decrypt hex data.
2019-06-23 13:27:32 +01:00
Eliot Jones
0f103554fb handle non-standard crypt dictionary type and use hex bytes for password #34
using an online tool to encrypt a simple document with aes-128 seems to add the dictionary type cryptalgorithm rather than cryptfilter. i couldn't find any references to cryptalgorithm in the spec or pdfbox but it seems to work ok when treated as equivalent to cryptfilter.

there are situations where the string derived from a hex token has a different length to the underlying bytes, for example if the hex token contains the '\0' byte. in that case the encryption algorithm needs to use the raw bytes rather than the 'stringified' bytes. this change passes raw bytes for hex tokens for both the user and owner password keys.
2019-06-23 13:12:47 +01:00
Eliot Jones
4c716fcbd6 finish support for revision 5 encryption using aes 256 #34 2019-06-13 19:46:08 +01:00
Eliot Jones
d0a3cd398f start adding support for revision 5 aes-256 encrypted documents #34 2019-06-09 13:27:03 +01:00
Eliot Jones
21a4ba597e add support for aes-128 decryption #34 2019-06-08 15:23:21 +01:00
Eliot Jones
a19122478d begin adding support for in-document security handlers to support aes 128/256 encryption #34 2019-06-08 14:14:51 +01:00
Eliot Jones
23c033c788 implement validation of owner password and throw more descriptive exception for encrypted documents 2019-05-09 19:02:39 +01:00
Eliot Jones
53811a7d97 verify password against user password or throw 2019-05-07 18:53:51 +01:00
Eliot Jones
bad57763a1 finish initial support for rc4 encryption with blank user password 2019-05-06 15:41:29 +01:00
Eliot Jones
27928cd3a3 add rc4 tokenizer and key generation per object 2019-05-05 15:34:48 +01:00
Eliot Jones
be394f5bba start adding support for reading encrypted documents 2019-05-04 15:36:13 +01:00
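
note on 0f103554fb: the raw-bytes problem described in that commit can be illustrated with a minimal python sketch. this is not the project's code. `hex_token_to_bytes` and `stringify` are hypothetical helpers, and the lossy behaviour of `stringify` (dropping '\0' bytes) is an assumption chosen only to reproduce the kind of length mismatch the commit message describes.

```python
def hex_token_to_bytes(hex_text: str) -> bytes:
    """Decode the content of a PDF hex string token (without the < > delimiters)."""
    # Whitespace inside a hex string is ignored.
    digits = "".join(hex_text.split())
    # If the final digit is missing, it is assumed to be 0.
    if len(digits) % 2 == 1:
        digits += "0"
    return bytes.fromhex(digits)


def stringify(raw: bytes) -> str:
    """Hypothetical lossy conversion (assumption): drop NUL bytes, decode the rest as Latin-1."""
    return raw.replace(b"\x00", b"").decode("latin-1")


raw = hex_token_to_bytes("28BF4E5E 004E75 8A41")
round_tripped = stringify(raw).encode("latin-1")

# The stringified form has lost a byte, so key material derived from it would differ.
print(len(raw), len(round_tripped))
print(raw == round_tripped)
```

under this assumption the two lengths differ, which is why a key derived from the stringified value would not match one derived from the token's raw bytes.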