Commit Graph

812 Commits

Author SHA1 Message Date
InusualZ
44ad5c8b0c PdfStreamWriter: Error Checking and Code Formatting 2020-03-15 16:05:23 +01:00
InusualZ
c533d47386 New class PdfStreamWriter
This class allows us to flush resources lazily, so we can still make changes to them (compress, merge, deduplicate, etc.) while new content is pushed.
2020-03-15 16:05:23 +01:00
InusualZ
be7716eeea Make IToken implement IEquatable<IToken>
This allows us to deduplicate tokens by comparing their content.
2020-03-15 16:05:23 +01:00
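As a rough illustration of deduplicating by content rather than by identity, here is a minimal Python sketch. The `NameToken` type and the `deduplicate` helper are hypothetical and are not taken from the project; the commit itself only states that `IToken` now implements `IEquatable<IToken>`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NameToken:
    """Hypothetical token type; a frozen dataclass compares and hashes by content."""
    value: str


def deduplicate(tokens):
    """Return the tokens with content-equal duplicates removed, keeping first-seen order."""
    seen = {}
    for token in tokens:
        # Equal content means an equal dictionary key, so later duplicates are skipped.
        seen.setdefault(token, token)
    return list(seen)
```

Two separately created tokens with the same content collapse into one entry because equality is defined on the content, which is the property the commit adds.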
BobLd
9366aa2b37 Tidy up code 2020-03-15 15:39:19 +01:00
BobLd
5b8a2f2e38 handle k-nearest neighbours search
update DocstrumBB with kd-tree
2020-03-15 15:39:19 +01:00
BobLd
8cafda3577 handle nearest neighbour not found 2020-03-15 15:39:19 +01:00
Eliot Jones
aa9df30722 handle invalid charstring sequences
it is possible for a file with an adobe type 1 font to contain an invalid charstring sequence. if this happens we handle it and return false from trygenerate.
2020-03-08 14:33:26 +00:00
Eliot Jones
8df2f9cf6b generate all xml docs and pack them #148
after we split the solution into multiple projects the xml doc comments were no longer packed in the generated nuget package. in addition they were only generated for the net standard 2.0 target framework.

this change generates comments for all target frameworks and makes sure they're included in the generated package. it also adds missing doc comments where they weren't included on the public api and clears up a couple of minor formatting issues in the affected files.
2020-03-08 13:44:09 +00:00
Eliot Jones
24c5cbea4b support custom page sizes for document builder #147
the 'custom' page size is not supported by the document builder, so a new overload which supports user-defined page sizes is provided.
2020-03-07 16:48:19 +00:00
InusualZ
ab3779e644 Fix an issue where the root/Pages Count was incorrect
`/Pages` Count should reflect the number of leaf nodes (page objects) that are descendants of this node.
2020-03-07 16:47:35 +00:00
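The rule in the commit above can be sketched as a recursive leaf count. This is only an illustration using plain dictionaries with hypothetical `Type` and `Kids` keys, not the project's own data structures.

```python
def count_leaf_pages(node):
    """Count the page objects (leaves) that descend from a page tree node."""
    if node["Type"] == "Page":
        # A leaf page contributes exactly one to every ancestor's Count.
        return 1
    # An intermediate /Pages node contributes the sum of its children's leaf counts.
    return sum(count_leaf_pages(kid) for kid in node.get("Kids", []))
```

With this definition an intermediate node holding two pages counts as 2 towards the root, not as 1 child.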
BobLd
439dad9f35 fix depth for leaves 2020-03-06 16:53:18 +00:00
BobLd
dbdef7f195 kd-tree 2020-03-06 16:53:18 +00:00
BobLd
c4309ef31e fix kd-tree 2020-03-06 16:53:18 +00:00
BobLd
1e1a33d46e fix typo in kd-tree
replace Count() by Count or Length
2020-03-06 16:53:18 +00:00
BobLd
e477bc8d6d Tidy up and optimise kd-tree 2020-03-06 16:53:18 +00:00
Eliot Jones
bef68a0654 avoid infinite loop in brute-force searcher #88
fixes the case where the brute-force searcher becomes stuck in an infinite loop. it may be the case that the problem pdf from #88 has a newline or some other whitespace between its object and generation number, so this may cause a failure elsewhere.
2020-03-03 15:49:17 +00:00
Eliot Jones
58972de7cb begin to rework cross-reference parsing
most of the cross-reference code is the earliest code in the project and hasn't been revisited since then. the issue #88 has been reopened due to a bug with brute-force searching so this tidies up the code in this area ahead of trying to fix the bug.
2020-03-03 15:21:11 +00:00
Eliot Jones
4b5c8d510e add test for comment in dictionary from #145
check that we correctly handle the case where a comment appears inside a dictionary. this was handled by commit 3084a9. use a list internally in the dictionary tokenizer to avoid interface performance penalties.
2020-03-03 11:36:01 +00:00
Eliot Jones
14599521f5 test the simple document merge in reverse order
check we can create the document in both merge orders.
2020-03-02 17:00:16 +00:00
Eliot Jones
2effedd3c5 add inheritable keys back into the copied pages node
keys such as resources, mediabox, cropbox, etc. can be inherited. we now copy them if they are present on the parent pages node.
2020-03-02 17:00:16 +00:00
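A minimal sketch of the copying step described in the commit above, using plain dictionaries. The key list contains only the three keys the message names, and the function name is hypothetical; the project's actual list and implementation may differ.

```python
# Assumption: only the keys named in the commit message are listed here.
INHERITABLE_KEYS = ("Resources", "MediaBox", "CropBox")


def copy_inheritable_keys(parent_pages, copied_pages):
    """Copy inheritable keys that are present on the parent pages node."""
    for key in INHERITABLE_KEYS:
        if key in parent_pages:
            copied_pages[key] = parent_pages[key]
    return copied_pages
```

Keys that are absent from the parent are simply skipped, so the copied node gains nothing it did not inherit.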
Eliot Jones
c596bef024 rename to resharper conventions and test 2nd page
renames fields to match the expected conventions for resharper. removes fully qualified names for using statements since resharper marks these as not-required.

adds a check to the pdf merger test to make sure the second page has the expected content. this is not currently valid since we lose the resources node on the pages tree.
2020-03-02 17:00:16 +00:00
Eliot Jones
2193063809 fix tests for public api and merge conflict
the behaviour of the cross reference parser tests had changed; this fixes a compilation error from merge conflicts. also updates the merger tests to account for new version behaviour and checks the output document text. adds pdfmerger to the public api in the tests.
2020-03-02 17:00:16 +00:00
Inusual
b560c73fa9 Set the version as the highest version found in the merged document
And reorganize the code a little bit
2020-03-02 17:00:16 +00:00
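Picking the highest version can be sketched as follows, assuming versions are given as strings such as "1.4" or "1.7". The function name and the string representation are assumptions made for this illustration only.

```python
def highest_version(versions):
    """Return the highest version string, comparing the numeric parts in order."""
    # Comparing tuples of integers avoids string-ordering mistakes such as "1.10" < "1.9".
    return max(versions, key=lambda v: tuple(int(part) for part in v.split(".")))
```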
Inusual
ca250a8c6f IsLenientParsing as false 2020-03-02 17:00:16 +00:00
Inusual
439a186ed7 Remove dead code 2020-03-02 17:00:16 +00:00
Inusual
932857cf8c Revert "Don't forget to copy /Pages properties too"
This reverts commit 224d9dc6e52245f9d16a22af460f386545403cd1.
2020-03-02 17:00:16 +00:00
Inusual
20ff625c2e Don't forget to copy /Pages properties too 2020-03-02 17:00:16 +00:00
Inusual
ec67ef76cd Remove ObjectsTree class since it's obsolete 2020-03-02 17:00:16 +00:00
Inusual
4ffc1620a1 Don't create unnecessary objects 2020-03-02 17:00:16 +00:00
Inusual
761e3d291e Rename DocumentBuilder to DocumentMerger
Remove IDisposable trait
2020-03-02 17:00:16 +00:00
Inusual
6e0caee317 Add StreamToken as exception to CopyToken
Add summary to better explain CopyToken purpose
2020-03-02 17:00:16 +00:00
Inusual
669742b6bd Fix for a page object having itself as its parent
I decided to move the part that creates the page node into its own method. This let me see more clearly where I was supposed to get the correct reference from.
2020-03-02 17:00:16 +00:00
Inusual
f485826751 PdfMerger: basic functionality implemented
It has a lot of unknowns and TODOs, please look at them
2020-03-02 17:00:16 +00:00
Inusual
013cbd14e0 Make CrossReferenceTableParser a static class 2020-03-02 17:00:16 +00:00
Eliot Jones
b7a86f482f cache bounding boxes for composite fonts
cached the bounding box for a specific character code value for type 0 (composite) fonts to improve performance.
2020-02-28 16:36:06 +00:00
Eliot Jones
4442a69a97 use tryget rather than lambdas for union type
avoid the allocations caused by lambda expressions for performance reasons.
2020-02-28 16:02:20 +00:00
Eliot Jones
4d911fb9d1 use transform x for widths to improve performance
when transforming the advance width inside a font, we transform only the x coordinate rather than making a new point to transform.
2020-02-28 15:15:35 +00:00
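The idea in the commit above can be illustrated with a six-value affine matrix. This sketch assumes the advance width lies along the x axis (y = 0); the `Matrix` type and function names are hypothetical, not the project's API.

```python
from typing import NamedTuple


class Matrix(NamedTuple):
    """Hypothetical 2D affine matrix in the usual [a b c d e f] layout."""
    a: float
    b: float
    c: float
    d: float
    e: float
    f: float


def transform_point(m, x, y):
    """Transform a full point; this builds a new (x, y) tuple."""
    return (m.a * x + m.c * y + m.e, m.b * x + m.d * y + m.f)


def transform_x(m, x):
    """Transform only the x coordinate of a point whose y is 0."""
    return m.a * x + m.e
```

For a point on the x axis both routes give the same x value, but `transform_x` skips the y computation and the intermediate point.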
BobLd
5ae38f1bad Improve default max function 2020-02-28 14:43:18 +00:00
Eliot Jones
c864fa512c remove islenientparsing from page classes 2020-02-28 11:50:18 +00:00
Eliot Jones
48d166276d remove islenientparsing from contentstreamprocessor 2020-02-28 11:44:13 +00:00
Eliot Jones
6fdaf054cb remove islenientparsing from annotation provider 2020-02-28 11:39:56 +00:00
Eliot Jones
7b09999a3f remove islenientparsing from the font handlers
we're removing islenientparsing to make the code simpler to maintain and use as well as more resilient.
2020-02-28 11:37:18 +00:00
Eliot Jones
746cbfa30c remove lenient parsing from font related classes
lenient parsing gives us more code to maintain for no real benefit; parsing should always be as lenient as possible. remove the flag from some of the font code.
2020-02-27 18:10:02 +00:00
Eliot Jones
ec9e425712 use length from stream dictionary if directly available
when brute forcing we use the length available in the stream's dictionary token if it is a direct number rather than an indirect reference.
2020-02-27 17:17:49 +00:00
Eliot Jones
420daaac6e handle unbalanced parentheses for string tokenization
when the close parenthesis is unbalanced and precedes a line break followed by '/' or '>' we assume the bracket to be unbalanced and finish reading the string.
2020-02-27 17:01:15 +00:00
Eliot Jones
f7cabe5d12 ignore invalid postscript format type truetype
when reading the format type of a postscript table in a truetype font ignore an invalid format value.
2020-02-27 16:10:19 +00:00
Eliot Jones
c033a0b3fe handle missing end bytes for cmap code
when the byte input does not contain a code of the full code length we return early.
2020-02-27 15:26:48 +00:00
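A minimal sketch of the early return described in the commit above. The function name, the `None` return value, and the big-endian byte order are assumptions made for illustration and do not describe the project's actual reader.

```python
def try_read_code(data: bytes, offset: int, code_length: int):
    """Read a code of code_length bytes, or return None if the input ends too soon."""
    end = offset + code_length
    if end > len(data):
        # Not enough bytes remain for a full code, so return early instead of failing.
        return None
    return int.from_bytes(data[offset:end], "big")
```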
Eliot Jones
25b6f1d46d change set version script to use correct path
rather than a directory relative to the working directory, use the path relative to the tools folder.
2020-02-27 13:29:15 +00:00
Eliot Jones
f2d94413dc create script to increment project versions
each time we want to up the version number of the nuget package it involves opening every csproj and manually updating the version. this script updates the version for all projects, except the test project, in the 'src' folder.
2020-02-27 13:24:26 +00:00
Eliot Jones
0fcc4e54c8 add istestproject setting to all projects
indicates which projects are test projects to the test runner.
2020-02-27 12:35:40 +00:00