Commit Graph

970 Commits

Author SHA1 Message Date
Eliot Jones
746cbfa30c remove lenient parsing from font related classes
the lenient parsing flag gives us more code to maintain for no real benefit; parsing should always be as lenient as possible. this removes the flag from some of the font code.
2020-02-27 18:10:02 +00:00
Eliot Jones
ec9e425712 use length from stream dictionary if directly available
when brute forcing we use the length available in the stream's dictionary token if it is a direct number rather than an indirect reference.
2020-02-27 17:17:49 +00:00
Eliot Jones
420daaac6e handle unbalanced parentheses for string tokenization
when a close parenthesis that would not otherwise end the string precedes a line break followed by '/' or '>' we assume the parentheses are unbalanced and finish reading the string.
2020-02-27 17:01:15 +00:00
Eliot Jones
f7cabe5d12 ignore invalid postscript format type truetype
when reading the format type of a postscript table in a truetype font ignore an invalid format value.
2020-02-27 16:10:19 +00:00
Eliot Jones
c033a0b3fe handle missing end bytes for cmap code
when the byte input does not contain a code of the full code length we return early.
2020-02-27 15:26:48 +00:00
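A minimal sketch of the early return described in c033a0b3fe, assuming fixed-length big-endian codes. The function name `read_codes` and its signature are hypothetical and are not taken from the PdfPig source.

```python
def read_codes(data: bytes, code_length: int) -> list:
    """Split data into big-endian codes of code_length bytes each."""
    codes = []
    for start in range(0, len(data), code_length):
        chunk = data[start:start + code_length]
        if len(chunk) < code_length:
            # The input ended part-way through a code: return early
            # with the complete codes read so far instead of throwing.
            return codes
        codes.append(int.from_bytes(chunk, "big"))
    return codes
```

With a code length of 2, a trailing single byte is dropped rather than treated as an error.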
Eliot Jones
25b6f1d46d change set version script to use correct path
rather than directory relative to the working directory use the path relative to the tools folder.
2020-02-27 13:29:15 +00:00
Eliot Jones
f2d94413dc create script to increment project versions
each time we want to up the version number of the nuget package it involves opening every csproj and manually updating the version. this script updates the version for all projects, except the test project, in the 'src' folder.
2020-02-27 13:24:26 +00:00
Eliot Jones
0fcc4e54c8 add istestproject setting to all projects
indicates which projects are test projects to the test runner.
2020-02-27 12:35:40 +00:00
Eliot Jones
4150881be9 recover from invalid acro-form references
we add a try/catch to the direct object finder's tryget method so it returns false rather than throwing.

if we have an acro-form reference in the catalog but no corresponding object in the document we instead scan all objects in the document to find form fields and reconstruct the acro-form dictionary.
2020-02-27 12:08:40 +00:00
Eliot Jones
f415c3116e cross reference offset is in the xref table we ignore the error
previously we checked the offset was not inside the table (correct thing to check), however this is only a special case of the more general issue (cross reference offsets are wrong). we move handling for this into the pdf token scanner. if we attempt to read an object at an offset and it fails we brute force the entire file to find correct offsets. we also needed to add handling to make sure we don't attempt to use stream length tokens if we're brute-forcing since we can't look up indirect references for length.
2020-02-26 14:03:46 +00:00
Eliot Jones
7d0d5806a9 fix reverse xref location search
when brute force searching for the start of the cross-reference table (xref) we read 5 byte buffers. previously if the 'x' of 'xref' was the first character of the buffer we skipped it. this change also checks the case where 'x' is the first character of the buffer.
2020-02-26 12:55:11 +00:00
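A simplified sketch of the reverse search fixed in 7d0d5806a9. It searches the whole byte array with `bytes.rfind` rather than the 5 byte buffers the commit describes, so it only illustrates the intended result: the last standalone 'xref' is found even at the very start of the searched region. The name `find_last_xref` is hypothetical.

```python
def find_last_xref(data: bytes) -> int:
    """Return the offset of the last standalone 'xref' keyword, or -1."""
    end = len(data)
    while True:
        index = data.rfind(b"xref", 0, end)
        if index < 0:
            return -1
        # 'startxref' also ends in 'xref'; skip it and keep searching backwards.
        if data[max(0, index - 5):index] != b"start":
            return index
        # Shrink the window so the match just rejected cannot be found again.
        end = index + 3
```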
Eliot Jones
f07e2dfb84 more tolerant handling of endimage recovery
fixes the offset recorded the first time an endimage is recovered from. it was off by one, so if the subsequent end image was also the wrong tag the second attempt at recovery failed.

also allows recovery when other tags appear after an endimage as long as they're not block ending operations (end text, perhaps pop/push in future).
2020-02-26 12:41:39 +00:00
Eliot Jones
43afac8f5d default to zero width characters in truetype for '.notdef'
when the character is not defined and the corresponding '.notdef' glyph isn't included in a truetype font we now default to a zero width character. it might be that we should use the default/missing width instead but this will work ok for most use-cases.
2020-02-26 12:39:12 +00:00
Eliot Jones
50c17f7951 make compact font format parser thread safe
the individual cff parser uses a cff dictionary reader inside it which has a per-instance operands list. for this reason it is not thread-safe and cannot be shared. this change creates a new individual font parser for each call to the top-level cff parser.
2020-02-25 14:24:29 +00:00
Eliot Jones
9f488809ac #141 cast adobe type 2 char string value to short
where the value is 28 the next two bytes indicate a short, not a 16 bit two's complement number, apparently, or i've misunderstood what the two's complement bit is about...
2020-02-25 13:56:26 +00:00
Eliot Jones
486ea446c5 #141 divide width by 1000 for adobe type 1 font
the width shouldn't be transformed by the font's matrix, instead the width is divided by 1000 by default.
2020-02-25 13:44:15 +00:00
Eliot Jones
f7c6de4118 #141 fix two's complement in adobe type 2 charstring
the byte value of 28 indicates the next 2 bytes are a 16 bit two's complement number rather than just a short. this changes the calculation to generate the two's complement value correctly.
2020-02-25 13:19:47 +00:00
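A minimal sketch of the 16 bit two's complement calculation that f7c6de4118 refers to, assuming the two bytes following the value 28 are stored high byte first. The function name `read_int16` is hypothetical and the code is not taken from the PdfPig source.

```python
def read_int16(b1: int, b2: int) -> int:
    """Combine two bytes (high byte first) into a signed 16 bit integer."""
    value = (b1 << 8) | b2
    if value & 0x8000:
        # The sign bit is set, so the value represents a negative number.
        value -= 0x10000
    return value
```

This gives the same result as `int.from_bytes(bytes([b1, b2]), "big", signed=True)`.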
Eliot Jones
d6d3869fe2 fix brute force searcher offsets
the brute force searcher offsets were off by one. this change means the offset returned is now aligned with the object number in the object number/generation/operator triple.
2020-02-24 12:24:18 +00:00
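A rough sketch of the alignment described in d6d3869fe2, assuming a regular-expression scan rather than the project's actual searcher. The recorded offset is the position of the first digit of the object number in each object number/generation/operator triple. The names `OBJECT_HEADER` and `find_object_offsets` are hypothetical.

```python
import re

# Matches "<object number> <generation> obj"; 'endobj' has no leading numbers.
OBJECT_HEADER = re.compile(rb"(?<!\d)(\d+)\s+(\d+)\s+obj\b")


def find_object_offsets(data: bytes) -> dict:
    """Map (object number, generation) to the offset of the object number."""
    offsets = {}
    for match in OBJECT_HEADER.finditer(data):
        key = (int(match.group(1)), int(match.group(2)))
        # match.start(1) is the first digit of the object number;
        # a later definition of the same key overrides an earlier one.
        offsets[key] = match.start(1)
    return offsets
```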
Eliot Jones
7ac3fb2a39 remove old clipping rule code and make enum byte
removes the unused set winding rule method and makes the clipping rule enum a byte which will save 3 bytes per pdfpath instance.
2020-02-24 11:29:06 +00:00
BobLd
0afaa19d15 Handle null CurrentPath 2020-02-24 11:20:56 +00:00
BobLd
74413fa6a4 Add PdfPath.IsCounterClockwise tests 2020-02-24 11:20:56 +00:00
BobLd
1d095af974 Implement Modify Clipping operations 2020-02-24 11:20:56 +00:00
BobLd
b0eaccf56f Add PdfRectangle.Contains(PdfRectangle) and tests 2020-02-23 14:44:35 +00:00
BobLd
8e5aaf068d use MemberData for all Geometry tests 2020-02-23 11:23:27 +00:00
BobLd
bbdb778d5e use Count == 0 instead of !Any() 2020-02-23 11:23:27 +00:00
BobLd
514a5eff20 use MemberData in PdfRectangle.Area tests 2020-02-23 11:23:27 +00:00
BobLd
41c8af0de4 use MemberData in PdfRectangle.Rotate tests 2020-02-23 11:23:27 +00:00
BobLd
c6669679d6 use MemberData in Line.Length tests 2020-02-23 11:23:27 +00:00
BobLd
1b15238e31 avoid creating PdfLines in IntersectsWith 2020-02-23 11:23:27 +00:00
BobLd
49caa071ba improve length computation
tidy up IntersectsWith()
2020-02-23 11:23:27 +00:00
BobLd
7d7a7fc5ee remove useless code 2020-02-23 11:23:27 +00:00
BobLd
42245d70ca Improve PdfRectangle.GetWidthHeight();
Improve and simplify Word's oriented bounding box
2020-02-23 11:23:27 +00:00
BobLd
67c5abf2b6 fix Intersect(BezierCurve, Line) and add tests 2020-02-23 11:23:27 +00:00
BobLd
2878e74017 Add bezier curve test 2020-02-23 11:23:27 +00:00
BobLd
36566f42e6 Create generic methods for lines in GeometryExtensions 2020-02-23 11:23:27 +00:00
BobLd
6fe0ef0351 Add line tests and improve GeometryExtensions 2020-02-23 11:23:27 +00:00
BobLd
b8d1eba8ee PdfLine.Intersect() 2020-02-23 11:23:27 +00:00
BobLd
d2ac2f598a update Centroid, GetWidthHeight and tests 2020-02-23 11:23:27 +00:00
BobLd
2c8c6cda87 add GeometryExtensions tests 2020-02-23 11:23:27 +00:00
Eliot Jones
c6dc4d9eb8 handle tokenizing invalid numeric string correctly
rather than throwing when an invalid numeric string is read, our tokenizer now returns false so that error recovery methods can be attempted.
2020-02-21 11:16:31 +00:00
Eliot Jones
8d415fd162 use type 1 font handler for mmtype1
multiple master fonts are an extension of the adobe type 1 font format. we don't have any special case handling for them so for now we default to attempting to use the adobe type 1 font handler. it may be that we need some special parsing logic but the test file using the mmtype1 fonts didn't actually specify any font bytes so we can't check.
2020-02-21 10:49:29 +00:00
Eliot Jones
28faf1c22c default to .notdef for type 2 charstrings
if the glyph with a specific name isn't found in the set of type 2 charstrings we default to using the .notdef glyph if present.
2020-02-21 10:37:58 +00:00
Eliot Jones
c635b919d2 make numbers culture invariant in document builder
decimal numbers in the output file were dependent on the current thread culture. this meant values like '70.679' were output as '70,679' for cultures using a comma rather than a period as the decimal separator (i.e. the whole world). this resulted in the file displaying incorrectly.
2020-02-20 13:06:12 +00:00
BobLd
848d687910 Add kd tree and improve clustering 2020-02-19 11:12:54 +00:00
Eliot Jones
ed894baffc Merge pull request #135 from BobLd/rxycut-stackoverflow
Fix RecursiveXYCut
2020-02-15 10:52:20 +00:00
BobLd
3229574345 Correcting typo 2020-02-12 12:00:14 +00:00
BobLd
6a72ce655c Merge branch 'master' of https://github.com/UglyToad/PdfPig into rxycut-stackoverflow 2020-02-12 11:40:51 +00:00
BobLd
f0be3e99ee Add Projection class 2020-02-11 10:04:04 +00:00
BobLd
dd8baa0d2f Fix stack overflow error in RecursiveXYCut
Was linked to the new rotated rectangle methods
2020-02-10 13:53:59 +00:00
BobLd
b49740d093 add test for index increment when children are present 2020-02-10 11:23:19 +00:00