since jpegs can be trivially embedded in pdf documents without changes to the data stream this is the first image format we will support. currently this is a naive approach which doesn't share image resources between pages. ideally we will either de-duplicate images when they are added, return a re-usable key once an image is added, or both.
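as a sketch of why no transcoding is needed, something like the following is enough to wrap a jpeg in an image xobject stream. this is illustrative only, not the project's api: the helper name, the hard-coded colour space and the width/height parameters (normally read from the jpeg's frame header) are all assumptions.

```csharp
// minimal sketch, not the real implementation: a jpeg is embedded by wrapping
// its unmodified bytes in an image xobject stream using the DCTDecode filter.
using System.IO;
using System.Text;

public static class JpegXObjectSketch
{
    public static byte[] Wrap(byte[] jpegBytes, int width, int height)
    {
        // the dictionary tells the reader how to interpret the stream;
        // DCTDecode means "the stream data is a raw jpeg".
        var header = "<< /Type /XObject /Subtype /Image"
                   + $" /Width {width} /Height {height}"
                   + " /ColorSpace /DeviceRGB /BitsPerComponent 8"
                   + $" /Filter /DCTDecode /Length {jpegBytes.Length} >>\nstream\n";

        using (var output = new MemoryStream())
        {
            var headerBytes = Encoding.ASCII.GetBytes(header);
            output.Write(headerBytes, 0, headerBytes.Length);
            // the data stream is byte-for-byte the source jpeg file.
            output.Write(jpegBytes, 0, jpegBytes.Length);
            var footerBytes = Encoding.ASCII.GetBytes("\nendstream");
            output.Write(footerBytes, 0, footerBytes.Length);
            return output.ToArray();
        }
    }
}
```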
fixes the case where the brute-force searcher becomes stuck in an infinite loop. the problem pdf from #88 may have a newline or some other whitespace between its object and generation numbers, so it may still cause a failure elsewhere.
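a rough sketch of tolerant object header matching; the regex approach and names here are illustrative, not necessarily how the searcher actually works:

```csharp
// illustrative only: match "12 0 obj" style headers while allowing any
// whitespace (including newlines) between the object number, the generation
// number and the obj keyword.
using System.Text.RegularExpressions;

public static class ObjectHeaderSketch
{
    private static readonly Regex Header =
        new Regex(@"(\d+)\s+(\d+)\s+obj", RegexOptions.Compiled);

    public static bool TryParse(string text, out int objectNumber, out int generation)
    {
        var match = Header.Match(text);
        objectNumber = match.Success ? int.Parse(match.Groups[1].Value) : 0;
        generation = match.Success ? int.Parse(match.Groups[2].Value) : 0;
        return match.Success;
    }
}
```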
most of the cross-reference code is the earliest code in the project and hasn't been revisited since it was written. issue #88 has been reopened due to a bug with brute-force searching, so this tidies up the code in this area ahead of trying to fix the bug.
checks that we correctly handle the case where a comment appears inside a dictionary; this was handled by commit 3084a9. uses a list internally in the dictionary tokenizer to avoid the performance penalty of interface dispatch.
renames fields to match the expected conventions for resharper. removes fully qualified names from using statements since resharper marks these as not required.
adds a check to the pdf merger test to make sure the second page has the expected content. this check currently fails since we lose the resources node on the pages tree.
the cross reference parser tests' behaviour had changed; this fixes a compilation error from merge conflicts. also updates the merger tests to account for the new version behaviour and checks the output document text. adds pdfmerger to the public api in the tests.
I decided to move the part that creates the page node into its own method. This made it easier to see where I was supposed to get the correct reference from.
a lenient parsing flag gives us more code to maintain for no real benefit, since parsing should always be as lenient as possible. removes the flag from some of the font code.
when a close parenthesis that does not balance the current nesting precedes a line break followed by '/' or '>', we assume an earlier bracket was unbalanced and finish reading the string.
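a sketch of the termination heuristic described above; the names and structure are hypothetical, not the tokenizer's actual code:

```csharp
// illustrative only: called when a ')' is read that does not bring the paren
// nesting back into balance. peek ahead and, if a line break followed by '/'
// or '>' comes next, treat the string as finished rather than reading on.
public static class StringHeuristicSketch
{
    public static bool LooksLikeStringEnd(string input, int indexAfterParen)
    {
        var i = indexAfterParen;

        // skip the line break (\r, \n or \r\n) that must follow the ')'.
        var sawBreak = false;
        while (i < input.Length && (input[i] == '\r' || input[i] == '\n'))
        {
            sawBreak = true;
            i++;
        }

        // '/' starts a name (e.g. a dictionary key), '>' closes a dictionary;
        // either strongly suggests the string really ended at the ')'.
        return sawBreak && i < input.Length && (input[i] == '/' || input[i] == '>');
    }
}
```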
previously we checked that the offset was not inside the cross-reference table itself (a correct thing to check), however this is only a special case of the more general issue: cross-reference offsets can be wrong. we move handling for this into the pdf token scanner. if we attempt to read an object at an offset and it fails, we brute-force the entire file to find the correct offsets. we also needed to add handling to make sure we don't attempt to use stream length tokens while brute-forcing, since we can't look up indirect references for the length.
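a minimal sketch of that fallback, with hypothetical delegate parameters standing in for the scanner's real internals:

```csharp
// illustrative skeleton of the fallback: trust the declared offset first,
// and only on failure pay the cost of scanning the whole file once.
using System;
using System.Collections.Generic;

public sealed class OffsetFallbackSketch
{
    private IReadOnlyDictionary<long, long> bruteForcedOffsets;

    public bool TryReadObjectAt(
        long objectNumber,
        long declaredOffset,
        Func<long, long, bool> tryReadAt,                        // (offset, objectNumber) -> success
        Func<IReadOnlyDictionary<long, long>> bruteForceAllOffsets)
    {
        if (tryReadAt(declaredOffset, objectNumber))
        {
            return true;
        }

        // the declared cross-reference offset was wrong: scan the file once
        // for "N G obj" headers and reuse the resulting map for later reads.
        if (bruteForcedOffsets == null)
        {
            bruteForcedOffsets = bruteForceAllOffsets();
        }

        return bruteForcedOffsets.TryGetValue(objectNumber, out var actual)
            && tryReadAt(actual, objectNumber);
    }
}
```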
the brute force searcher offsets were off by one. this change means the offset returned is now aligned with the object number in the object number/generation/operator triple, e.g. it points at the first digit in '12 0 obj' rather than the byte before it.
now that the rectangle constructor uses the order [ llx, lly, urx, ury ] and does not apply correction for points, constructor parameters must be passed in the correct order. this change fixes the hyperlink factory, which was passing them in the wrong order.
in addition the pdfpath bounding box was using left, right, top and bottom to calculate the minimum bounding box. this produced incorrect values once individual path operator bounding boxes were rotated, since for a rotated rectangle top may be less than bottom.
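a minimal sketch of the corner-based alternative, assuming a simple point struct rather than the project's own types: taking the min and max over all four corners stays correct even when the rectangle is rotated and its nominal "top" sits below its "bottom".

```csharp
using System;

public readonly struct Point
{
    public double X { get; }
    public double Y { get; }
    public Point(double x, double y) { X = x; Y = y; }
}

public static class BoundingBoxSketch
{
    // min/max over every corner, with no assumption about which corner is
    // "top-left" after a rotation has been applied.
    public static (Point Min, Point Max) MinimumBoundingBox(Point p1, Point p2, Point p3, Point p4)
    {
        var minX = Math.Min(Math.Min(p1.X, p2.X), Math.Min(p3.X, p4.X));
        var minY = Math.Min(Math.Min(p1.Y, p2.Y), Math.Min(p3.Y, p4.Y));
        var maxX = Math.Max(Math.Max(p1.X, p2.X), Math.Max(p3.X, p4.X));
        var maxY = Math.Max(Math.Max(p1.Y, p2.Y), Math.Max(p3.Y, p4.Y));
        return (new Point(minX, minY), new Point(maxX, maxY));
    }
}
```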
however, performance seems to have taken a hit due to these changes.
if the %pdf version header comment is offset from the start of the file, the cross reference offsets will also be wrong by that amount. this change updates the cross reference location logic to use the offset of the located version header.
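the adjustment itself is a one-liner; the method name here is hypothetical:

```csharp
// illustrative: offsets in the xref table were written assuming "%PDF-" sits
// at byte 0, so when the header is preceded by junk every declared offset
// must be shifted by the header's actual position.
public static class OffsetAdjustmentSketch
{
    public static long ToAbsoluteOffset(long declaredOffset, long versionHeaderPosition)
    {
        return declaredOffset + versionHeaderPosition;
    }
}
```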
some files have the format header preceded by large amounts of junk, yet these still appear to be valid for chrome and acrobat reader. this change increases the amount of junk we will read through prior to the version header.
also makes parsing of the version header culture invariant, which may be related to #85.
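the change amounts to pinning the culture when parsing the version number; in locales where ',' is the decimal separator, "1.7" can parse differently or fail:

```csharp
using System.Globalization;

public static class VersionParseSketch
{
    public static double ParseVersion(string headerNumber)
    {
        // e.g. headerNumber = "1.7" taken from "%PDF-1.7"; the invariant
        // culture guarantees '.' is treated as the decimal separator.
        return double.Parse(headerNumber, CultureInfo.InvariantCulture);
    }
}
```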
since inline image data may contain the end image ("EI") token inside the data stream, there's no reliable way to determine whether we've actually read all the data. for this reason, if we end up in an invalid state parsing operations after we've read the end image token, we try to recover by reading from the previous token to the next end image token, if any. we log information to let the consumer know this is what we're doing. it's still not bullet-proof but it should be good enough.
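a rough sketch of the recovery idea, with hypothetical names throughout: if parsing fails right after an inline image, assume the "EI" we stopped at was really part of the image data and extend the image to the next "EI" instead.

```csharp
using System;

public static class InlineImageRecoverySketch
{
    public static byte[] RecoverImageData(
        byte[] content,
        int imageStart,
        int firstEiEnd,
        Func<int, int> findNextEi,   // returns index of next "EI" after the argument, or -1
        Action<string> log)
    {
        var nextEi = findNextEi(firstEiEnd);
        if (nextEi < 0)
        {
            // no later "EI" exists: keep the original, possibly truncated, data.
            return Slice(content, imageStart, firstEiEnd);
        }

        log("inline image data appeared to contain a spurious EI token; re-reading to the next EI.");
        return Slice(content, imageStart, nextEi);
    }

    private static byte[] Slice(byte[] source, int start, int end)
    {
        var result = new byte[end - start];
        Array.Copy(source, start, result, 0, result.Length);
        return result;
    }
}
```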
also supports negative page rotation values by adding them to a 360 degree rotation, so -90 degrees clockwise becomes 270 degrees clockwise.
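the normalization is a one-liner; the double-modulo form below is an illustrative generalization that also covers values beyond a single negative turn:

```csharp
public static class RotationSketch
{
    // maps any rotation into [0, 360): -90 -> 270, -450 -> 270, 450 -> 90.
    public static int Normalize(int degrees) => ((degrees % 360) + 360) % 360;
}
```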
our bounding rectangle values still seem to be wrong for rotated letters. this change adds some test cases for common transformation matrix operations on a rectangle: scale, translate and rotate.
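an illustrative test in the spirit of those described, using plain math rather than the project's matrix type: rotating the unit square 90 degrees counter-clockwise about the origin should land it in [-1, 0] x [0, 1].

```csharp
using System;

public static class RotationTestSketch
{
    public static void Rotate90MovesUnitSquareLeft()
    {
        var corners = new[] { (0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0) };

        for (var i = 0; i < corners.Length; i++)
        {
            var (x, y) = corners[i];
            corners[i] = (-y, x); // 90 degree ccw rotation: (x, y) -> (-y, x)
        }

        foreach (var (x, y) in corners)
        {
            if (x < -1 || x > 0 || y < 0 || y > 1)
            {
                throw new Exception($"corner ({x}, {y}) outside expected bounds");
            }
        }
    }
}
```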
since the properties in marked content may be indirect references or may belong to the page resources, the value should be resolved during content processing. this change tidies up the marked content classes so they do not expose mutable data and uses the pdf token scanner overloads to load dictionary data.
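a hypothetical sketch of why resolution has to wait until content processing (names and shapes here are illustrative, not the project's api): the properties operand may be a name pointing into the page resources, a direct dictionary, or an indirect reference to one.

```csharp
using System;
using System.Collections.Generic;

public static class MarkedContentPropertiesSketch
{
    public static IReadOnlyDictionary<string, object> Resolve(
        object operand,
        IReadOnlyDictionary<string, IReadOnlyDictionary<string, object>> pageProperties,
        Func<object, IReadOnlyDictionary<string, object>> dereference)
    {
        if (operand is string name && pageProperties.TryGetValue(name, out var fromResources))
        {
            return fromResources; // named entry looked up in the page resources
        }

        if (operand is IReadOnlyDictionary<string, object> direct)
        {
            return direct; // inline dictionary operand
        }

        return dereference(operand); // indirect reference: follow it
    }
}
```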