PdfPig

lsm/PdfPig

mirror of https://github.com/UglyToad/PdfPig.git synced 2025-10-15 03:34:52 +08:00

Author	SHA1	Message	Date
Inusual	013cbd14e0	Make CrossReferenceTableParser a static class	2020-03-02 17:00:16 +00:00
Eliot Jones	b7a86f482f	cache bounding boxes for composite fonts. cached the bounding box for a specific character code value for type 0 (composite) fonts to improve performance.	2020-02-28 16:36:06 +00:00
Eliot Jones	4442a69a97	use tryget rather than lambdas for union type. avoid the allocations caused by lambda expressions for performance reasons.	2020-02-28 16:02:20 +00:00
Eliot Jones	4d911fb9d1	use transform x for widths to improve performance. when transforming the advance width inside a font, we transform only the x coordinate rather than making a new point to transform.	2020-02-28 15:15:35 +00:00
Eliot Jones	c864fa512c	remove islenientparsing from page classes	2020-02-28 11:50:18 +00:00
Eliot Jones	48d166276d	remove islenientparsing from contentstreamprocessor	2020-02-28 11:44:13 +00:00
Eliot Jones	6fdaf054cb	remove islenientparsing from annotation provider	2020-02-28 11:39:56 +00:00
Eliot Jones	7b09999a3f	remove islenientparsing from the font handlers. we're removing islenientparsing to make the code simpler to maintain and use as well as more resilient.	2020-02-28 11:37:18 +00:00
Eliot Jones	746cbfa30c	remove lenient parsing from font related classes. lenient parsing gives us more code to maintain for no real benefit, parsing should always be as lenient as possible. remove the flag from some of the font code.	2020-02-27 18:10:02 +00:00
Eliot Jones	ec9e425712	use length from stream dictionary if directly available. when brute forcing we use the length available in the stream's dictionary token if it is a direct number rather than an indirect reference.	2020-02-27 17:17:49 +00:00
Eliot Jones	c033a0b3fe	handle missing end bytes for cmap code. when the byte input does not contain a code of the full code length we return early.	2020-02-27 15:26:48 +00:00
Eliot Jones	0fcc4e54c8	add istestproject setting to all projects. indicates which projects are test projects to the test runner.	2020-02-27 12:35:40 +00:00
Eliot Jones	4150881be9	recover from invalid acro-form references. we add a try/catch to the direct object finder's tryget method so it returns false rather than throwing. if we have an acro-form reference in the catalog but no corresponding object in the document we instead scan all objects in the document to find form fields and reconstruct the acro-form dictionary.	2020-02-27 12:08:40 +00:00
Eliot Jones	f415c3116e	cross reference offset is in the xref table we ignore the error. previously we checked the offset was not inside the table (correct thing to check), however this is only a special case of the more general issue (cross reference offsets are wrong). we move handling for this into the pdf token scanner. if we attempt to read an object at an offset and it fails we brute force the entire file to find correct offsets. we also needed to add handling to make sure we don't attempt to use stream length tokens if we're brute-forcing since we can't look up indirect references for length.	2020-02-26 14:03:46 +00:00
Eliot Jones	7d0d5806a9	fix reverse xref location search. when brute force searching for the start of the cross-reference table (xref) we read 5 byte buffers, previously if the 'x' of 'xref' was the first character of the buffer we skipped it. this checks when 'x' is the first character of the buffer.	2020-02-26 12:55:11 +00:00
Eliot Jones	f07e2dfb84	more tolerant handling of endimage recovery. fixes the recorded offset when an endimage is recovered from the first time. it was off by one so if the subsequent end image was also the wrong tag then the second attempt at recovery failed. also allows recovery when other tags appear after an endimage as long as they're not block ending operations (end text, perhaps pop/push in future).	2020-02-26 12:41:39 +00:00
Eliot Jones	43afac8f5d	default to zero width characters in truetype for '.notdef'. when the character is not defined and the corresponding '.notdef' glyph isn't included in a truetype font we now default to a zero width character. it might be that we should use the default/missing width instead but this will work ok for most use-cases.	2020-02-26 12:39:12 +00:00
Eliot Jones	486ea446c5	#141 divide width by 1000 for adobe type 1 font. the width shouldn't be transformed by the font's matrix, instead the width is divided by 1000 by default.	2020-02-25 13:44:15 +00:00
Eliot Jones	d6d3869fe2	fix brute force searcher offsets. the brute force searcher offsets were off by one. this change means the offset returned is now aligned with the object number in the object number/generation/operator triple.	2020-02-24 12:24:18 +00:00
BobLd	0afaa19d15	Handle null CurrentPath	2020-02-24 11:20:56 +00:00
BobLd	1d095af974	Implement Modify Clipping operations	2020-02-24 11:20:56 +00:00
BobLd	b0eaccf56f	Add PdfRectangle.Contains(PdfRectangle) and tests	2020-02-23 14:44:35 +00:00
BobLd	bbdb778d5e	use Count == 0 i.o. !Any()	2020-02-23 11:23:27 +00:00
BobLd	c6669679d6	use MemberData in Line.Length tests	2020-02-23 11:23:27 +00:00
BobLd	1b15238e31	avoid creating PdfLines in IntersectsWith	2020-02-23 11:23:27 +00:00
BobLd	49caa071ba	improve length computation. tidy up IntersectsWith()	2020-02-23 11:23:27 +00:00
BobLd	7d7a7fc5ee	remove useless code	2020-02-23 11:23:27 +00:00
BobLd	42245d70ca	Improve PdfRectangle.GetWidthHeight(); Improve and simplify Word's oriented bounding box	2020-02-23 11:23:27 +00:00
BobLd	67c5abf2b6	fix Intersect(BezierCurve, Line) and add tests	2020-02-23 11:23:27 +00:00
BobLd	2878e74017	Add bezier curve test	2020-02-23 11:23:27 +00:00
BobLd	36566f42e6	Create generic methods for lines in GeometryExtensions	2020-02-23 11:23:27 +00:00
BobLd	6fe0ef0351	Add line tests and improve GeometryExtensions	2020-02-23 11:23:27 +00:00
BobLd	b8d1eba8ee	PdfLine.Insersect()	2020-02-23 11:23:27 +00:00
BobLd	d2ac2f598a	update Centroid, GetWidthHeight and tests	2020-02-23 11:23:27 +00:00
BobLd	2c8c6cda87	add GeometryExtensions tests	2020-02-23 11:23:27 +00:00
Eliot Jones	8d415fd162	use type 1 font handler for mmtype1. multiple master fonts are an extension of the adobe type 1 font format. we don't have any special case handling for them so for now we default to attempting to use the adobe type 1 font handler. it may be that we need some special parsing logic but the test file using the mmtype1 fonts didn't actually specify any font bytes so we can't check.	2020-02-21 10:49:29 +00:00
Eliot Jones	c635b919d2	make numbers culture invariant in document builder. decimal numbers were dependent on the current thread culture for the output file. this meant values like '70.679' were output as '70,679' for cultures using a comma rather than period separator for the floating point (i.e. the whole world). this resulted in the file displaying incorrectly.	2020-02-20 13:06:12 +00:00
BobLd	dd8baa0d2f	Fix stack overflow error in RecursiveXYCut. Was linked to the new rotated rectangle methods	2020-02-10 13:53:59 +00:00
BobLd	905559e282	Fix bounding box for artifact. Add tests	2020-02-09 17:46:35 +00:00
BobLd	635693f032	Fix #133 Marked content extraction issue	2020-02-09 15:23:55 +00:00
BobLd	df73206788	try rerun failed tests	2020-02-09 14:20:47 +00:00
BobLd	6a86cdab73	make GeometryExtensions.OrientedBoundingBox() internal	2020-02-09 14:20:47 +00:00
BobLd	288beab39d	limit access to array and add comments	2020-02-09 14:20:47 +00:00
BobLd	7364e53bb9	Improve bounding box for word	2020-02-09 14:20:47 +00:00
Eliot Jones	f3fcd1b3a1	ignore form dictionaries that do not contain fields #131. though a form dictionary should always contain fields (as required by the spec) it is possible for this entry to be missing. in this case we return false for trygetform.	2020-02-05 10:56:01 +00:00
Eliot Jones	40dc80c281	handle type 1 font with no descriptor information #132. though required by the spec an adobe type 1 font may be missing all width data. in this case we default to empty values and treat it like a normal adobe type 1 font.	2020-02-05 10:46:39 +00:00
Eliot Jones	0e84fa34a8	fix usages of rectangle constructor. now that rectangle constructor uses the order [ llx, lly, urx, ury ] and does not apply correction for points constructor parameters must be passed in the correct order. this change fixes the hyperlink factory which was passing them in the wrong order. in addition the pdfpath bounding box was using left, right, top and bottom to calculate the minimum bounding box. this produced incorrect values now individual path operator bounding boxes are rotated, since for a rotated rectangle top may be less than bottom. the performance seems to have taken a hit due to these changes however.	2020-01-31 16:24:59 +00:00
BobLd	0e613fb526	Handle cases with not enough points in minimal bounding rectangle	2020-01-31 16:24:59 +00:00
BobLd	4c65cbc139	Improve minimum bounding box orientation v2	2020-01-31 16:24:59 +00:00
BobLd	bff18d81ca	Improve minimum bounding box orientation	2020-01-31 16:24:59 +00:00

511 Commits