PdfPig

lsm/PdfPig

mirror of https://github.com/UglyToad/PdfPig.git synced 2025-09-19 19:07:56 +08:00

Author	SHA1	Message	Date
mvantzet	9ff095c516	Fix typo	2023-03-13 18:15:33 +01:00
mvantzet	ea77156eb8	Changes for annotation positions: - Pass in the initial matrix to the annotation provider, so that it can return the correct rectangles / quad points. - Made a change / extensions to the Annotation class: - ModifiedDate is now a DateTimeOffset instead of unparsed string. If the string is invalid, ModifiedDate is set to the default value. - Added lookup for the "appearance streams"; all the annotations should have a "N" (normal) appearance, and optionally have a "R" (roll-over/hover) and "D" (down/click) appearance. Did not expose the actual stream objects, but added a flag indicating the existence of "R" / "D". At some point we can consider doing something with the appearances. - Changed signature of GetInitialMatrix / ContentStreamProcessor constructor from PdfRectangle back to what it was earlier, namely MediaBox and CropBox, to prevent accidentally mixing the two up in the caller.	2023-03-13 18:15:24 +01:00
mvantzet	a439b43246	Added integration test for cropped document, and a cropped+rotated document with an annotation as well. Added annotations to visual verification test (blue outlines).	2023-03-13 18:08:20 +01:00
mvantzet	17681472cc	Also apply optimizations (set sin/cos to integers) for e.g. -270 degrees.	2023-03-13 17:50:58 +01:00
mvantzet	0413f3f1bf	Fix related to page sizes / rotation / coordinate transformations (issue 560): The initial transformation matrix was incorrect, as it translated by the cropbox width/height instead of by the cropbox left/bottom offsets. Also, it did not translate the results back into the 1st quadrant so that (0,0) would (again) be the lower left corner origin for the cropped area. Added unit tests in new file ContentStreamProcessorTests. EFFECTIVE CHANGES: - The coordinates used for letters etc. are different now for rotated and/or cropped pages, but as those were not very consistent anyway this is probably OK. - The Page Size (A4, A3, Custom, etc.), Width and Height are now determined by the CropBox, not by the MediaBox; the CropBox ultimately determines what you see on screen and is printable. If no cropbox is defined in the PDF, it is set to the MediaBox; so in that case it is backwards compatible with the old code. - The Page MediaBox and CropBox properties are no longer rotated according to Page.Rotation. The Page Width and Height do take rotation into account (kept it backward compatible).	2023-03-09 16:42:09 +01:00
mvantzet	3a0a6e1411	Resolving page sizes did not work when the page orientation was landscape, or when side lengths were not integer or off by one. Added unit tests.	2023-03-09 16:09:14 +01:00
Eliot Jones	999f9ee7dc	Merge pull request #551 from fnatzke/Issue549 Fix for Issue#549. Skip over Jpg segments rather than use brute force…	2023-02-18 10:50:44 +00:00
Eliot Jones	761bce8591	Merge pull request #539 from mvantzet/ExtendLetterProperties Added Letter properties RenderingMode, StrokeColor, FillColor and add…	2023-02-18 10:48:47 +00:00
Fred Natzke	3a848c090c	Fix for Issue#549. Skip over Jpg segments rather than use brute force to find segment start marker	2023-02-16 15:55:42 +10:00
Eliot Jones	88aaddcf26	Merge pull request #541 from fnatzke/Fix540 Fix 540 Copy page with inline image.	2023-01-16 15:25:35 -05:00
Fred Natzke	324de1da67	Fix 540 Copy page with inline image.	2023-01-16 14:27:04 +10:00
mvantzet	2acca32987	Added integration test to see if we can detect the presence of invisible text (text rendering mode = Neither), visible text, the presence of images and the presence of paths. Certain combinations thereof potentially must be run through OCR.	2023-01-13 14:11:13 +01:00
mvantzet	06253966e4	Added Letter properties RenderingMode, StrokeColor, FillColor and added those as mandatory constructor arguments. Kept property Color, which contains either StrokeColor (if rendering mode is Stroke) or FillColor (for all other rendering modes). In PdfPageBuilder opted for default text rendering mode "Fill" which seems like a sensible default.	2023-01-13 12:35:25 +01:00
Eliot Jones	65bc754f5b	remove ci unsupported syntax	2023-01-08 15:38:05 -05:00
Eliot Jones	d5b196ff44	remove unsupported syntax	2023-01-08 14:18:09 -05:00
Eliot Jones	d2944e14e5	change alpha version for nightly builds	2023-01-08 12:01:48 -05:00
Eliot Jones	57e9acbc12	post merge tidy up	2023-01-08 12:00:35 -05:00
Eliot Jones	7b891edb69	Merge pull request #526 from fnatzke/Issue455-Issue_extracting_unicode_from_CJK_file Fix #455 extracting unicode from CJK file	2023-01-08 11:53:32 -05:00
Eliot Jones	37e31c40ae	Merge pull request #522 from fnatzke/master Fix #514 Print Character with ZapfDingbats font	2023-01-08 11:52:15 -05:00
Eliot Jones	982f36647a	Merge pull request #525 from mvantzet/ITokenWriter Make TokenWriter non-static, implement ITokenWriter, injection in PdfDocumentBuilder, add PdfTextRemover	2023-01-08 11:49:56 -05:00
Fred Natzke	210c0dde50	Issue451_Type2CharStrings parsing/interpretation error	2022-12-26 17:08:03 +10:00
Fred Natzke	8b32a4d958	Type0Font better description of conversion to unicode.	2022-12-24 10:53:55 +10:00
Fred Natzke	4ba2a29aa0	Fix #455 extracting unicode from CJK file	2022-12-23 16:53:37 +10:00
mvantzet	94c62e1b65	Forgot to commit this updated test, to allow ITokenWriter and PdfTextRemover to be public.	2022-12-21 09:35:29 +01:00
mvantzet	371e148c63	Remove unreachable code	2022-12-20 21:31:35 +01:00
mvantzet	6ef6c4d780	Added a PdfTextRemover utility that uses a NoTextTokenWriter, to output PDFs without text contents. Also added unit tests to test: - If we can use a custom ITokenWriter with PdfDocumentBuilder - If removing text works.	2022-12-20 21:31:15 +01:00
mvantzet	9273a43965	Merge branch 'master' of https://github.com/mvantzet/PdfPig into ITokenWriter	2022-12-20 11:08:13 +01:00
mvantzet	6125c00089	Make it possible to inject a custom ITokenWriter in PdfDocumentBuilder.	2022-12-20 10:50:41 +01:00
Fred Natzke	2bcac59917	Minor cleanup of some texts.	2022-12-20 14:22:37 +10:00
Fred Natzke	7b441a3b0a	Standard14WritingFontTests Add tests of invalid characters for each standard font	2022-12-20 14:17:02 +10:00
Fred Natzke	8f13e2f11c	Fix #375 Add checkmark from ZapfDingbats using unicode character. Letters extracted have unicode values.	2022-12-17 17:39:15 +10:00
mvantzet	3594231c67	Introduce ITokenWriter / non-static TokenWriter class. This is the first step in making it possible to override methods in the token writer, for example to filter streams when writing using PdfDocumentBuilder. The second step is injecting ITokenWriter into PdfDocumentBuilder.	2022-12-15 18:01:10 +01:00
Fred Natzke	620fa9b8cc	Fix #514 Print Character with ZapfDingbats font	2022-12-13 14:22:30 +10:00
Eliot Jones	c8874c5984	#483 make skip missing fonts even more resilient to nonsense files v0.1.7	2022-12-11 16:18:09 -05:00
Eliot Jones	2aed996319	Merge pull request #517 from fnatzke/master Fixes for Issue#512, 516 and 519	2022-12-09 09:39:55 -05:00
Eliot Jones	060c7bc728	Merge pull request #521 from eliotjones-roger/rotation-support-for-page-builder add ability to rotate page by number of degrees, make builder fluent	2022-12-09 09:08:27 -05:00
Eliot Jones	6764d81958	Merge pull request #520 from mjolivet-lucca/master Adding non regression unit test to cover PR 473	2022-12-09 08:56:22 -05:00
Eliot Jones	95df15996b	add ability to rotate page by number of degrees, make builder fluent	2022-12-09 08:44:56 -05:00
Mathieu jolivet	998e768bba	[ADD] adding non regression unit test to cover PR 473	2022-12-08 14:38:19 +01:00
Fred Natzke	f5fe39b285	Issue 512 revisited. Use of hashset to avoid reprocessing same token in infinite loop.	2022-12-07 17:27:34 +10:00
Fred Natzke	9ef07b0176	Fix Issue 519 Break infinite loop parsing kid token list containing parent	2022-12-06 16:29:09 +10:00
Fred Natzke	29adece983	Original #516 fix causes Tests to fail. Alternative to use pagesByNumber?.Count in Catalog for discovered pages. Some other unrelated tests failed due to source using single rather than two character newlines. Changed to test string itself for '\r' rather than environment is Unix	2022-11-30 11:10:52 +10:00
Fred Natzke	a7f64ec64b	Fix Issue 516. Page Dictionary Count field has incorrect page count. Compare with PageTree children count.	2022-11-30 09:25:23 +10:00
Fred Natzke	afe473e10e	Fix for Issue#512: Unable to open PDF. BruteForceSearcher::GetLastEndOfFileMarker() minimumEndOffset out by 1.	2022-11-29 17:31:23 +10:00
Eliot Jones	9c9c7c99ea	ci new namespace file scope unsupported	2022-10-09 16:01:20 -04:00
Eliot Jones	e2246a88bb	#482 add skip missing fonts option and pass parsing options to content stream processor this doesn't fix the reported issue since the pdf itself is corrupted on page 8 however it will allow recovery in some scenarios where text content isn't important. also adds more informative error when stream unintentionally passed with non zero offset	2022-10-09 13:44:05 -04:00
Eliot Jones	c643facee0	#481 skip resource entry if null token	2022-10-09 13:06:04 -04:00
Eliot Jones	2f9a9ace9a	Merge pull request #473 from grinay/master Fix page number order.	2022-08-13 16:02:44 -04:00
grinay	19962af011	Fix page number order. When the root node has a direct reference to a page, the order was incorrect. Example: the root node has references [2 0 R 3 0 R 10 0 R], where 2 0 R is an intermediate node containing pages 1, 2, 3, 3 0 R is an intermediate node containing pages 4, 5, 6, and 10 0 R is page 7. Without this fix, page 7 ends up in the array as page 1.	2022-07-29 18:27:56 +08:00
Eliot Jones	545d1a0793	Merge branch 'master' of github.com:UglyToad/PdfPig	2022-07-02 18:09:22 -04:00

1317 Commits
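
The page-order fix in commit 19962af011 and the infinite-loop fixes in 9ef07b0176 and f5fe39b285 both concern walking the PDF page tree. The sketch below is only an illustration of that idea in Python, not PdfPig's actual C# implementation: the `objects` dictionary, its integer keys, and the `number_pages` function are hypothetical stand-ins for indirect objects and references. It numbers pages with a depth-first, left-to-right walk and uses a visited set so that a kid list containing its own parent cannot loop forever.

```python
def number_pages(objects, root_ref):
    """Return page references in document order (depth-first, left to right).

    `objects` is a hypothetical mapping of object number -> dict, standing in
    for a PDF's indirect objects. Intermediate nodes have Type "Pages" and a
    "Kids" list; leaves have Type "Page".
    """
    ordered = []
    visited = set()

    def walk(ref):
        # Skip anything already seen, so a kid list that points back at a
        # parent (a malformed file) cannot cause infinite recursion.
        if ref in visited:
            return
        visited.add(ref)
        node = objects[ref]
        if node["Type"] == "Page":
            ordered.append(ref)
        else:
            for kid in node["Kids"]:
                walk(kid)

    walk(root_ref)
    return ordered


# Tree shaped like the example in the commit message: the root refers to two
# intermediate nodes (objects 2 and 3) and directly to one page (object 10).
# The leaf object numbers 4 to 9 are made up for this illustration.
objects = {
    1: {"Type": "Pages", "Kids": [2, 3, 10]},
    2: {"Type": "Pages", "Kids": [4, 5, 6]},
    3: {"Type": "Pages", "Kids": [7, 8, 9]},
    10: {"Type": "Page"},
}
for leaf in range(4, 10):
    objects[leaf] = {"Type": "Page"}

pages = number_pages(objects, 1)
page_number_of_object_10 = pages.index(10) + 1
```

With this traversal the page referenced directly from the root (object 10) comes out last, as page 7, rather than first.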