PdfPig

lsm/PdfPig

mirror of https://github.com/UglyToad/PdfPig.git synced 2026-03-10 00:23:29 +08:00

Author	SHA1	Message	Date
BobLd	689c127cd9	Get correct text orientation when base line points are equal and fix #741	2024-09-29 16:43:21 +01:00
Arnaud TAMAILLON	cf45dcf6ad	Support not finding the Pages dictionary in lenient mode (#897) * Support not finding the Pages dictionary in lenient mode and support Kids object not referencing a page object in lenient mode --------- Co-authored-by: Arnaud TAMAILLON <arnaud.tamaillon@younited-credit.fr>	2024-09-01 15:09:48 +01:00
BobLd	affc1ed8b5	Seal and update IFilters to return ReadOnlyMemory<byte> (#843) * Avoid ToArray() in memoryFactory * Seal and update IFilters to return ReadOnlyMemory<byte> * Fix filter tests * Seal and update IFilters to return ReadOnlyMemory<byte>	2024-06-08 06:16:09 +01:00
Jason Nelson	da44e1a540	Improve code quality (#825) * Avoid encoding ASCII in more cases * Make Space a const * Use WriteWhiteSpace extension to eliminate possible virtual call * Use ASCII when encoding constrained character subset * Simplify pragmas * Revert Whitespace rename * Fix using statement order * Remove obsolete serialization support on .NET * Remove obsolete serialization support on .NET (part 2)	2024-05-03 07:36:19 +01:00
Jason Nelson	7f42a8d60c	Reduce Allocations (#821) * Introduce ValueStringBuilder * Make NumericTokenizer and PlanTextTokenizer thread-safe * Replace ListPool with ArrayPoolBufferWriter * Seal ITokenizer classes * Eliminate array allocation in Type1ArrayTokenizer * Eliminate array allocation in AcroFormFactory * Eliminate StringBuilder allocation in Page.GetText * Optimize PdfSubpath.ToLines * Eliminate various allocations when parsing CompactFontFormat * Remove unused FromOctalInt helper * Ensure Pdf.Content is not null * Write ASCII values directly to stream (avoiding allocations) * Avoid encoding additional ASCII values * Eliminate allocations in TokenWriter.WriteName * Eliminate allocation in TokenWriter.WriteNumber * Add System.Memory reference to Fonts	2024-04-28 18:55:58 +01:00
Jason Nelson	1ef2e127a6	Improve Code Quality (#818) * Make AdobeFontMetricsLigature a struct * Make AdobeFontMetricsCharacterSize a struct * Eliminate allocation in CompactFontFormatData * Pass TransformationMatrix by reference * Seal Encoding classes * Make SubTableHeaderEntry a readonly struct * Introduce StringSplitter and eliminate various allocations in GlyphListFactory * Eliminate a few substring allocations * Use char overload on StringBuilder * Eliminate virtual calls on stringIndex * Optimize ReadHelper ReadLong and ReadInt methods * Add additional readonly annotations to PdfRectangle * Optimize NameTokenizer * Eliminate allocation in TrueTypeGlyphTableSubsetter * Use empty arrays * Eliminate allocations in OperationWriteHelper.WriteHex * Use simplified DecryptCbc method on .NET 6+ * Fix windows-1252 encoding not working on net6.0 and 8.0 * Update int buffers to exact unsigned max length and eliminate additional byte allocation * Fix typo * Remove unused constant	2024-04-18 19:58:40 +01:00
Jason Nelson	c3a2e8c08f	Rename TryGetMemory -> TryGetBytesAsMemory	2024-04-12 07:42:19 +01:00
Jason Nelson	a5e9b438cc	Eliminate allocation in BasePageFactory	2024-04-12 07:42:19 +01:00
Jason Nelson	49c155cca2	Add IPdfImage.RawBytes	2024-04-12 07:42:19 +01:00
Jason Nelson	6d54355754	Spanify filters	2024-04-12 07:42:19 +01:00
Jason Nelson	f62929eb7c	Spanify work 1 (#812) * Add GetString(ReadOnlySpan<byte>) polyfill * Add ArrayPoolBufferWriter * Use Utf8.IsValid & char.IsAsciiHexDigit on NET8.0+ * Optimize HexTokenizer * Eliminate various Tuple allocations * Eliminate List allocation in CrossReferenceTable * Eliminate various allocations in Ascii85Filter * Spanify HexToken * Spanify Palette * Spanify various Cmap & font methods * Spanify Type1Charstring classes * Spanify PdfDocEncoding.TryConvertBytesToString * Spanify OctalHelpers.FromOctalDigits * Add missing braces * React to HexToken.Byte type changes * Cleanup * [Tests] React to span changes * Add ArgumentNullException check back to Type1CharstringDecryptedBytes * Remove unsafe code * Seal HexToken * Avoid allocation when passing an empty span	2024-04-01 09:18:01 +01:00
Jason Nelson	a412a239be	Enable nullable annotations (#803) * Enable nullable annotations * Remove unused Jetbrain annotations * Ensure system using statements are first * Improve nullability annotations * Annotate encryptionDictionary is non-null when IsEncrypted is true * Disable nullable for PdfTokenScanner.Get * Improve nullability annotations for ObjectLocationProvider.TryGetCached * Revert changes to RGBWorkingSpace * Update UglyToad.PdfPig.Package with new framework targets (fixes nightly builds)	2024-03-17 18:51:40 +00:00
Jason Nelson	95f0459900	Prefer is null to == null ensures that an equals overload isn't use, and we don't compare structs	2024-03-16 12:37:51 +00:00
Jason Nelson	9859c2672b	Use switch expressions	2024-03-16 12:37:51 +00:00
Jason Nelson	834fb350a3	Use Array.Empty	2024-03-15 13:10:25 +00:00
BobLd	ac0276f1bf	Use double in fonts instead of decimals and tidy up remaining decimals	2024-03-06 20:53:11 +00:00
BobLd	3bdc9498de	Use double for pdf version instead of decimal	2024-02-14 21:09:16 +00:00
BobLd	04fc8d696d	Use double instead of decimal in IPdfImage's Decode property	2024-01-20 18:52:26 +00:00
Richard Webb	83519b27b1	Add a DynamicallyAccessedMembers attribute to AddPageFactory<TPage, TPageFactory>	2024-01-11 19:07:35 +00:00
BobLd	3a96af3dcd	Add GetPage<TPage> and AddPageFactory<TPage and TPageFactory> methods	2023-11-18 16:46:23 +00:00
BobLd	3fbf8aaa6c	Abstract away PageFactory main logic into BasePageFactory	2023-11-18 16:46:23 +00:00
BobLd	63096de210	Add IPageFactory to the public API, remove InternalParsingOptions	2023-10-25 20:03:02 +01:00
BobLd	ba865b340e	Make IResourceStore part of the public API and pass InternalParsingOptions to the ResourceStore constructor	2023-10-22 19:16:41 +01:00
BobLd	7ab3a6a2cd	Add more classes to the Public API (#717) * Made interfaces and classes public * Made NamedDestinations public and fixed namespace * Fixed NamedDestinationsProvider namespace * Expose more internal classes as public * Update PublicApiScannerTests * Make setters internal	2023-10-22 17:34:47 +01:00
BobLd	c6e2de1b0c	Make all structs readonly when possible	2023-10-18 23:44:11 +01:00
BobLd	fe0e4db419	Properly handle page rotation for crop box and media box and fix #665	2023-10-18 21:27:56 +01:00
Eliot Jones	9d2b3f914d	account for skipmissingfonts in positioned text #637	2023-06-04 11:47:30 +01:00
Eliot Jones	fba1cbc13c	skip missing objects if skip fonts is true #298 if skip missing fonts is set we want to read the file as much as possible so we will also skip any missing xobjects like images, forms or postscript code	2023-05-27 10:46:29 +01:00
Eliot Jones	20d3cc9066	tidy up during investigation #600	2023-05-23 19:22:00 +01:00
BobLd	a4284aa5a8	Implement Pattern color space and Shading, seal IColor classes, stop using decimal in colors and use double instead	2023-05-18 20:24:55 +01:00
Yufei Huang	3898f09a5f	Pdf merger support copy links	2023-04-22 13:54:31 +01:00
BobLd	42e4171c31	Fix integration tests for #579 (3)	2023-04-13 19:27:54 +01:00
BobLD	b8a98fbed2	Properly implement color spaces	2023-04-12 07:25:09 +01:00
mvantzet	0e39bc0b76	Annotations named destinations (#579) * Add Named Destinations to Catalog so that bookmarks and links can access them. The named destinations require access to page nodes, so created Pages object that is made using PagesFactory (which contains the page-related code from Catalog). * Further implementation of destinations: - Implement NamedDestinations in AnnotationProvider, so that we can look up named destinations for annotations and turn them into explicit destinations. Reused existing code inside BookmarksProvider to get destinations/actions. - Added GoToE action - According to the PDF reference, destinations are also required for external destinations and hence for ExternalBookmarkNode. This allows us to push up DocumentBookmarkNode.Destination to BookmarkNode. * Implemented stateful appearance streams and integration test * Added AppearanceStream to public API because it is used in the (public) Annotation constructor * After #552, must push down ExplicitDestination do DocumentBookmarkNode since it does not apply to UriBookmarkNode. * Added actions, which fits the PDF model better and works well with the new bookmarks code (after PR #552) * Rename Action to PdfAction + removed unused using in ActionProvider.cs --------- Co-authored-by: mvantzet <mark@radialsg.com>	2023-04-10 17:14:14 +01:00
mvantzet	76ce251a6e	Merge branch 'UglyToad:master' into PageSizesAndRotation	2023-03-17 19:35:55 +01:00
mvantzet	a07fdb8d45	Follow suggestion by @BobLd, added 2 more test cases for if we want to support more lenient page size parsing in the future.	2023-03-14 12:33:21 +01:00
mvantzet	0413f3f1bf	Fix related to page sizes / rotation / coordinate transformations (issue 560): The initial transformation matrix was incorrect, as it translated by the cropbox width/height instead of by the cropbox left/bottom offsets. Also, it did not translate the results back into the 1st quadrant so that (0,0) would (again) be the lower left corner origin for the cropped area. Added unit tests in new file ContentStreamProcessorTests. EFFECTIVE CHANGES: - The coordinates used for letters etc. are different now for rotated and/or cropped pages, but as those were not very consistent anyway this is probably OK. - The Page Size (A4, A3, Custom, etc.), Width and Height are now determined by the CropBox, not by the MediaBox; the CropBox ultimately determines what you see on screen and is printable. If no cropbox is defined in the PDF, it is set to the MediaBox; so in that case it is backwards compatible with the old code. - The Page MediaBox and CropBox properties are no longer rotated according to Page.Rotation. The Page Width and Height do take rotation into account (kept it backward compatible).	2023-03-09 16:42:09 +01:00
mvantzet	3a0a6e1411	Resolving page sizes did not work when the page orientation was landscape, or when side lengths were not integer or off by one. Added unit tests.	2023-03-09 16:09:14 +01:00
BobLD	c56705d4ff	Implement pdf functions and add type 0, 2 and 4 function tests	2023-03-08 18:59:16 +00:00
mvantzet	06253966e4	Added Letter properties RenderingMode, StrokeColor, FillColor and added those as mandatory constructor arguments. Kept property Color, which contains either StrokeColor (if rendering mode is Stroke) or FillColor (for all other rendering modes). In PdfPageBuilder opted for default text rendering mode "Fill" which seems like a sensible default.	2023-01-13 12:35:25 +01:00
Eliot Jones	37e31c40ae	Merge pull request #522 from fnatzke/master Fix #514 Print Character with ZapfDingbats font	2023-01-08 11:52:15 -05:00
Fred Natzke	8f13e2f11c	Fix #375 Add checkmark from ZapfDingbats using unicode character. Letters extracted have unicode values.	2022-12-17 17:39:15 +10:00
Eliot Jones	c8874c5984	#483 make skip missing fonts even more resilient to nonsense files	2022-12-11 16:18:09 -05:00
Eliot Jones	2aed996319	Merge pull request #517 from fnatzke/master Fixes for Issue#512, 516 and 519	2022-12-09 09:39:55 -05:00
Eliot Jones	95df15996b	add ability to rotate page by number of degrees, make builder fluent	2022-12-09 08:44:56 -05:00
Fred Natzke	29adece983	Original #516 fix causes Tests to fail. Alternative to use pagesByNumber?.Count in Catalog for discovered pages. Some other unrelated tests failed due to source using single rather than two character newlines. Changed to test string itself for '\r' rather than environment is Unix	2022-11-30 11:10:52 +10:00
Fred Natzke	a7f64ec64b	Fix Issue 516. Page Dictionary Count field has incorrect page count. Compare with PageTree children count.	2022-11-30 09:25:23 +10:00
Eliot Jones	e2246a88bb	#482 add skip missing fonts option and pass parsing options to content stream processor this doesn't fix the reported issue since the pdf itself is corrupted on page 8 however it will allow recovery in some scenarios where text content isn't important. also adds more informative error when stream unintentionally passed with non zero offset	2022-10-09 13:44:05 -04:00
Eliot Jones	c643facee0	#481 skip resource entry if null token	2022-10-09 13:06:04 -04:00
grinay	19962af011	Fix page number order. For case when root nod has reference to page the order will be incorrect. The case if root node has reference [2 0 R 3 0 R 10 0 R] Where 2 0 R is intermediate node containing page 1,2,3 Where 3 0 R is intermediate node containing page 4,5,6 Where 10 0 R is page 7 without that fix 7 page will in array as page 1	2022-07-29 18:27:56 +08:00

191 Commits