Commit Graph

68 Commits

Author SHA1 Message Date
EliotJones
b6bd0a3169 bump version to 0.1.12-alpha001 2025-07-26 13:43:28 -05:00
EliotJones
3d2e12cb16 version 0.1.11 2025-07-26 13:16:01 -05:00
EliotJones
85fc63d585 rework numeric tokenizer hot path
the existing numeric tokenizer involved allocations and string parsing. since
the number formats in pdf files are fairly predictable we can improve this
substantially
2025-07-25 18:12:43 +01:00
EliotJones
0586713da3 skip comments in pdf objects streams #926
the file provided in issue #926 contains the following syntax
in pdf object streams:

```
% 750 0 obj
<< >>
```

currently we read the comment token and skip the rest
however this producer is writing nonsense to the stream.
comment tokens are only valid outside streams in pdf files
so we align to the behavior of pdfbox here by skipping the
entire line containing a comment inside a stream which fixes
parsing this file.
2025-07-06 07:13:55 +01:00
BobLd
8f9194c9a4 Miscellaneous minor changes 2025-05-31 23:02:46 +01:00
BobLd
f84f2aceec Improve memory allocation by changing IFilter.Decode() signature to use Memory<byte> instead of ReadOnlyMemory/ReadOnlySpan 2025-05-29 12:41:50 +01:00
BobLd
ca9f70ffb0 Skip control chars in CoreTokenScanner.MoveNext() and fix #1048 2025-05-27 20:57:38 +02:00
BobLd
a4a0fe220a Bump version to 0.1.11-alpha001
2025-03-08 13:42:57 +00:00
BobLd
d36e9a900f version 0.1.10 2025-03-08 13:00:43 +00:00
BobLd
bcc8ccecbe Stop treating Warnings as Errors (#941) 2024-11-23 18:23:22 +00:00
Eliot Jones
c46722fa26 version 0.1.9
2024-10-06 15:40:24 +01:00
Jason Nelson
c6a7a2d0a2 Improve Code Quality (#831)
* Introduce globals

* Spanify TransformationMatrix.FromArray

* Eliminate allocation in GeometryExtensions.ParametricPerpendicularProjection

* Eliminate allocation in CrossReferenceTablePart.Parse

* Optimize Adam7 (eliminate virtual calls)

* Spanify QuadPointsQuadrilateral.Points to eliminate virtual calls

* Eliminate allocation in PdfRectangle.Normalize

* Format TransformMatrix

* Pass TransformationMatrix by reference in TransformationMatrix.Multiply

* Seal NoTextTokenWriter
2024-05-06 07:38:06 +01:00
Jason Nelson
da44e1a540 Improve code quality (#825)
* Avoid encoding ASCII in more cases

* Make Space a const

* Use WriteWhiteSpace extension to eliminate possible virtual call

* Use ASCII when encoding constrained character subset

* Simplify pragmas

* Revert Whitespace rename

* Fix using statement order

* Remove obsolete serialization support on .NET

* Remove obsolete serialization support on .NET (part 2)
2024-05-03 07:36:19 +01:00
Jason Nelson
7f42a8d60c Reduce Allocations (#821)
* Introduce ValueStringBuilder

* Make NumericTokenizer and PlanTextTokenizer thread-safe

* Replace ListPool with ArrayPoolBufferWriter

* Seal ITokenizer classes

* Eliminate array allocation in Type1ArrayTokenizer

* Eliminate array allocation in AcroFormFactory

* Eliminate StringBuilder allocation in Page.GetText

* Optimize PdfSubpath.ToLines

* Eliminate various allocations when parsing CompactFontFormat

* Remove unused FromOctalInt helper

* Ensure Pdf.Content is not null

* Write ASCII values directly to stream (avoiding allocations)

* Avoid encoding additional ASCII values

* Eliminate allocations in TokenWriter.WriteName

* Eliminate allocation in TokenWriter.WriteNumber

* Add System.Memory reference to Fonts
2024-04-28 18:55:58 +01:00
Jason Nelson
1ef2e127a6 Improve Code Quality (#818)
* Make AdobeFontMetricsLigature a struct

* Make AdobeFontMetricsCharacterSize a struct

* Eliminate allocation in CompactFontFormatData

* Pass TransformationMatrix by reference

* Seal Encoding classes

* Make SubTableHeaderEntry a readonly struct

* Introduce StringSplitter and eliminate various allocations in GlyphListFactory

* Eliminate a few substring allocations

* Use char overload on StringBuilder

* Eliminate virtual calls on stringIndex

* Optimize ReadHelper ReadLong and ReadInt methods

* Add additional readonly annotations to PdfRectangle

* Optimize NameTokenizer

* Eliminate allocation in TrueTypeGlyphTableSubsetter

* Use empty arrays

* Eliminate allocations in OperationWriteHelper.WriteHex

* Use simplified DecryptCbc method on .NET 6+

* Fix windows-1252 encoding not working on net6.0 and 8.0

* Update int buffers to exact unsigned max length and eliminate additional byte allocation

* Fix typo

* Remove unused constant
2024-04-18 19:58:40 +01:00
Jason Nelson
6d54355754 Spanify filters 2024-04-12 07:42:19 +01:00
Jason Nelson
f62929eb7c Spanify work 1 (#812)
* Add GetString(ReadOnlySpan<byte>) polyfill

* Add ArrayPoolBufferWriter

* Use Utf8.IsValid & char.IsAsciiHexDigit on NET8.0+

* Optimize HexTokenizer

* Eliminate various Tuple allocations

* Eliminate List allocation in CrossReferenceTable

* Eliminate various allocations in Ascii85Filter

* Spanify HexToken

* Spanify Palette

* Spanify various Cmap & font methods

* Spanify Type1Charstring classes

* Spanify PdfDocEncoding.TryConvertBytesToString

* Spanify OctalHelpers.FromOctalDigits

* Add missing braces

* React to HexToken.Byte type changes

* Cleanup

* [Tests] React to span changes

* Add ArgumentNullException check back to Type1CharstringDecryptedBytes

* Remove unsafe code

* Seal HexToken

* Avoid allocation when passing an empty span
2024-04-01 09:18:01 +01:00
Jason Nelson
907181d5ce Bump .net4.7 target to .net4.7.1 and eliminate System.ValueTuple dependency 2024-03-15 13:10:25 +00:00
Jason Nelson
4653671b2d Set LangVersion to 12 2024-03-15 13:10:25 +00:00
Jason Nelson
6da9c90042 Add .NET 8.0 target 2024-03-15 13:10:25 +00:00
Jason Nelson
8e0500e833 Drop unsupported .NET frameworks 2024-03-15 13:10:25 +00:00
BobLd
acfe8b5fdd Allow lenient parsing in DictionaryTokenizer and fix #791 2024-03-11 20:01:07 +00:00
BobLd
9f3d2745f6 Change NumericToken from IDataToken<decimal> to IDataToken<double> and fix #765 2024-02-18 14:53:38 +00:00
Eliot Jones
129e69fdf9 Merge pull request #638 from UglyToad/pdfdocencoding
Implement PdfDocEncoding for reading string tokens
2023-06-05 22:01:25 +01:00
Eliot Jones
2be4c69c16 set nightly release version
also toggles class back to public for nightly
versions #538
2023-06-05 21:37:34 +01:00
Eliot Jones
16ac297d10 version 0.1.8 2023-06-05 21:36:00 +01:00
Eliot Jones
6f59bed9a2 use pdfdocencoding when parsing strings 2023-06-04 16:40:43 +01:00
Eliot Jones
fc2f7b9325 add intelligent error recovery for known dictionaries #511
if we're parsing a known dictionary (e.g. all keys are required
and there are no additional optional keys) and we encounter
an error we provide the possibility to recover by assuming
a dictionary end token after all required tokens are consumed
if parsing by looking for dictionary end failed due to a format
exception
2023-05-21 14:58:39 +01:00
Eliot Jones
35ff13732e remove completely out of support net 4.5 2023-05-17 20:20:05 +01:00
Eliot Jones
d2944e14e5 change alpha version for nightly builds 2023-01-08 12:01:48 -05:00
Eliot Jones
f2188729a3 #453 handle messed up number format 2022-06-17 20:35:21 -04:00
Eliot Jones
03692cf42f set version to alpha of 0.1.7 for future nightly builds 2022-04-25 10:06:46 -04:00
Eliot Jones
5597a8f38c version 0.1.6 2022-04-25 09:22:47 -04:00
Eliot Jones
d7898d851c add net 6 as a target framework and dual target tests 2022-04-03 14:48:50 -04:00
Eliot Jones
a538aaf0de bump version so that nightly builds for pre-release have correct version 2022-01-10 12:40:47 +00:00
Eliot Jones
4c36f84a0d version 0.1.5 2021-09-17 11:16:43 -04:00
Eliot Jones
1b472f6992 handle messed up numbers in content #355 2021-08-11 20:56:06 -04:00
Eliot Jones
16d26effc5 0.1.5-alpha002 2021-05-09 13:05:54 -04:00
Eliot Jones
b930924b9b 0.1.5-alpha001 2021-02-28 13:55:25 -04:00
Plaisted
a0f0c4d6c7 switch to old syntax for build server 2021-01-19 18:53:44 -06:00
Plaisted
4c807691b7 adding in PlainTokenizer to unpooled SB changes 2021-01-19 18:52:14 -06:00
Plaisted
feb6117e1e fix EOL issues 2021-01-19 18:39:51 -06:00
Plaisted
0b716a759f adding comment for non-static tokenizer 2021-01-19 18:18:33 -06:00
Plaisted
9bfe69aef1 removing locking 2021-01-19 18:06:50 -06:00
Eliot Jones
237fd96f9e version 0.1.4 2020-11-29 14:02:42 -04:00
Eliot Jones
ad0fb4ec5b version 0.1.3 2020-11-15 12:08:37 -04:00
Eliot Jones
58ecfbf963 0.1.3-alpha001 2020-09-04 13:19:03 +01:00
Eliot Jones
98af575ee3 version 0.1.2 2020-07-04 16:55:14 +01:00
Eliot Jones
5fb04582a7 0.1.2-alpha003 2020-06-20 12:54:31 +01:00
Eliot Jones
256c2833ab 0.1.2-alpha002 2020-05-10 16:36:14 +01:00