PdfPig

lsm/PdfPig

mirror of https://github.com/UglyToad/PdfPig.git synced 2026-03-10 00:23:29 +08:00

Author	SHA1	Message	Date
Jason Nelson	7f42a8d60c	Reduce Allocations (#821) * Introduce ValueStringBuilder * Make NumericTokenizer and PlanTextTokenizer thread-safe * Replace ListPool with ArrayPoolBufferWriter * Seal ITokenizer classes * Eliminate array allocation in Type1ArrayTokenizer * Eliminate array allocation in AcroFormFactory * Eliminate StringBuilder allocation in Page.GetText * Optimize PdfSubpath.ToLines * Eliminate various allocations when parsing CompactFontFormat * Remove unused FromOctalInt helper * Ensure Pdf.Content is not null * Write ASCII values directly to stream (avoiding allocations) * Avoid encoding additional ASCII values * Eliminate allocations in TokenWriter.WriteName * Eliminate allocation in TokenWriter.WriteNumber * Add System.Memory reference to Fonts	2024-04-28 18:55:58 +01:00
Jason Nelson	6d54355754	Spanify filters	2024-04-12 07:42:19 +01:00
BobLd	acfe8b5fdd	Allow lenient parsing in DictionaryTokenizer and fix #791	2024-03-11 20:01:07 +00:00
Eliot Jones	6f59bed9a2	use pdfdocencoding when parsing strings	2023-06-04 16:40:43 +01:00
Eliot Jones	fc2f7b9325	add intelligent error recovery for known dictionaries #511. if we're parsing a known dictionary (e.g. all keys are required and there are no additional optional keys) and we encounter an error, we provide the possibility to recover by assuming a dictionary end token after all required tokens are consumed, if parsing by looking for dictionary end failed due to a format exception	2023-05-21 14:58:39 +01:00
Plaisted	4c807691b7	adding in PlainTokenizer to unpooled SB changes	2021-01-19 18:52:14 -06:00
Plaisted	feb6117e1e	fix EOL issues	2021-01-19 18:39:51 -06:00
Plaisted	0b716a759f	adding comment for non-static tokenizer	2021-01-19 18:18:33 -06:00
Plaisted	9bfe69aef1	removing locking	2021-01-19 18:06:50 -06:00
Eliot Jones	693a3d5958	use offset to file header to correct cross references. if the %pdf version header comment is offset from the start of the file the cross reference offsets will also be wrong by that amount. this change updates the cross reference location logic to use the offset from the located version header.	2020-01-26 15:30:20 +00:00
Eliot Jones	ba09a13d08	more end image recovery logic. since inline image data may contain the end image "ei" token inside the data stream there's no reliable way to actually determine if we've read all the data. for this reason if we end up with an invalid state parsing operations after we've read the end image token we try to recover by reading from the previous token to the next end image token if any. we supply log information to let the consumer know this is what we're doing. it's still not bullet-proof but it should be good enough. also support negative page rotation values by adding them to a 360 degree rotation so -90 degrees clockwise is 270 degrees clockwise.	2020-01-25 15:53:08 +00:00
Eliot Jones	bbde38f656	move tokenizers to their own project. since both pdfs and Adobe Type1 fonts use postscript type objects, tokenization is needed by the main project and the fonts project	2020-01-05 10:40:44 +00:00

12 Commits
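
Two of the commits above (693a3d5958 and ba09a13d08) describe small pieces of arithmetic: shifting cross-reference offsets by the position of the version header, and mapping a negative page rotation onto its positive equivalent. The following is a minimal Python sketch of that arithmetic only. It is not PdfPig's code (the library is written in C#), and every function name here is hypothetical, chosen for illustration.

```python
# Illustrative sketch only: hypothetical helper names, not PdfPig's API.

HEADER_MARKER = b"%PDF-"


def find_header_offset(data: bytes) -> int:
    """Return the byte position of the version header comment.

    Returns 0 if the marker is not found (an assumption made for this
    sketch; a real parser may handle a missing header differently).
    """
    position = data.find(HEADER_MARKER)
    return position if position >= 0 else 0


def correct_xref_offset(declared_offset: int, header_offset: int) -> int:
    """Shift a declared cross-reference offset by the header's offset.

    If the header sits N bytes into the file, offsets recorded relative
    to the header are wrong by N when read from the start of the file.
    """
    return declared_offset + header_offset


def normalize_rotation(degrees: int) -> int:
    """Map a rotation in degrees onto the range [0, 360).

    The commit describes adding a negative value to 360; taking the
    value modulo 360 gives the same result for -90 (270) and also
    covers values beyond a full turn.
    """
    return degrees % 360


if __name__ == "__main__":
    data = b"junk\n%PDF-1.7\n"
    shift = find_header_offset(data)
    print(shift, correct_xref_offset(100, shift), normalize_rotation(-90))
```

The modulo form is a design choice for the sketch rather than something the commit message specifies; it agrees with "360 + value" for negative values between -360 and 0.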