filter provider becomes single instance and no longer has constructor parameters.
tokenizers use list and stringbuilder pools to reduce allocations.
system font finder becomes static to preserve file cache across all documents.
it is possible for a file with an adobe type 1 font to contain an invalid charstring sequence, if this happens we handle it and return false from trygenerate.
after we split the solution into multiple projects the xml doc comments were no longer packed in the generated nuget package. in addition they were only generated for the net standard 2.0 target framework.
this change generates comments for all target frameworks and makes sure they're included in the generated package. it also adds missing doc comments where they weren't included on the public api and clears up a couple of minor formatting issues in the affected files.
the individual cff parser uses a cff dictionary reader inside it which has a per-instance operands list, for this reason it is not thread-safe and cannot be shared. this change creates a new individual font parser for each call to the top-level cff parser.
where the value is 28 the next two bytes indicate a short, not a 16 bit two's complement number, apparently, or i've misunderstood what the two's complement bit is about...
the byte value of 28 indicates the next 2 bytes are a 16 bit two's complement number rather than just a short. this changes the calculation to generate the two's complement value correctly.
an encoding array in an adobe type 1 font may be missing its declaration ending in 'for', if we encounter 'dup' while looking for the 'for' token we have a special case to go straight into reading the encoding.
also handles a case where the page content stream contains a path-closing operator without any path being active.
it appears charstring definitions in adobe type 1 fonts can omit the charstring name. in this case we set the name to the string value of the charstring index.
if the font uses a named encoding which is not recognised, use the corresponding encoding based on the base font name, or fall back to windows ansi encoding.
glyph data in TrueType fonts can be very large and slow to parse. to avoid this we store the raw table data at parsing time and enable lazy loading of glyph descriptions.
by default nuget pack does not include project dependencies. this is suboptimal since it would require managing at least 5 nuget packages. this uses a workaround detailed here https://github.com/nuget/home/issues/3891 to copy the dependent dlls to the generated nuget package. this doesn't resolve the issue of how we publish the documentlayoutanalysis project, since it is the top of the dependency tree and we publish its parent, rather than it.
the 3 font types mentioned are moved to the new fonts project, any referenced types are moved to the core project. most truetype classes are made public #8.
to make the project more useful and expose more usable classes we're rearchitecting in the following way. code used to read fonts from external file formats like truetype, adobe font metrics (afm) and adobe type 1 fonts are moving to a new project which doesn't reference most of the pdf logic. the shared logic is moving to a new flat-structured project called core. this is a sort-of onion type architecture, with core being the... core, fonts being the next layer of the onion, pdfpig itself the next. this will then support additional libraries/projects as outer layers of the onion as well as releasing standalone version of the font library as pdfbox does with fontbox.