- Pass the initial matrix in to the annotation provider, so that it can return
  the correct rectangles / quad points.
- Changed and extended the Annotation class:
  - ModifiedDate is now a DateTimeOffset instead of an unparsed string.
    If the string is invalid, ModifiedDate is set to the default value.
  - Added lookup for the "appearance streams"; every annotation should have
    an "N" (normal) appearance, and may optionally have an "R" (roll-over/hover)
    and a "D" (down/click) appearance. The actual stream objects are not exposed,
    but a flag indicates whether "R" / "D" exist. At some point
    we can consider doing something with the appearances.
- Changed the signature of GetInitialMatrix / the ContentStreamProcessor constructor
  from PdfRectangle back to what it was earlier, namely MediaBox and CropBox,
  to prevent accidentally mixing the two up in the caller.
The initial transformation matrix was incorrect: it translated by the CropBox width/height
instead of by the CropBox left/bottom offsets. It also failed to translate the results back into
the first quadrant, which is needed so that (0,0) is again the lower-left corner of the cropped area.
Added unit tests in new file ContentStreamProcessorTests.
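
The corrected translation can be sketched as follows, for an unrotated page only. This is an illustrative Python sketch, not the library's code: `initial_translation` and `apply` are hypothetical names, the crop box values are made up, and matrices use the six-number PDF form (a, b, c, d, e, f). The extra step for rotated pages (moving the result back into the first quadrant) is omitted.

```python
def initial_translation(crop_left, crop_bottom):
    """Matrix (a, b, c, d, e, f) that moves the crop box's lower-left corner to (0, 0).

    Hypothetical helper: translates by the left/bottom OFFSETS of the crop box,
    not by its width/height (which was the bug described above).
    """
    return (1.0, 0.0, 0.0, 1.0, -crop_left, -crop_bottom)


def apply(matrix, x, y):
    """Transform a point with a PDF-style matrix: x' = a*x + c*y + e, y' = b*x + d*y + f."""
    a, b, c, d, e, f = matrix
    return (a * x + c * y + e, b * x + d * y + f)


# Made-up crop box: left=50, bottom=100, right=350, top=500.
m = initial_translation(50.0, 100.0)
lower_left = apply(m, 50.0, 100.0)    # crop box corner maps to the origin
upper_right = apply(m, 350.0, 500.0)  # opposite corner maps to (width, height)
```

With these numbers the lower-left corner of the crop box lands on the origin and the upper-right corner on (300, 400), i.e. the width and height of the cropped area.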
EFFECTIVE CHANGES:
- The coordinates used for letters etc. are different now for rotated and/or cropped pages,
but as those were not very consistent anyway this is probably OK.
- The Page Size (A4, A3, Custom, etc.), Width and Height are now determined by the CropBox,
not by the MediaBox; the CropBox ultimately determines what you see on screen and is printable.
If no CropBox is defined in the PDF, it is set to the MediaBox, so in that case the behavior is
backwards compatible with the old code.
- The Page MediaBox and CropBox properties are no longer rotated according to Page.Rotation.
The Page Width and Height do take rotation into account (kept it backward compatible).
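
A minimal sketch of the size rules listed above, in Python rather than the library's own language. `Box`, `effective_crop_box` and `page_size` are hypothetical names, and swapping width and height for 90/270 degree rotations is an assumption about what "take rotation into account" means here.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Box:
    """Hypothetical stand-in for a page boundary rectangle."""
    left: float
    bottom: float
    right: float
    top: float


def effective_crop_box(media_box: Box, crop_box: Optional[Box]) -> Box:
    # When the PDF defines no CropBox, fall back to the MediaBox.
    return crop_box if crop_box is not None else media_box


def page_size(media_box: Box, crop_box: Optional[Box], rotation: int) -> Tuple[float, float]:
    """Width and height derived from the crop box; the boxes themselves stay unrotated."""
    box = effective_crop_box(media_box, crop_box)
    width = box.right - box.left
    height = box.top - box.bottom
    # Assumption: a quarter-turn rotation swaps the reported width and height.
    if rotation % 180 == 90:
        width, height = height, width
    return (width, height)


media = Box(0.0, 0.0, 595.0, 842.0)
crop = Box(50.0, 100.0, 350.0, 500.0)
size_without_crop = page_size(media, None, 0)
size_cropped_rotated = page_size(media, crop, 90)
```

Without a crop box the size equals the media box size, which matches the backwards-compatible case described above.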
constructor arguments. Kept the Color property, which contains either StrokeColor (if the rendering mode is Stroke)
or FillColor (for all other rendering modes).
In PdfPageBuilder, opted for the default text rendering mode "Fill", which seems like a sensible default.
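
The selection rule for Color can be sketched like this. It is a Python illustration with hypothetical names (`TextRenderingMode`, `letter_color`), not the library's code; the enum lists only the first four of the PDF text rendering modes (0 to 3), and colors are plain strings for brevity.

```python
from enum import Enum


class TextRenderingMode(Enum):
    """Subset of the PDF text rendering modes, using their numeric values."""
    FILL = 0
    STROKE = 1
    FILL_THEN_STROKE = 2
    NEITHER = 3


def letter_color(mode, stroke_color, fill_color):
    # Stroke mode uses the stroke color; every other mode uses the fill color.
    return stroke_color if mode is TextRenderingMode.STROKE else fill_color


# Assumed default, mirroring the "Fill" default chosen for PdfPageBuilder.
DEFAULT_MODE = TextRenderingMode.FILL

stroke_case = letter_color(TextRenderingMode.STROKE, "red", "blue")
default_case = letter_color(DEFAULT_MODE, "red", "blue")
```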
in making it possible to override methods in the token writer, for example to filter
streams when writing using PdfDocumentBuilder.
The second step is injecting ITokenWriter into PdfDocumentBuilder.
This doesn't fix the reported issue, since the PDF itself is corrupted on page 8; however, it will
allow recovery in some scenarios where text content isn't important.
Also adds a more informative error when a stream is unintentionally passed with a non-zero offset.
When the root node references a page directly, the resulting page order is incorrect.
For example, if the root node has the references
[2 0 R 3 0 R 10 0 R]
where 2 0 R is an intermediate node containing pages 1, 2, 3,
3 0 R is an intermediate node containing pages 4, 5, 6,
and 10 0 R is page 7,
then without this fix page 7 ends up in the array as page 1.
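
The expected ordering is what a depth-first walk of the page tree produces, visiting the children of each node in array order. The sketch below is a Python illustration of that idea using plain dictionaries; `collect_pages`, `pages_node` and `page` are hypothetical names, and this is not the library's actual traversal code.

```python
def page(number):
    """Hypothetical leaf node: a single page."""
    return {"type": "Page", "number": number}


def pages_node(*kids):
    """Hypothetical intermediate (or root) node with an ordered list of kids."""
    return {"type": "Pages", "kids": list(kids)}


def collect_pages(node):
    """Depth-first, in-order walk: a page referenced directly by the root keeps
    its position relative to sibling intermediate nodes."""
    if node["type"] == "Page":
        return [node["number"]]
    result = []
    for kid in node["kids"]:
        result.extend(collect_pages(kid))
    return result


# Mirrors the example above: two intermediate nodes, then a direct page reference.
root = pages_node(
    pages_node(page(1), page(2), page(3)),  # 2 0 R
    pages_node(page(4), page(5), page(6)),  # 3 0 R
    page(7),                                # 10 0 R
)
order = collect_pages(root)
```

Here page 7 comes out last, after the pages of both intermediate nodes, rather than first.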