Add PdfPig.Rendering.Skia as way to extract pdf images

BobLd
2025-10-05 11:27:56 +01:00
parent bb338a09df
commit ea308d1a4a

27
FAQ.md

@@ -92,6 +92,33 @@ using (var doc = PdfDocument.Open("my_document.pdf", parsingOption))
}
}
```
# Some Images Cannot Be Extracted (alternative)
Because PdfPig is not an image library, it has limitations when handling images (e.g. image resizing).
For more robust image extraction, you can also use https://github.com/BobLd/PdfPig.Rendering.Skia, which relies on [SkiaSharp](https://github.com/mono/SkiaSharp).
First, install the `PdfPig.Rendering.Skia` NuGet package. It can then be used as follows:
```csharp
using UglyToad.PdfPig;
using UglyToad.PdfPig.Rendering.Skia.Helpers;
[...]
// Open the document with the Skia-specific parsing options
using (var document = PdfDocument.Open(_path, SkiaRenderingParsingOptions.Instance))
{
    // Page numbers are 1-based
    for (int p = 1; p <= document.NumberOfPages; p++)
    {
        var page = document.GetPage(p);
        foreach (var pdfImage in page.GetImages())
        {
            // Convert the PdfPig image to a SkiaSharp image
            var skImage = pdfImage.GetSKImage();
            // Use SKImage
        }
    }
}
```
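As a rough sketch of what "Use SKImage" could look like, the snippet below encodes each extracted image as a PNG and writes it to disk. `SKImage.Encode`, `SKEncodedImageFormat.Png` and `SKData.SaveTo` are SkiaSharp calls. The input path, output directory, file naming and null checks are assumptions made for illustration and are not part of `PdfPig.Rendering.Skia`.

```csharp
using System.IO;
using SkiaSharp;
using UglyToad.PdfPig;
using UglyToad.PdfPig.Rendering.Skia.Helpers;

// Hypothetical paths, for illustration only.
const string inputPath = "my_document.pdf";
const string outputDirectory = "extracted_images";

Directory.CreateDirectory(outputDirectory);

using (var document = PdfDocument.Open(inputPath, SkiaRenderingParsingOptions.Instance))
{
    for (int p = 1; p <= document.NumberOfPages; p++)
    {
        var page = document.GetPage(p);
        int index = 0;

        foreach (var pdfImage in page.GetImages())
        {
            // Defensive assumption: skip images that could not be converted.
            using var skImage = pdfImage.GetSKImage();
            if (skImage is null)
            {
                continue;
            }

            // Encode as PNG (the quality argument is ignored for lossless PNG).
            using var data = skImage.Encode(SKEncodedImageFormat.Png, 100);
            if (data is null)
            {
                continue;
            }

            // Assumed naming scheme: page number plus a per-page image index.
            string fileName = $"page{p}_image{index++}.png";
            using var stream = File.Create(Path.Combine(outputDirectory, fileName));
            data.SaveTo(stream);
        }
    }
}
```

The `using var` declarations dispose the Skia objects at the end of each loop iteration, which matters because `SKImage` and `SKData` wrap native memory.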
# Duplicate Letters or Words Are Extracted
This often happens when the document was created using "fake bold" to embolden letters. In that case, the document creator duplicates each letter that is supposed to be bold with a slight offset, creating a thicker appearance. When this method is used, the PdfPig letter object will not be flagged as bold.
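
To make the mechanism concrete, here is a minimal, library-independent sketch of how such duplicates could be filtered out. The `Glyph` type, the `FakeBoldFilter` class and the `tolerance` parameter are hypothetical names introduced for illustration. They are not PdfPig types, and this is not necessarily how PdfPig handles the problem.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical, simplified stand-in for a letter: its text and the position of its origin.
public readonly record struct Glyph(string Value, double X, double Y);

public static class FakeBoldFilter
{
    // Keeps the first occurrence of each glyph and drops any later glyph with the same
    // value whose origin lies within `tolerance` (on both axes) of an already kept glyph.
    public static List<Glyph> RemoveDuplicates(IEnumerable<Glyph> glyphs, double tolerance)
    {
        var kept = new List<Glyph>();

        foreach (var glyph in glyphs)
        {
            bool isDuplicate = kept.Any(k =>
                k.Value == glyph.Value &&
                Math.Abs(k.X - glyph.X) <= tolerance &&
                Math.Abs(k.Y - glyph.Y) <= tolerance);

            if (!isDuplicate)
            {
                kept.Add(glyph);
            }
        }

        return kept;
    }
}
```

The tolerance should stay well below a typical glyph width, so that genuine double letters (such as the "ll" in "hello") sit far enough apart to be kept. The pairwise comparison is quadratic in the number of glyphs, which is acceptable for a sketch but may be slow on dense pages.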