Updated Document Layout Analysis (markdown)

davebrokit
2024-05-30 09:27:54 +01:00
parent a62869395c
commit 0d8e113f50

@@ -177,12 +177,23 @@ using (var document = PdfDocument.Open("document.pdf"))
        foreach (var block in blocks)
        {
            // Do something
            // E.g. Output the blocks
            foreach (TextLine line in block.TextLines)
            {
                foreach (Word word in line.Words)
                {
                    Console.Write(word.Text + " ");
                }
                // End each text line so the output does not run together
                Console.WriteLine();
            }
        }
    }
}
```
#### Advanced cases
The method can be tailored by providing a __minimum block width__ and __horizontal and vertical gap sizes or functions__:
- Minimum block width is set to 1/3 of page width: