PdfPig

lsm/PdfPig

Mirror of https://github.com/UglyToad/PdfPig.git, synced 2025-06-28 15:30:17 +08:00

Author	SHA1	Message	Date
BobLd	b462c3bea4	update PublicApiScannerTests	2020-04-12 10:15:12 +01:00
BobLd	064fa4922a	make Clipping internal do not throw errors when CurrentPath is null modify tests to match	2020-04-05 17:58:57 +01:00
Eliot Jones	5f45ee53bd	#127 add basic pdf/a-1b level compliance to the document builder adds color profiles/output intents and an xmp metadata stream to the document in order to be compliant with pdf/a-1b (basic). this compliance level is toggled on the builder since it will generate larger files and set to 'off/none' by default. pdf/a documents are also not able to use standard fonts so using a font when the compliance level is not none will throw.	2020-03-29 16:43:52 +01:00
Eliot Jones	2193063809	fix tests for public api and merge conflict the cross reference parser tests behaviour had changed, this fixes a compilation error from merge conflicts. also updates the merger tests to account for new version behaviour and checks the output document text. adds pdfmerger to the public api in the tests.	2020-03-02 17:00:16 +00:00
Eliot Jones	43574097f1	rename marked content elements and use factory since the properties in marked content may be indirect references or belong to the page resources array, the value should be calculated during content processing. this change tidies up the marked content classes so they do not expose mutable data and uses the pdf token scanner overloads to load dictionary data.	2020-01-09 15:30:16 +00:00
BobLd	097692f1cb	Move ArtifactType inside PdfArtifactMarkedContent	2020-01-09 11:24:32 +00:00
BobLd	7be36fdc58	Update PublicApiScannerTests 2	2020-01-08 11:07:27 +00:00
BobLd	4b929482cc	Update PublicApiScannerTests	2020-01-08 10:46:49 +00:00
Eliot Jones	00bd285262	add support for quadpoints to annotations highlight, link, strikeout, squiggly and underline annotation types may define a set of quadrilaterals using the quadpoints entry. this defines the regions to show/activate the annotation. the order of points in the quadpoints array does not match the specification so we provide a convenience class to access the point data rather than interpreting it as a rectangle: https://stackoverflow.com/questions/9855814/pdf-spec-vs-acrobat-creation-quadpoints.	2020-01-05 16:23:07 +00:00
Eliot Jones	b29354e3e6	move compact font format fonts to fonts project	2020-01-05 12:08:01 +00:00
Eliot Jones	d09b33af4d	move tokens to new project	2020-01-05 10:07:01 +00:00
Eliot Jones	1c38a2ae8a	move pdfline to the core project	2020-01-05 09:33:59 +00:00
Eliot Jones	15525acbaa	move document layout analysis and export to new project	2020-01-05 09:19:58 +00:00
Eliot Jones	a6541f1cfc	fix test references update references for unit tests to reference new core and fonts projects. all tests except the public api scanner tests now run successfully.	2020-01-04 22:56:41 +00:00
Eliot Jones	cf1b8651d6	make adler32checksum public there's no reason to keep adler32checksum internal so it is made public in case people find it useful.	2020-01-04 10:27:07 +00:00
Eliot Jones	b15a3a9b57	tidy up truetype tables * improves the naming of truetype related classes. * uses correct numeric type for the loca table. * makes a few related classes public.	2020-01-04 10:27:07 +00:00
BobLd	07f51712c6	Update PublicApiScannerTests	2020-01-03 12:31:23 +00:00
BobLd	3a060d9769	Update PublicApiScannerTests	2019-12-28 14:43:09 +00:00
Eliot Jones	e984180b3d	add method to retrieve any embedded files	2019-12-21 16:16:36 +00:00
Eliot Jones	7296c3c125	merge pull request #105 from BobLd/master whitespace covering algorithm and #104	2019-12-20 11:57:31 +00:00
BobLd	6dba5bb2b4	update PublicApiScannerTests	2019-12-18 11:43:39 +00:00
Eliot Jones	1fb416eee3	add convenience method to retrieve all hyperlinks and their text from annotations on a page	2019-12-18 11:41:02 +00:00
BobLd	1656411fcb	Improving Geometry classes with Tests	2019-12-14 11:41:11 +00:00
Eliot Jones	75a6260501	make cropbox public	2019-12-06 17:34:51 +00:00
Eliot Jones	2e5c995322	make external nodes different to document nodes and finish reimplementation	2019-12-05 13:21:19 +00:00
Eliot Jones	928347bcce	merge pull request #84 from BobLd/master add basic bookmarks extraction capabilities.	2019-12-04 14:24:10 +00:00
Eliot Jones	80f024dbed	make form access public	2019-11-27 16:36:25 +00:00
Eliot Jones	910e22a4e9	wrap checkboxes and radiobuttons in their own form field types with access to the child collections	2019-11-26 16:33:24 +00:00
BobLd	89daa2818e	update PublicApiScannerTests	2019-11-04 15:17:25 +00:00
BobLd	99f260befb	Enhancing NearestNeighbourWordExtractor - Making the code easier to read - Using 20% of Width instead of 60% - Making DefaultWordExtractor public	2019-10-21 20:51:27 +01:00
Eliot Jones	57dfee3211	move alto xml exporter to root export namespace	2019-10-17 10:46:43 +01:00
Eliot Jones	f14c52a05a	fix tests for renaming and validating generate alto xml	2019-10-15 13:59:09 +01:00
BobLd	e76badaeaf	Update PublicApiScannerTests with new public classes	2019-10-11 08:57:16 +01:00
BobLd	fe1a3c4b8b	updated from comments - still need to look at XmlWriter	2019-10-10 12:29:28 +01:00
Eliot Jones	2ef45f71d5	make missing acroform types public and start improving data also changes pages to use a proper tree structure since this will be required for resource inheritance and for acroform widget dictionaries.	2019-10-09 14:28:37 +01:00
BobLd	9ab943e1f9	Merge branch 'master' of https://github.com/UglyToad/PdfPig	2019-10-08 14:16:59 +01:00
Eliot Jones	68bcaf3901	#55 move support for images to page and add inline images support both xobject and inline images. adds unsupported filters so that exceptions are only thrown when accessing lazily evaluated image.bytes property rather than when opening the page. treat all warnings as errors.	2019-10-08 14:04:36 +01:00
BobLd	d939be1b9c	update PublicApiScannerTests 2	2019-10-07 16:09:30 +01:00
BobLd	f4f2b0e3fd	update PublicApiScannerTests	2019-10-07 16:02:11 +01:00
BobLd	93313118e9	Support for hOCR, AltoXml and PageXml output formats. Tested with: - 'hocrjs' for hOCR (see https://unpkg.com/hocrjs) - 'PAGE Viewer' for hOCR, AltoXml and PageXml (see http://www.primaresearch.org/tools/PAGEViewer)	2019-10-07 15:19:30 +01:00
Eliot Jones	f5e025aa70	merge pull request #58 from uglytoad/colors adds colors to letters and prepares code to add colors to paths.	2019-08-13 20:50:06 +01:00
Eliot Jones	f55091f3d2	make color types public and add stream based tests to prevent future breaking as observed in #52	2019-08-13 20:48:22 +01:00
Eliot Jones	980e67fabe	Merge pull request #56 from BobLd/master Document Layout Analysis - IPageSegmenter, Docstrum	2019-08-11 14:04:39 +01:00
Eliot Jones	0349bedd3e	#57 add access to document metadata and expose wrapper type	2019-08-11 12:42:30 +01:00
BobLd	c14d77e414	PublicApiScannerTests updated	2019-08-10 16:36:50 +01:00
BobLd	eb9a9fd00e	Document Layout Analysis - IPageSegmenter, Docstrum - Create a TextBlock class - Creates IPageSegmenter - Add other useful distances: angle, etc. - Update RecursiveXYCut - With IPageSegmenter and TextBlock - Make XYNode and XYLeaf internal - Optimise (faster) NearestNeighbourWordExtractor and isolate the clustering algorithms for use outside of this class - Implement a Docstrum inspired page segmentation algorithm	2019-08-10 16:01:27 +01:00
BobLd	801ea3ba7f	Modified PublicApiScannerTests	2019-08-07 14:22:39 +01:00
BobLd	83889cfb52	Document Layout Analysis - Text edges extractor Text edges are where words have either their BoundingBox's left, right or mid coordinate aligned on the same vertical line. Useful to detect tables, justified text, lists, etc.	2019-08-06 15:24:16 +01:00
Eliot Jones	0b9ae1db13	add color information to the operation context. create color classes for letters and paths to use	2019-08-04 16:47:47 +01:00
Eliot Jones	1d551d6de3	add and document core classes for colorspace information	2019-08-04 12:57:06 +01:00

77 Commits