5 Commits

Author SHA1 Message Date
Eliot Jones
07df6fd740 read last line of ignore file (#1155)
* read last line of ignore file

- do not cancel other matrix jobs if one test fails
- read all lines of the ignore list even if it doesn't end with a newline
- add ignore list for 0008 and 0009

* support missing object numbers when brute-forcing

the file 10404 (ironically) contains not-found references to object 43 0
for its info dictionary. change the brute-force code so that objects can
be entirely missing

* fix test since document is now opened successfully but mediabox is broken
2025-09-13 16:57:35 +02:00
EliotJones
77db6c6b54 add test jobs for common crawl 0000 to 0007 2025-09-13 14:52:04 +01:00
BobLd
c8dff885bd Update run_common_crawl_tests.yml 2025-08-03 08:56:17 +01:00
BobLd
a5e92cd11c Update run_common_crawl_tests.yml 2025-07-19 12:21:10 +01:00
EliotJones
4bf746c747 add new action to run integration against common crawl corpus 2025-07-19 11:49:34 +01:00