OK, this is a brain dump of everything I've learned about MDB files. I am
using Access 97, so everything I say applies to that version and may or may
not apply to others.
Right, so here goes:
Note: It appears that much of the data in the pages is uninitialized garbage.
This makes the task of figuring out the format a bit more challenging.
Pages
-----
MDB files are a set of pages. These pages are 2K (2048 bytes) in size, so in a
hex dump of the data they start on addresses like xxx000 and xxx800.
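
To make the address pattern concrete, here is a minimal sketch of the
arithmetic, assuming the 2K page size holds throughout the paged part of the
file. The names PAGE_SIZE, page_offset and page_of_offset are made up for this
example and do not come from any existing tool.

```python
PAGE_SIZE = 2048  # 2K pages, as observed in Access 97 files


def page_offset(page_number):
    """Byte offset at which the given zero-based page starts."""
    return page_number * PAGE_SIZE


def page_of_offset(offset):
    """Zero-based number of the page that contains the given byte offset."""
    return offset // PAGE_SIZE


# The first few pages start at 0x000000, 0x000800, 0x001000, 0x001800.
for n in range(4):
    print(f"page {n}: 0x{page_offset(n):06x}")
```

Every page start therefore ends in 000 or 800 in hex, which is the pattern
visible in a hex dump.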
The first byte of each page seems to be a type identifier. For instance, the
first page in the MDB file is 0x00, which no other page seems to share. Other
pages have values of 0x01, 0x02, 0x03, or 0x04, though the exact meaning of
these is currently a mystery (0x04 seems to be data, at a guess).
The second byte is always 0x01 as far as I can tell.
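
One way to check these observations against a real file is to tally the first
two bytes of every 2K page. The sketch below does that on an in-memory buffer;
tally_page_headers is a hypothetical helper, and the demo buffer is synthetic
rather than taken from an actual MDB file. Keep in mind that any non-paged
region in a real file would pollute the counts.

```python
from collections import Counter

PAGE_SIZE = 2048  # 2K pages, as observed in Access 97 files


def tally_page_headers(data):
    """Count (first byte, second byte) pairs over every complete page."""
    counts = Counter()
    for start in range(0, len(data) - PAGE_SIZE + 1, PAGE_SIZE):
        counts[(data[start], data[start + 1])] += 1
    return counts


# Synthetic three-page buffer, only to exercise the function.
demo = bytearray(3 * PAGE_SIZE)
demo[0:2] = b"\x00\x01"
demo[PAGE_SIZE:PAGE_SIZE + 2] = b"\x02\x01"
demo[2 * PAGE_SIZE:2 * PAGE_SIZE + 2] = b"\x04\x01"

for (ptype, second), n in sorted(tally_page_headers(bytes(demo)).items()):
    print(f"type 0x{ptype:02x}, second byte 0x{second:02x}: {n} page(s)")
```

For a real file, read it in binary mode and pass the resulting bytes to
tally_page_headers.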
At some point in the file the page layout is apparently abandoned, though the
very last 2K in the file again looks like a valid page. The purpose of this
non-paged region is so far unknown.
Bytes after the first and second seem to depend on the type of page, although
bytes 4-7 seem to indicate a page type of some sort: 02 00 00 00 is found on
all catalog pages.
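
A rough sketch of how the 02 00 00 00 marker could be used to list candidate
catalog pages follows. It assumes "bytes 4-7" are counted from zero (offsets
4 through 7 within the page), which the notes do not actually pin down, so
adjust MARKER_POS if that turns out to be wrong. All names here are made up
for the example, and the demo data is synthetic.

```python
PAGE_SIZE = 2048  # 2K pages, as observed in Access 97 files
CATALOG_MARKER = bytes([0x02, 0x00, 0x00, 0x00])
MARKER_POS = 4  # assumption: "bytes 4-7" are zero-based offsets 4..7


def catalog_page_offsets(data):
    """Return the start offsets of pages carrying the catalog marker."""
    offsets = []
    for start in range(0, len(data) - PAGE_SIZE + 1, PAGE_SIZE):
        field = data[start + MARKER_POS:start + MARKER_POS + 4]
        if field == CATALOG_MARKER:
            offsets.append(start)
    return offsets


# Synthetic two-page buffer: a blank page followed by a marked page.
plain = bytearray(PAGE_SIZE)
marked = bytearray(PAGE_SIZE)
marked[MARKER_POS:MARKER_POS + 4] = CATALOG_MARKER

print([hex(o) for o in catalog_page_offsets(bytes(plain + marked))])
```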
Pages seem to have two parts, a header and a data portion. The header starts
at the front of the page and builds up. The data is packed to the end of the
page. This means the last byte of the data portion is the last byte of the
page.
Byte Order
----------
All offsets to data within the file are in little-endian (Intel) order.
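
As an illustration of what little-endian means in practice, the sketch below
decodes values from raw bytes with Python's struct module. The 16-bit and
32-bit widths are assumptions made for the example, since the widths of the
actual fields are not established here, and read_u16 and read_u32 are
hypothetical helper names.

```python
import struct


def read_u16(buf, pos):
    """Decode an unsigned 16-bit little-endian value at buf[pos]."""
    return struct.unpack_from("<H", buf, pos)[0]


def read_u32(buf, pos):
    """Decode an unsigned 32-bit little-endian value at buf[pos]."""
    return struct.unpack_from("<I", buf, pos)[0]


# The least significant byte comes first, so 34 12 00 00 decodes to 0x1234.
sample = bytes([0x34, 0x12, 0x00, 0x00])
print(hex(read_u16(sample, 0)))
print(hex(read_u32(sample, 0)))
```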
Catalogs
--------
So far the first page of the catalog has always been seen at 0x9000 bytes into
the file. It is unclear whether this is always where it occurs, or whether a
pointer to this location exists elsewhere.
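
For what it's worth, 0x9000 is exactly page 18 when pages are 2K, so the
catalog at least starts on a page boundary. The sketch below shows that
arithmetic and pulls out the page at that offset; the fixed offset is only the
observation from above, not a confirmed rule, and read_catalog_page is a
hypothetical helper.

```python
PAGE_SIZE = 2048         # 2K pages, as observed in Access 97 files
CATALOG_OFFSET = 0x9000  # observed location only; may not hold for every file

# 0x9000 divided by 2048 gives page 18 with no remainder.
page_number, remainder = divmod(CATALOG_OFFSET, PAGE_SIZE)
print(page_number, remainder)


def read_catalog_page(data):
    """Return the 2K page at the observed catalog offset, or None if the
    buffer is too short to contain it."""
    end = CATALOG_OFFSET + PAGE_SIZE
    if len(data) < end:
        return None
    return data[CATALOG_OFFSET:end]
```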
The header of the catalog page(s) looks something like this: