updates to the format documentation from various folks

Brian Bruns
2011-03-20 15:29:53 -04:00
parent 5e4e31d1c9
commit 335115912d

HACKING

@@ -78,9 +78,13 @@ Not a lot is known about this page, and it is one of the least documented page
types. However, it contains things like Jet version, encryption keys, and name
of the creating program.
Offset 0x14 contains the Jet version of this database: 0x00 for 3, 0x01 for 4.
Offset 0x14 contains the Jet version of this database: 0x00 for 3, 0x01 for 4,
0x02 for 5, 0x03 for Access 2010.
This is used by the mdb-ver utility to determine the Jet version.
The 14 bytes starting at 0x42 are the (encrypted) database password.
The 4 bytes at 0x3e on the Database Definition Page are the database key.
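The offsets above can be read straight out of the raw bytes of the Database Definition Page. The following is only a minimal sketch: the function name and the returned dictionary keys are hypothetical, the version labels simply repeat the list above, and the password bytes are returned still encrypted.

```python
# Labels taken from the list above; anything else is reported as unknown.
JET_VERSIONS = {
    0x00: "Jet 3",
    0x01: "Jet 4",
    0x02: "Jet 5",
    0x03: "Access 2010",
}


def read_header_fields(page: bytes) -> dict:
    """Pull the fields described above out of a raw definition page.

    Hypothetical helper: offsets and lengths are the ones given in this
    document (0x14, 0x3e, 0x42); nothing is decrypted here.
    """
    if len(page) < 0x42 + 14:
        raise ValueError("page too short to contain the password field")
    version_byte = page[0x14]
    return {
        "jet_version": JET_VERSIONS.get(
            version_byte, "unknown (0x%02x)" % version_byte
        ),
        "db_key": page[0x3E:0x3E + 4],               # 4 bytes at 0x3e
        "encrypted_password": page[0x42:0x42 + 14],  # 14 bytes at 0x42
    }
```

Note that the 4-byte key at 0x3e ends exactly where the password field at 0x42 begins.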
Data Pages
----------
@@ -220,6 +224,13 @@ else
The length (partial if type 2) for the memo is:
memo_page_len = offset_stop - offset_start
Update: The bitmask can't be an entire byte long.
OLE fields can hold up to 1 GB. That requires at least 30 bits, leaving only 2
bits for flags. Maybe 0xC0000000 is sometimes ignored?
See http://office.microsoft.com/en-us/access-help/access-2007-specifications-HA010030739.aspx
which gives the number of characters in a Memo field as: 65,535 when entering
data through the user interface; 2 gigabytes of character storage when entering
data programmatically. That would mean 31 bits for length.
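To make the two hypotheses concrete, the sketch below splits a 32-bit memo length field into a length part and a flag part. It is an illustration only: the function name and the length_bits parameter are hypothetical, and whether the real split is 30/2 or 31/1 is exactly the open question noted above.

```python
def split_memo_len(raw: int, length_bits: int = 30):
    """Split a 32-bit memo length field into (length, flags).

    length_bits=30 models the "2 bits for flags" guess (flag mask
    0xC0000000); length_bits=31 models the "31 bits for length" guess.
    Both are assumptions, not confirmed behaviour.
    """
    if not 1 <= length_bits <= 32:
        raise ValueError("length_bits must be between 1 and 32")
    raw &= 0xFFFFFFFF                      # treat the field as unsigned 32-bit
    length = raw & ((1 << length_bits) - 1)
    flags = raw >> length_bits             # whatever is left above the length
    return length, flags
```

For example, a raw value of 0x80000010 yields a length of 0x10 under either guess; only the interpretation of the top bits differs.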
LVAL (Long Value) Pages
-----------------------
@@ -286,7 +297,7 @@ next_pg field.
| ???? | 2 bytes | num_var_cols| Number of variable columns in table |
| ???? | 2 bytes | num_cols | Number of columns in table (repeat) |
| ???? | 4 bytes | num_idx | Number of indexes in table |
| ???? | 4 bytes | num_real_idx| Number of indexes in table (repeat) |
| ???? | 4 bytes | num_real_idx| Number of index entries |
| ???? | 4 bytes | used_pages | Points to a record containing the |
| | | | usage bitmask for this table. |
| ???? | 4 bytes | free_pages | Points to a similar record as above, |
@@ -307,7 +318,7 @@ next_pg field.
| ???? | 1 byte | precision | precision if numeric column |
| ???? | 1 byte | scale | scale if numeric column |
| ???? | 2 bytes | ??? | |
| ???? | 1 byte | bitmask | low order bit indicates variable columns |
| ???? | 1 byte | bitmask | See Column flags below |
| ???? | 2 bytes | offset_F | Offset for fixed length columns |
| ???? | 2 bytes | col_len | Length of the column (0 if memo) |
+-------------------------------------------------------------------------+
@@ -331,7 +342,7 @@ next_pg field.
+-------------------------------------------------------------------------+
| ???? | 4 bytes | index_num | Number of the index |
| | | |(warn: not always in the sequential order)|
| ???? | 4 bytes | index_num2 | Number of the index (repeat) |
| ???? | 4 bytes | index_num2 | Index into index cols list |
| 0x00 | 1 byte | ??? | |
| 0xFF | 4 bytes | ??? | |
| 0x00 | 4 bytes | ??? | |
@@ -343,8 +354,11 @@ next_pg field.
| ???? | 1 byte | idx_name_len| len of the name of the index |
| ???? | n bytes | idx_name | Name of the index |
+-------------------------------------------------------------------------+
| ???? | n bytes | ??? | |
| 0xFF | 2 bytes | ??? | End of the tableDef ? |
| Iterate while col_num != 0xffff |
+-------------------------------------------------------------------------+
| ???? | 2 bytes | col_num | Column number with variable length |
| ???? | 4 bytes | arg1 | |
| ???? | 4 bytes | arg2 | |
+-------------------------------------------------------------------------+
+-------------------------------------------------------------------------+
@@ -363,7 +377,7 @@ next_pg field.
| ???? | 2 bytes | num_var_cols| Number of variable columns in table |
| ???? | 2 bytes | num_cols | Number of columns in table (repeat) |
| ???? | 4 bytes | num_idx | Number of indexes in table |
| ???? | 4 bytes | num_real_idx| Number of indexes in table (repeat) |
| ???? | 4 bytes | num_real_idx| Number of index entries |
| ???? | 4 bytes | used_pages | Points to a record containing the |
| | | | usage bitmask for this table. |
| ???? | 4 bytes | free_pages | Points to a similar record as above, |
@@ -382,8 +396,8 @@ next_pg field.
| ???? | 2 bytes | col_num | Column Number (includes deleted columns) |
| ???? | 2 bytes | offset_V | Offset for variable length columns |
| ???? | 2 bytes | col_num | Column Number |
| ???? | 4 bytes | ??? | prec/scale? verify |
| ???? | 1 byte | bitmask | low order bit indicates variable columns |
| ???? | 4 bytes | ??? | prec/scale? or LCID (0x409=English)? |
| ???? | 1 byte | bitmask | See column flags below |
| ???? | 1 byte | ??? | seems to be 1 when variable len |
| 0000 | 4 bytes | ??? | |
| ???? | 2 bytes | offset_F | Offset for fixed length columns |
@@ -413,7 +427,7 @@ next_pg field.
| ???? | 4 bytes | unknown | matches first unknown definition block |
| ???? | 4 bytes | index_num | Number of the index |
| | | |(warn: not always in the sequential order)|
| ???? | 4 bytes | index_num2 | Number of the index (repeat) |
| ???? | 4 bytes | index_num2 | Index into index cols list |
| 0x00 | 1 byte | ??? | |
| 0xFF | 4 bytes | ??? | |
| 0x00 | 4 bytes | ??? | |
@@ -426,9 +440,24 @@ next_pg field.
| ???? | 2 bytes | idx_name_len| len of the name of the index |
| ???? | n bytes | idx_name | Name of the index (UCS-2) |
+-------------------------------------------------------------------------+
| ???? | n bytes | ??? | |
| 0xFF | 2 bytes | ??? | End of the tableDef ? |
| Iterate while col_num != 0xffff |
+-------------------------------------------------------------------------+
| ???? | 2 bytes | col_num | Column number with variable length |
| ???? | 4 bytes | arg1 | |
| ???? | 4 bytes | arg2 | |
+-------------------------------------------------------------------------+
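The "Iterate while col_num != 0xffff" rows above can be read with a simple loop. This is a sketch under assumptions: the function name is hypothetical, the fields are taken to be little-endian, and the meaning of arg1 and arg2 is not known, so they are returned as plain integers.

```python
import struct


def parse_trailing_entries(buf: bytes, offset: int = 0):
    """Read (col_num, arg1, arg2) entries until col_num == 0xFFFF.

    Layout per the table above: 2-byte col_num, then 4-byte arg1 and
    4-byte arg2. Little-endian byte order is an assumption. Returns the
    list of entries and the offset just past the 0xFFFF terminator.
    """
    entries = []
    while True:
        (col_num,) = struct.unpack_from("<H", buf, offset)
        offset += 2
        if col_num == 0xFFFF:              # terminator: no args follow
            break
        arg1, arg2 = struct.unpack_from("<II", buf, offset)
        offset += 8
        entries.append((col_num, arg1, arg2))
    return entries, offset
```

If the buffer ends before a terminator is seen, struct.unpack_from raises struct.error, which is a reasonable signal that the tableDef continues on another page.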
Column flags (not complete):
0x01: variable length column
0x02: can be null
0x04: is auto long
0x10: replication related field (or hidden?). These columns start with "s_" or
"Gen_" (the "Gen_" fields are for memo fields)
0x40: is auto guid
0x80: hyperlink. Syntax is "Link Title#http://example.com/somepage.html#" or
"#PAGE.HTM#"
In Access 2007 and Access 2010, "Complex Columns" (multivalued fields, version
history, attachments) always have the flag byte set to exactly 0x07.
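A flag byte can be decoded against the list above as follows. This is a sketch, not a definitive decoder: the helper names and label strings are hypothetical, and bits missing from the (incomplete) list are reported separately rather than guessed at.

```python
# Bit values and meanings as listed above (list is not complete).
COLUMN_FLAGS = {
    0x01: "variable length",
    0x02: "can be null",
    0x04: "auto long",
    0x10: "replication related",
    0x40: "auto guid",
    0x80: "hyperlink",
}


def decode_column_flags(flag_byte: int):
    """Return (known flag names, remaining undocumented bits)."""
    names = [name for bit, name in COLUMN_FLAGS.items() if flag_byte & bit]
    known_mask = sum(COLUMN_FLAGS)         # OR of all documented bits (0xD7)
    unknown = flag_byte & ~known_mask & 0xFF
    return names, unknown


def may_be_complex_column(flag_byte: int) -> bool:
    """Complex columns are said to have exactly 0x07.

    The reverse is not claimed: an ordinary column could presumably
    carry the same value, so treat this as a necessary condition only.
    """
    return flag_byte == 0x07
```

Keeping the undocumented bits (0x08 and 0x20 with the current list) visible makes it easier to spot new flags when examining real files.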
Index flags (not complete):
0x01 Unique
@@ -444,11 +473,11 @@ Column Type may be one of the following (not complete):
MONEY = 0x05 /* Currency (64 bits) */
FLOAT = 0x06 /* Single (32 bits) */
DOUBLE = 0x07 /* Double (64 bits) */
SDATETIME = 0x08 /* Short Date/Time (64 bits) */
DATETIME = 0x08 /* Date/Time (64 bits) */
BINARY = 0x09 /* Binary (255 bytes) */
TEXT = 0x0A /* Text (255 bytes) */
OLE = 0x0B /* OLE */
MEMO = 0x0C /* Memo, Hyperlink */
OLE = 0x0B /* OLE = Long binary */
MEMO = 0x0C /* Memo = Long text */
UNKNOWN_0D = 0x0D
UNKNOWN_0E = 0x0E
REPID = 0x0F /* GUID */