CARDFILE.EXE is a small, simple database program that Microsoft included in Windows 3. It has a fixed database format consisting of a 39 character title, an optional text field of 440 characters, and a single optional OLE object. The title is also used as the sort key. The OLE object can be pretty much anything that can be accessed via OLE -- a picture (BMP format only?), a sound, a program link, etc.
The cardfile program was ignored by most people, but a few people have found uses for it. Although CARDFILE was only shipped with Windows 3 (and Windows NT rel 3.51?), it runs fine with Windows 9x, Me, 2000, and XP. It runs after a fashion on WINE under Linux, but there are serious font problems and the OLE functions do not work. The WINE problems may or may not be insurmountable. (I'm looking into it. Note, Aug 2010 -- never found a satisfactory answer.)
The cardfile data format is extraordinarily complex for a rather simple program. For a wonder, Microsoft has actually documented the format in Support Doc 99340 (https://support.microsoft.com/?scid=kb%3Ben-us%3B99340). Regrettably, the description document, while possibly correct, is quite incomprehensible. I needed to convert a cardfile database to another format. I tried to use the MS documentation to do so and found that decoding the document was not really possible. At least I can't do it. So, I wrote my own description document.
The cardfile file can be viewed as having three sections. There is a short header. The header is followed by a section containing the index data as a set of 52 byte records. The index data is followed by a set of variable length records containing object and/or text data for each index entry. The Windows 3 version of cardfile creates and uses eight bit ASCII encoding of text. The MS documentation implies that there is also a 16 bit Unicode variant of the file, but gives no real clue as to which fields are expanded to 16 bits per character, or whether there are other changes to the file format.
Section | Content |
---|---|
Header | Signature, Last Object ID, Number of Cards |
Indices | 39 character titles, pointer into Data section, and some trash |
Data | One Object and/or ASCII text field for each Index. In same order as Indices |
In the following discussion, all numbers describing the document are decimal, counting from 1, unless otherwise stated. Pointers within the file are, of course, binary, and presumably count from zero. Thus, for example, there is a data pointer (which will be binary, zero-based) in bytes 7-10 (decimal, counting from byte 1) of each index section entry.
Technically, ASCII is a seven bit code with an eighth, undefined, leading bit. It is not clear whether the 8th (leading) bit of characters -- often used to encode non-ASCII characters -- is preserved. The MS documentation implies that it might not be, but it would take explicit effort not to preserve it, so it possibly is carried along. A quick (and not very thorough) test indicates that all 8 bits are in fact preserved in both the title and text fields. The extra 128 characters may have to be input using right-Alt and the keypad. The exact character displayed will depend on the code page used on the target PC.
Non-ASCII data storage appears to be in the usual, unpleasant-to-work-with Wintel Least Significant Byte First format.
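
The two conventions above (byte positions counted from 1, integers stored Least Significant Byte First) are easy to trip over, so here is a minimal Python sketch of both. The helper names `field` and `le_uint` and the sample bytes are made up for illustration; nothing here comes from MS99340.

```python
def field(record: bytes, first: int, last: int) -> bytes:
    """Return bytes first..last of record, both 1-based and inclusive,
    matching the way byte ranges are written in this document."""
    return record[first - 1:last]


def le_uint(raw: bytes) -> int:
    """Decode an unsigned integer stored Least Significant Byte First."""
    return int.from_bytes(raw, "little")


# A made-up 10 byte record: six filler bytes, then the bytes 34 01 00 00 (hex).
sample = bytes([0, 0, 0, 0, 0, 0, 0x34, 0x01, 0x00, 0x00])

# "Bytes 7-10" in this document's numbering is the Python slice [6:10].
pointer = le_uint(field(sample, 7, 10))
print(pointer)  # 308, i.e. 0x00000134
```

Reading the same four bytes most significant byte first would give 0x34010000 instead, which is the usual symptom of getting the byte order wrong.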
Cardfile data files normally have the extension .CRD.
Known limits on cardfile. Most of the limits result from the format.
Value | Maximum |
---|---|
Index Text Size | 39 Bytes |
Data Text Size | 440 bytes (End of Line can be embedded, but each EOL consumes two bytes) |
Number of cards in file | 65535 |
Maximum length of the file | About 4.29GB, but the OS will very likely limit it to less |
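
The two largest limits look like plain consequences of field widths. The arithmetic below is a sketch under that assumption (a 16 bit card count and a 32 bit data pointer); the link between the limits and the field widths is an inference from the tables in this document, not a documented fact.

```python
# Assumption: the card count is a 16 bit unsigned field and the data
# pointer is a 32 bit unsigned field, as the record tables suggest.
COUNT_BITS = 16
POINTER_BITS = 32
INDEX_RECORD_BYTES = 52

max_cards = 2**COUNT_BITS - 1          # largest value a 16 bit count can hold
max_file_bytes = 2**POINTER_BITS       # addressable range of a 32 bit pointer
max_index_bytes = max_cards * INDEX_RECORD_BYTES

print(max_cards)                       # 65535
print(round(max_file_bytes / 1e9, 2))  # 4.29 (GB, decimal)
print(max_index_bytes)                 # 3407820 bytes of index for a full file
```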
The Cardfile header begins with a three byte signature. This is normally the characters RRG, but MS implies that it could be DKO or MGC. It is unclear whether the signature has anything to do with the possible use of Unicode characters. The RRG signature files seem to contain one more four byte word in the header than do the MGC files. (Aug 2010: see the Afternote, somewhere below, for more information on MGC files.)
CARDFILE HEADER RECORD | ||
---|---|---|
Bytes | Contents | Notes |
1 thru 3 | The ASCII characters 'RRG' ('525247' hex) | Might be 'DKO' or 'MGC'. It's not clear which signatures the format described here applies to. |
4 thru 7 | ID of Last Object | It's not clear what, if anything, this is good for. I suspect that this might be used to create the next "Unique Object ID" in the Object data record by adding one to the current value. It appears not to be present in the MGC signature file format. |
8 and 9 (and 10 and 11) | Number of 'cards' (entries) in the file | This is binary (of course) and appears to count from 1. And it's four bytes, not two. |
(12 thru 15) | Four bytes. Always 0? | Not documented |
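
A minimal Python sketch of reading the documented part of the header table above. It assumes an RRG file and reads only bytes 1 thru 9. The card count is taken from bytes 8 and 9 alone; if the count really is four bytes wide, the result is the same whenever the upper two bytes are zero. The names `Header` and `parse_rrg_header` are made up for this example, and the sample bytes are synthetic.

```python
from typing import NamedTuple


class Header(NamedTuple):
    signature: str
    last_object_id: int
    card_count: int


def parse_rrg_header(data: bytes) -> Header:
    """Parse bytes 1 thru 9 of an RRG-style header (1-based numbering)."""
    if len(data) < 9:
        raise ValueError("file too short to hold a Cardfile header")
    signature = data[0:3].decode("ascii", errors="replace")
    if signature != "RRG":
        # DKO and MGC files may be laid out differently; do not guess.
        raise ValueError(f"unsupported signature: {signature!r}")
    last_object_id = int.from_bytes(data[3:7], "little")  # bytes 4 thru 7
    card_count = int.from_bytes(data[7:9], "little")      # bytes 8 and 9
    return Header(signature, last_object_id, card_count)


# Synthetic header: 'RRG', last object ID 5, three cards, six zero bytes.
sample_header = (b"RRG" + (5).to_bytes(4, "little")
                 + (3).to_bytes(2, "little") + bytes(6))
header = parse_rrg_header(sample_header)
print(header)
```

The sketch deliberately does not decide where the Index section starts, since that depends on how many of the trailing zero bytes belong to the header.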
The Index section contains one fixed length record (52 bytes) for each card (entry) in the file. An Index entry consists of a text description of the item (up to 39 characters, zero terminated), a 32 bit pointer to the rest of the data, and some trash. The cards presented to the user are ordered by the text description, which is sorted as an ASCII string. The text descriptions are zero terminated strings and do not have to fill the whole field. The remainder of the field will be filled with whatever was there previously -- a small security risk. The risk is small, not because failing to blank or zero fill the unused data space is a good idea, but because it isn't very likely that the inadvertently exposed data will happen to be safe combinations, passwords, etc.
INDEX SECTION RECORD FORMAT | ||
---|---|---|
Bytes | Contents | Notes |
1 thru 6 | Unused bytes ('Reserved') | Should all be 00 hex |
7 thru 10 | Pointer to Card's Data (i.e. object and/or text) | This appears to be binary relative to the start of the file. |
11 | 'Flag Byte' | This is always 0. I infer that the byte is probably not used |
12-51 | Index Text | ASCII Text--Zero terminated |
52 | Null Byte (always 0) | I infer this is here in order to make sure that an improperly terminated Index string doesn't cause trouble |
(53) | Null Byte? (always 0?) | There appears to be one additional undocumented zero byte here |
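
A minimal Python sketch of decoding one Index record as laid out in the table above. It handles only the 52 documented bytes and ignores the possible extra byte 53. The names `IndexEntry` and `parse_index_record` are made up for this example, and the default code page (cp1252) is an assumption; the right choice depends on the PC that wrote the file.

```python
from typing import NamedTuple


class IndexEntry(NamedTuple):
    data_offset: int  # pointer to the card's data, relative to start of file
    flag: int
    title: str


def parse_index_record(record: bytes, encoding: str = "cp1252") -> IndexEntry:
    """Decode one 52 byte Index record (byte numbers in comments are 1-based)."""
    if len(record) < 52:
        raise ValueError("an Index record is 52 bytes long")
    data_offset = int.from_bytes(record[6:10], "little")  # bytes 7 thru 10
    flag = record[10]                                     # byte 11
    raw_title = record[11:51]                             # bytes 12 thru 51
    # Cut at the first zero byte so leftover junk after the terminator is dropped.
    title = raw_title.split(b"\x00", 1)[0].decode(encoding, errors="replace")
    return IndexEntry(data_offset, flag, title)


# Synthetic record: the title is followed by stale bytes, as described above.
title_field = b"Smith, J\x00leftover".ljust(40, b"?")
sample_record = (bytes(6) + (512).to_bytes(4, "little") + b"\x00"
                 + title_field + b"\x00")
entry = parse_index_record(sample_record)
print(entry)
```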
The cardfile data section consists of variable length records in the same order as the Index section. These records may contain an object and/or a text field (up to 440 bytes) or neither. The object, if it is present, is stored first. The object formats are apparently in some standard Microsoft format that can be passed to the appropriate Windows API logic simply by passing a pointer to the first byte. There are three different object formats. To further add to the fun, it is necessary to skip over the object (if it is present) in order to get to the text data. And none of the three object formats contains anything as mundane as an explicit indication of the length of the object.
Needless to say the "format" of the Data Section is tedious to describe and frustrating to deal with. I'm going to treat it as a series of records each of which contains three subrecords. e.g.
Record | SubRecords |
---|---|
Data Section Record 1 | Data Section SubRecord 1 - Header |
 | Data Section SubRecord 1 - Object (Optional) |
 | Data Section SubRecord 1 - Text |
Data Section Record 2 | Data Section SubRecord 2 - Header |
 | Data Section SubRecord 2 - Object (Optional) |
 | Data Section SubRecord 2 - Text |
.... | .... |
The header for each data section record consists of a two byte flag which will be non-zero if an object record is present. A text record is assumed to exist. If no text is present, the text record will consist of two zero bytes.
DATA SECTION SUBRECORD HEADER | ||
---|---|---|
Bytes | Contents | Notes |
1 and 2 | Object flag | 00 hex = No Object. Anything other than 00 hex = Object Present. (So far, the only values I have seen here are 0 and 1.) |
The optional object is an Embedded, Linked, or Static OLE object whose format is purportedly fully described in Appendix C of the "Object Linking and Embedding Programmer's Reference" (Version 1.0). The format is also purportedly described in the Windows SDK "Programmer's Reference, Volume 1: Overview, Chapter 6, Object Storage Format". Since I, like most users, do not intend to use or alter the object contents, I did not verify these references. However, I believe that the Windows SDK is available from Microsoft as a half gigabyte (compressed!) download.
What is presented here is what Microsoft talks about in their card file description modified by what I can observe in file dumps. This is just enough information to skip over, read, or restore an unaltered OLE object. It is possible that a properly formed call to OleLoadFromStream in the Windows API passing the appropriate pointer to the Object will read (?) and/or invoke the object.
The layout format of the object data given in MS99340 appears to contradict the format implied by the Algorithms section of the same document. I suspect that the two are not actually contradictory, but that in order to make them consistent, one needs to look at the world from some very unobvious perspective. Not being clairvoyant, I am using the Algorithm-implied layout, which I (perhaps mistakenly) believe that I can understand. Strictly speaking, it's correct, but somewhat unhelpful because it fails to explicitly define a few minor things like the existence and layout of the first twelve bytes of the object entry.
DATA SECTION RECORD -- OBJECT | |||
---|---|---|---|
Bytes | Contents | Notes | |
1 thru 4 | Unique Object ID | This is described as the "Unique Object ID". It seems to be set to a small 32 bit integer. I don't really know what this is good for. | |
5 thru 8 | OLE Version ID | Per MS99340 this is Version 1.0 -- '01 00'. In practice, it seems always to be 01 05 00 00. (I may have combined two 16 bit fields into one 32 bit number here) | |
9 thru 12 | Object Format | Object Format: 1 = Linked, 2 = Embedded, 3 = Static. This seems inconsistent with the encoding in the layout section (0 = embedded, 1 = linked, 2 = static). Look folks -- I didn't design this. I'm only the messenger here. So far, I have been unable to create any format other than 2 = Embedded Object. I suspect that in practice, Cardfile really can't create Linked or Static Objects. Therefore, I have described the Embedded Object format first, then described the differences that can be expected in the other two formats. | |
13 thru 16 | Length of the 'Class String' which turns out to be the number of characters in the string + 1 | ||
17 thru n (where n = 16 + the length count in bytes 13 thru 16) | The Class String -- e.g. "Package" | The Class String is not only "counted" with a prefixed 32 bit byte count, but is also terminated by a byte containing 0. The zero byte is included in the prefixed count | |
n+1 thru n+4 | Four Bytes of 00 | No clue what this is | |
n+5 thru n+8 | Four Bytes of 00 | No clue what this is | |
n+9 thru n+12 | Four Bytes of Object data size | Size of the Object Data in bytes | |
n+13 thru p | Object Data | This is specific to the object type (i.e. an embedded link package is laid out quite differently internally from an embedded bitmap object, but we know the length, so we can skip over the content and its details). | |
'Presentation Object' -- It's not 100% clear what a Presentation Object is, but it is a separate object that immediately follows each data object. (Perhaps it is the icon displayed on the card for inserted objects.) | |||
p+1 thru p+4 | OLE Version ID | Per MS99340 this is Version 1.0 -- '01 00'. In practice, it seems always to be 01 05 00 00. (I may have combined two 16 bit fields into one 32 bit number here.) OTOH, this field may not exist. | |
p+5 thru p+8 | 05 00 00 00 | At a guess this is the object type for a Presentation Object | |
p+9 thru p+12 | Length of the 'Class String' which turns out to be the number of characters in the string + 1 | ||
p+13 thru q (where q = p + 12 + the length count in bytes p+9 thru p+12) | The Class String -- e.g. "Package" | The Class String is not only "counted" with a prefixed 32 bit byte count, but is also terminated by a byte containing 0. The zero byte is included in the prefixed count | |
q+1 thru q+4 | Four Bytes of Object data size | Size of the Object Data in bytes | |
q+5 thru r | Object Data | This is specific to the object type (i.e. an embedded link package is laid out quite differently internally from an embedded bitmap object, but we know the length, so we can skip over the content and its details). |
5 thru n (n=2 + the length of the Class Screen Text) | Class String | This appears to be a 'counted string' -- two bytes of length (LSB first) followed by non-zero terminated ASCII. The Class String seems to be a formalized description of the object: BITMAP, METAFILEPICT,...etc . |
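
Since the usual goal is just to get past the object to the text, here is a minimal Python sketch that walks the first part of the object table above (bytes 1 thru p: IDs, Class String, two unknown words, size, and Object Data) and reports where the data ends. It follows the layout as observed in this document, not MS99340's own layout table, and it stops before the Presentation Object, which would have to be skipped the same way. The names `EmbeddedObject` and `read_embedded_object` are made up, and the sample bytes are synthetic.

```python
import struct
from typing import NamedTuple


class EmbeddedObject(NamedTuple):
    object_id: int
    version: int
    object_format: int
    class_name: str
    data: bytes
    end: int  # offset of the first byte after the Object Data


def read_embedded_object(buf: bytes, start: int = 0) -> EmbeddedObject:
    """Read one object starting at buf[start] (the byte after the object flag)."""
    # Four LSB-first 32 bit words: Unique ID, OLE Version, Format, Class String length.
    object_id, version, object_format, class_len = struct.unpack_from("<4I", buf, start)
    pos = start + 16
    raw_class = buf[pos:pos + class_len]
    if len(raw_class) != class_len:
        raise ValueError("truncated Class String")
    # The count includes the terminating zero byte, so strip it.
    class_name = raw_class.rstrip(b"\x00").decode("ascii", errors="replace")
    pos += class_len
    # Two words of unknown purpose (observed as zero), then the data size.
    _unknown1, _unknown2, data_size = struct.unpack_from("<3I", buf, pos)
    pos += 12
    data = buf[pos:pos + data_size]
    if len(data) != data_size:
        raise ValueError("truncated Object Data")
    return EmbeddedObject(object_id, version, object_format,
                          class_name, data, pos + data_size)


# Synthetic object: ID 1, version bytes 01 05 00 00, format 2, class "Package".
sample_object = (struct.pack("<4I", 1, 0x0501, 2, 8) + b"Package\x00"
                 + struct.pack("<3I", 0, 0, 4) + b"DATA" + b"rest")
obj = read_embedded_object(sample_object)
print(obj.class_name, obj.end)
```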
At this point, I'm going to break out the three formats as separate tables. There is some additional logic embedded in the tables based on presentation object types that I did not use separate tables for. That's logically inconsistent, but I feared that the number of tables I'd end up with if I did that would add even more confusion to this horrendous data structure.
For Linked Object (Format =1) | ||
---|---|---|
n+1 thru m (where m=n+2+the length of the Network Name) | Network Name | This appears to be a 'counted string' -- two bytes of length (LSB first) followed by non-zero terminated ASCII. |
m+1 to m+2 | Network Type and Network Driver Version | Encoding of the information into the 16 bit value allocated is not specified in MS99340. |
m+3 to m+4 | Link Update Options | Encoding of the information into the 16 bit value allocated is not specified in MS99340. |
m+5 to p (where p=m+4+length of the presentation object) | Presentation Object | This appears to be a second object tacked onto the end of Linked and Embedded (but not Static) OLE objects. |
m+5 to m+8 | Presentation Object 'Unique ID' | The Presentation Object Version ID ('0100 hex') and Format (which, fortuitously, is ignorable when skipping the object). See bytes 1-4 description (above) for further information on the Unique ID. |
m+9 thru q (where q=m+11+length of this Class String) | Class String | This appears to be a 'counted string' -- two bytes of length (LSB first) followed by non-zero terminated ASCII. Unlike the Class string in bytes 5-n, we need to evaluate this value |
For Presentation Object (Embedded in Link or Embedded Object) with Class String=METAFILEPICT,BITMAP or DIB | ||
q+1 to q+2 | Character Width | Character width in mmhimetric (presumably the Windows MM_HIMETRIC mapping mode, whose unit is 0.01 mm) |
q+3 to q+4 | Character Height | Character height in mmhimetric |
q+5 to p (where p=q+7+length of Presentation Object) | Presentation Object | A 'counted variable' consisting of a 16 bit byte count and the object itself |
For Presentation Object (Embedded in Link or Embedded Object) with Class String other than METAFILEPICT,BITMAP or DIB | ||
q+1 to q+2 | Clipboard Format | A 16 bit integer. Encoding is not given in MS99340, but I infer that 0=NULL |
For Presentation Object (Embedded in Link or Embedded Object) with Class String other than METAFILEPICT,BITMAP or DIB and Clipboard format = NULL | ||
q+3 to p (where p=q+7+length of Presentation Object) | Presentation Object | A 'counted variable' consisting of a 16 bit byte count and the object itself |
For Presentation Object (Embedded in Link or Embedded Object) with Class String other than METAFILEPICT,BITMAP or DIB and Clipboard format not equal Null | ||
q+3 to r (where r=q+3+length of Clipboard Format Name) | Clipboard Format Name | A 'counted String' consisting of a 16 bit byte count and the non-zero terminated string. |
r to p (where p=r+2+length of Presentation Object) | Presentation Object | A 'counted variable' consisting of a 16 bit byte count and the object itself |
For Embedded Object (Format =2) | ||
---|---|---|
n+1 thru m (where m=n+2+the length of the Native Data) | Native Data | This appears to be a 'counted variable' -- two bytes of length (LSB first) followed by a block of data. |
m to p (where p=m+4+length of the presentation object) | Presentation Object | This appears to be a second object tacked onto the end of Linked and Embedded (but not Static) OLE objects. |
m to m+4 | Presentation Object 'Unique ID' | The Presentation Object Version ID ('0100 hex') and Format (which, fortuitously, is ignorable when skipping the object). See bytes 1-4 description (above) for further information on the Unique ID. |
m+5 thru q (where q=m+7+length of this Class String) | Class String | This appears to be a 'counted string' -- two bytes of length (LSB first) followed by non-zero terminated ASCII. Unlike the Class string in bytes 5-n, we need to evaluate this value |
For Presentation Object (Embedded in Link or Embedded Object) with Class String=METAFILEPICT,BITMAP or DIB | ||
q+1 to q+2 | Character Width | Character width in mmhimetric (presumably the Windows MM_HIMETRIC mapping mode, whose unit is 0.01 mm) |
q+3 to q+4 | Character Height | Character height in mmhimetric |
q+5 to p (where p=q+7+length of Presentation Object) | Presentation Object | A 'counted variable' consisting of a 16 bit byte count and the object itself |
For Presentation Object (Embedded in Link or Embedded Object) with Class String other than METAFILEPICT,BITMAP or DIB | ||
q+1 to q+2 | Clipboard Format | A 16 bit integer. Encoding is not given in MS99340, but I infer that 0=NULL |
For Presentation Object (Embedded in Link or Embedded Object) with Class String other than METAFILEPICT,BITMAP or DIB and Clipboard format = NULL | ||
q+3 to p (where p=q+7+length of Presentation Object) | Presentation Object | A 'counted variable' consisting of a 16 bit byte count and the object itself |
For Presentation Object (Embedded in Link or Embedded Object) with Class String other than METAFILEPICT,BITMAP or DIB and Clipboard format not equal Null | ||
q+3 to r (where r=q+3+length of Clipboard Format Name) | Clipboard Format Name | A 'counted String' consisting of a 16 bit byte count and the non-zero terminated string. |
r to p (where p=r+2+length of Presentation Object) | Presentation Object | A 'counted variable' consisting of a 16 bit byte count and the object itself |
For Static Object (Format =3) | ||
---|---|---|
n+1 thru m (where m=n+2+the length of the Native Data) | Native Data | This appears to be a 'counted variable' -- two bytes of length (LSB first) followed by a block of data. |
m to p (where p=m+4+length of the presentation object) | Presentation Object | This appears to be a second object tacked onto the end of Linked and Embedded (but not Static) OLE objects. |
m to m+4 | Presentation Object 'Unique ID' | The Presentation Object Version ID ('0100 hex') and Format (which, fortuitously, is ignorable when skipping the object). See bytes 1-4 description (above) for further information on the Unique ID. |
m+5 thru q (where q=m+7+length of this Class String) | Class String | This appears to be a 'counted string' -- two bytes of length (LSB first) followed by non-zero terminated ASCII. Unlike the embedded and link formats, we can ignore this value because we seem to know that a presentation value follows |
q+1 to q+2 | Character Width | Character width in mmhimetric |
q+3 to q+4 | Character Height | Character height in mmhimetric |
q+5 to p (where p=q+7+length of Presentation Object) | Presentation Object | A 'counted variable' consisting of a 16 bit count and the object itself |
DATA SECTION SUBRECORD -- TEXT | ||
---|---|---|
Bytes | Contents | Notes |
1 and 2 | Text Byte Count | 0 to 440. 0 indicates no text |
3 to x (maximum 442) | Text | The card's text field. Up to 440 bytes. End Of Lines (CR-LF) can be embedded in the text. |
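
For cards without an object, the data record is just the two byte flag followed by the text subrecord, which makes it the easy case. Below is a minimal Python sketch of that case only. The name `read_text_only_record` is made up, the cp1252 default is an assumption about the code page, and the sample bytes are synthetic.

```python
def read_text_only_record(buf: bytes, offset: int, encoding: str = "cp1252") -> str:
    """Return the text of the data record at buf[offset], if it has no object.

    offset is the zero-based pointer taken from the card's Index record.
    """
    object_flag = int.from_bytes(buf[offset:offset + 2], "little")
    if object_flag != 0:
        # The object has no single overall length; it must be walked field by field.
        raise NotImplementedError("record holds an OLE object; skip it first")
    length = int.from_bytes(buf[offset + 2:offset + 4], "little")
    if length > 440:
        raise ValueError(f"text length {length} exceeds the 440 byte limit")
    raw = buf[offset + 4:offset + 4 + length]
    if len(raw) != length:
        raise ValueError("truncated text field")
    return raw.decode(encoding, errors="replace")


# Synthetic data record: no object, 12 bytes of text with one embedded CR-LF.
sample_data = b"\x00\x00" + (12).to_bytes(2, "little") + b"line1\r\nline2"
text = read_text_only_record(sample_data, 0)
print(repr(text))  # 'line1\r\nline2'
```

A card with neither object nor text is four zero bytes, for which the function returns an empty string.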
This describes the "MGC" style of .CRD file. All values are little endian.
* File contains three sections: header, index, and data.
* Header is 11 bytes, Index is 52 bytes times number of entries in file, appearing in physical sort order, and Data is remainder of file.
* Header:
  * 0 - 2: MGC signature
  * 3 - 6: Number of cards in file (1250 / 4E2 (4 226) in Chuck's case)
  * 7 - 10: Four bytes of indeterminate meaning, 0 in my single example.
* 11: Beginning of Index Entry table. Each entry is 52 bytes long:
  * +0 - +3: Absolute byte offset to data entry for this index entry.
  * +4 - +51: Null terminated string used for sort. (Seems like a lot of wasted space, but I don't know what it is.)
* Each Data Entry:
  * +0 - +1: No idea what these bytes mean - both are null in my single example.
  * +2 - +3: Length of data in bytes - my example contains ASCII characters, where each line is terminated by 13 10.
  * +4 - +n: Data

From Steve Metter, Dayton, OH, August 2010