I recently needed to develop a program whose data source is a DBF file. I parsed the file with Java and ran into problems, so I searched online and found this article, which describes the file structure in detail and helped me a lot. Reprinted from: http://blog.csdn.net/wyp_810618/archive/2010/04/14/5485212.aspx
My thanks to the author of the original article. I am reposting it here so that it is easy to find. The body of the article follows.
DBF file structure:
A table file consists of a header record and data records. The header record defines the structure of the table and contains other table-related information. It starts at file position 0. The data records follow the header record (in consecutive bytes) and contain the actual text of the fields. The record length (in bytes) equals the sum of the lengths defined for all fields, plus one byte for the delete flag that starts each record. Integers are stored in the table file low byte first.

1. Structure of the table header record:

Byte offset     Description
0               File type
                  0x02  FoxBASE
                  0x03  FoxBASE+/dBASE III PLUS, no memo
                  0x30  Visual FoxPro
                  0x43  dBASE IV SQL table file, no memo
                  0x63  dBASE IV SQL system file, no memo
                  0x83  FoxBASE+/dBASE III PLUS, with memo
                  0x8B  dBASE IV, with memo
                  0xCB  dBASE IV SQL table file, with memo
                  0xF5  FoxPro 2.x (or earlier), with memo
                  0xFB  FoxBASE
1-3             Last update date (YYMMDD)
4-7             Number of records in the file
8-9             Position of the first data record
10-11           Length of one data record (including the delete flag)
12-27           Reserved
28              Table flags
                  0x01  file has a structural .CDX
                  0x02  file has a memo field
                  0x04  file is a database (.DBC)
                Note that this byte can contain the sum of any of the values above. For example, 0x03 indicates that the table has a structural .CDX and a memo field.
29              Code page mark
30-31           Reserved, contains 0x00
32-n            Field subrecords. The number of fields determines the number of field subrecords. Each field in the table has one field subrecord.
n+1             Header record terminator (0x0D)
n+2 to n+264    A 263-byte range that contains the backlink information (the associated database (.DBC) file). If the first byte is 0x00, the file is not associated with a database. Database files themselves therefore always contain 0x00 here.

Note: bytes 8 to 9 of the header record give the position at which the data starts in the file. Each data record begins with a delete flag byte. If this byte is an ASCII space (0x20), the record is not deleted. If it is an asterisk (0x2A), the record is marked as deleted. The delete flag is followed by the data of the fields named in the field subrecords.

2. Structure of a field subrecord:

Byte offset     Description
0-10            Field name (up to 10 characters; shorter names are padded with null characters (0x00))
11              Field type
                  C  Character
                  Y  Currency
                  N  Numeric
                  F  Float
                  D  Date
                  T  DateTime
                  B  Double
                  I  Integer
                  L  Logical
                  M  Memo
                  G  General
                  C  Character (binary)
                  M  Memo (binary)
                  P  Picture
12-15           Offset of the field within the record
16              Field length (in bytes)
17              Number of decimal places
18              Field flags
                  0x01  system column (not visible to the user)
                  0x02  column can store null values
                  0x04  binary column (character and memo fields only)
19-31           Reserved

The format supports the following:
  Null values
  DateTime, Currency, and Double data
  Character and memo fields marked as binary
  Adding a table to a database (.DBC) file

For files that contain the 263-byte backlink area, the number of fields in the table file is given by the formula (x - 296) / 32, where x is the position of the first record (bytes 8 to 9 of the header record), 296 is 263 (backlink information) + 1 (header record terminator) + 32 (the fixed part of the header, bytes 0 to 31), and 32 is the length of one field subrecord.
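The header layout above can be decoded byte by byte, which avoids depending on the byte order or struct padding of the machine running the program. The following is a minimal sketch, not a complete parser. The names le16, le32, dbf_header_info, dbf_decode_header and dbf_field_count_vfp are made up for this illustration, and the field count function assumes a file that contains the 263-byte backlink area.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helpers: read integers stored low byte first. */
static uint16_t le16(const unsigned char *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t le32(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Decoded view of the fixed 32-byte part of the header (illustrative names). */
struct dbf_header_info {
    unsigned char file_type;    /* byte 0 */
    unsigned char yy, mm, dd;   /* bytes 1-3 */
    uint32_t record_count;      /* bytes 4-7 */
    uint16_t first_record_pos;  /* bytes 8-9 */
    uint16_t record_length;     /* bytes 10-11, includes the delete flag */
    unsigned char table_flags;  /* byte 28 */
    unsigned char code_page;    /* byte 29 */
};

/* Returns 0 on success, -1 if the buffer holds fewer than 32 bytes. */
static int dbf_decode_header(const unsigned char *buf, size_t len,
                             struct dbf_header_info *out)
{
    if (len < 32)
        return -1;
    out->file_type = buf[0];
    out->yy = buf[1];
    out->mm = buf[2];
    out->dd = buf[3];
    out->record_count = le32(buf + 4);
    out->first_record_pos = le16(buf + 8);
    out->record_length = le16(buf + 10);
    out->table_flags = buf[28];
    out->code_page = buf[29];
    return 0;
}

/* Field count from the formula (x - 296) / 32.
   Returns -1 if x does not fit the layout with a backlink area. */
static int dbf_field_count_vfp(uint16_t first_record_pos)
{
    if (first_record_pos < 296 || (first_record_pos - 296) % 32 != 0)
        return -1;
    return (first_record_pos - 296) / 32;
}
```

A caller would read the first 32 bytes of the file into a buffer, pass it to dbf_decode_header, and then pass first_record_pos to dbf_field_count_vfp. Older formats without the backlink area need a different calculation.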
File header data structure description (C and Pascal)
Because the records of a DBF file are stored in the data part of the file as ASCII text, you only need to read the file header and the field description area in order to read each record in the DBF file directly.
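Once the header values are known, locating a record and reading a field is plain offset arithmetic. The sketch below works on a file image already loaded into memory. The function names are made up for this illustration, and it assumes that the field offset is measured from the start of the record, with the delete flag at offset 0.

```c
#include <stddef.h>
#include <string.h>

/* Returns a pointer to record number index (0-based) inside the file image,
   or NULL if that record does not fit in the image. */
static const unsigned char *dbf_record_at(const unsigned char *file,
                                          size_t file_len,
                                          size_t first_record_pos,
                                          size_t record_len,
                                          size_t index)
{
    if (record_len == 0 || first_record_pos > file_len)
        return NULL;
    if (index >= (file_len - first_record_pos) / record_len)
        return NULL;
    return file + first_record_pos + index * record_len;
}

/* A record is marked as deleted when its first byte is an asterisk (0x2A). */
static int dbf_record_is_deleted(const unsigned char *rec)
{
    return rec[0] == 0x2A;
}

/* Copies the ASCII text of one field into out, dropping trailing spaces.
   Assumption: field_offset counts from the start of the record.
   Returns the number of characters copied (excluding the terminator). */
static size_t dbf_copy_field(const unsigned char *rec, size_t field_offset,
                             size_t field_len, char *out, size_t out_size)
{
    size_t n = field_len;
    if (out_size == 0)
        return 0;
    while (n > 0 && rec[field_offset + n - 1] == ' ')
        n--;
    if (n > out_size - 1)
        n = out_size - 1;
    memcpy(out, rec + field_offset, n);
    out[n] = '\0';
    return n;
}
```

Only trailing spaces are removed here. Values that are padded on the left would need additional trimming by the caller.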
Appendix:
Source code:
The DBF file header structure and the field description structure are represented in C as follows:

    /* Note: unsigned long is assumed to be 4 bytes wide here. */
    struct dbf_head {                        /* DBF file header structure */
        char vers;                           /* version mark */
        unsigned char yy, mm, dd;            /* year, month, and day of last update */
        unsigned long no_recs;               /* total number of records in the file */
        unsigned short head_len, rec_len;    /* file header length, record length */
        char reserved[20];                   /* reserved */
    };

    struct field_element {                   /* field description structure */
        char field_name[11];                 /* field name */
        char field_type;                     /* field type */
        unsigned long offset;                /* offset of the field in the record */
        unsigned char field_length;          /* field length */
        unsigned char field_decimal;         /* number of decimal places */
        char reserved1[2];                   /* reserved */
        char dbaseiv_id;                     /* dBASE IV work area ID */
        char reserved2[10];                  /* reserved */
        char production_index;
    };
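Both structures are meant to be exactly 32 bytes, which only holds when the integer types have the expected width and the compiler adds no padding. As a hedged variant, the same layout can be written with fixed-width types and checked at compile time. The struct names dbf_head32 and dbf_field32 are made up for this sketch, and the checks assume a C11 compiler.

```c
#include <stddef.h>
#include <stdint.h>

/* Same layout as dbf_head, written with fixed-width types (illustrative name). */
struct dbf_head32 {
    uint8_t  vers;              /* version mark */
    uint8_t  yy, mm, dd;        /* date of last update */
    uint32_t no_recs;           /* total number of records */
    uint16_t head_len, rec_len; /* header length, record length */
    uint8_t  reserved[20];      /* reserved */
};

/* Same layout as field_element (illustrative name). */
struct dbf_field32 {
    char     field_name[11];    /* field name */
    char     field_type;        /* field type */
    uint32_t offset;            /* offset of the field in the record */
    uint8_t  field_length;      /* field length */
    uint8_t  field_decimal;     /* number of decimal places */
    uint8_t  reserved1[2];      /* reserved */
    uint8_t  dbaseiv_id;        /* dBASE IV work area ID */
    uint8_t  reserved2[10];     /* reserved */
    uint8_t  production_index;
};

/* Compilation fails if the compiler lays the structs out differently. */
_Static_assert(sizeof(struct dbf_head32) == 32, "header must be 32 bytes");
_Static_assert(sizeof(struct dbf_field32) == 32, "field subrecord must be 32 bytes");
_Static_assert(offsetof(struct dbf_head32, head_len) == 8, "head_len at byte 8");
_Static_assert(offsetof(struct dbf_field32, field_length) == 16, "length at byte 16");
```

The checks guard only the layout. The multi-byte members still hold the file's byte order after a raw read.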
Note: the input DBF file was produced by FoxPro 2.5 for DOS/Windows. The unsigned long and unsigned short fields in its file header, such as the number of records, are stored low byte first, as on the Intel x86 series CPUs. The C program, however, was compiled under the HP-UX operating system, and the CPU of the HP server uses the opposite byte order, with the high byte first. The unsigned long and unsigned short values therefore have to be byte-reversed after reading, which can be done with bit operations:
    /* Swap the two bytes of a 16-bit value in place. */
    void revert_unsigned_short(unsigned short *a)
    {
        unsigned short left, right;
        left = right = *a;
        *a = ((left & 0x00ff) << 8) | ((right & 0xff00) >> 8);
    }

    /* Reverse the four bytes of a 32-bit value in place. */
    void revert_unsigned_long(unsigned long *a)
    {
        unsigned long first, second, third, forth;
        first = second = third = forth = *a;
        *a = ((first & 0x000000ff) << 24) |
             ((second & 0x0000ff00) << 8) |
             ((third & 0x00ff0000) >> 8) |
             ((forth & 0xff000000) >> 24);
    }
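The two functions above always swap, which is right on the HP server but wrong on an x86 machine. A small sketch of one way to make the conversion conditional is to detect the byte order of the host at run time and swap only when it differs from the file. All function names here are made up for this illustration.

```c
#include <stdint.h>

/* Returns 1 if the host stores the low byte first, 0 otherwise. */
static int host_is_little_endian(void)
{
    const uint16_t probe = 1;
    return *(const unsigned char *)&probe == 1;
}

/* Swap the two bytes of a 16-bit value. */
static uint16_t swap16(uint16_t v)
{
    return (uint16_t)((v << 8) | (v >> 8));
}

/* Reverse the four bytes of a 32-bit value. */
static uint32_t swap32(uint32_t v)
{
    return ((v & 0x000000ffu) << 24) |
           ((v & 0x0000ff00u) << 8) |
           ((v & 0x00ff0000u) >> 8) |
           ((v & 0xff000000u) >> 24);
}

/* Convert a value read raw from the file (low byte first) to host order. */
static uint16_t dbf16_to_host(uint16_t raw)
{
    return host_is_little_endian() ? raw : swap16(raw);
}

static uint32_t dbf32_to_host(uint32_t raw)
{
    return host_is_little_endian() ? raw : swap32(raw);
}
```

With this, the same source can be compiled on either kind of machine, and head_len, rec_len and no_recs are passed through dbf16_to_host or dbf32_to_host after the header has been read.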
The same structures described in Pascal:

    Tdbf_head = packed record
      vers: Char;                          // version flag
      yy, mm, dd: Byte;                    // year, month, and day of last update
      no_recs: LongWord;                   // total number of records in the file
      head_len, rec_len: Word;             // file header length, record length
      reserved: array [0 .. 19] of Char;   // reserved
    end;

    Tfield_element = packed record         // field description structure
      field_name: array [0 .. 10] of Char; // field name
      field_type: Char;                    // field type
      offset: LongWord;                    // offset of the field in the record
      field_length: Byte;                  // field length
      field_decimal: Byte;                 // number of decimal places
      reserved1: array [0 .. 1] of Char;   // reserved
      dbaseiv_id: Char;                    // dBASE IV work area ID
      reserved2: array [0 .. 9] of Char;   // reserved
      production_index: Char;
    end;