A detailed description of the DBF File Format

Last Update:2018-12-05 Source: Internet

Author: User

Tags dbase hp server


I recently set out to develop a program whose data source is a DBF file. I parsed the file with Java and ran into problems, so I searched online and found this document, which describes the file structure in detail and helped me a lot. Reprinted from: http://blog.csdn.net/wyp_810618/archive/2010/04/14/5485212.aspx

I am reposting it here so that it is easier to find, with thanks to the author of the original article. The body of the article follows.

DBF file structure:

A table file consists of a header record and data records. The header record defines the structure of the table and contains other table-related information. It starts at file position 0. The data records follow the header record (in consecutive bytes) and contain the actual text of the fields.
The length of a record (in bytes) equals the sum of the lengths defined for all fields, plus one byte for the delete flag that starts each record. Integers in a table file are stored low byte first.
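
The low-byte-first rule can be illustrated with a small sketch. The helper names `read_le16` and `read_le32` are hypothetical and do not come from the original article; they assemble the value byte by byte, so the result does not depend on the byte order of the machine running the code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: read a 16-bit unsigned integer stored low byte first. */
static uint16_t read_le16(const unsigned char *p)
{
    assert(p != NULL);
    return (uint16_t)(p[0] | (p[1] << 8));
}

/* Hypothetical helper: read a 32-bit unsigned integer stored low byte first. */
static uint32_t read_le32(const unsigned char *p)
{
    assert(p != NULL);
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}
```

For example, the four bytes 0x78 0x56 0x34 0x12 read with `read_le32` give 0x12345678.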
1. Structure of the table header record:
Byte offset     Description
0               File type
                  0x02  FoxBASE
                  0x03  FoxBASE+/dBASE III PLUS, no memo
                  0x30  Visual FoxPro
                  0x43  dBASE IV SQL table file, no memo
                  0x63  dBASE IV SQL system file, no memo
                  0x83  FoxBASE+/dBASE III PLUS, with memo
                  0x8B  dBASE IV, with memo
                  0xCB  dBASE IV SQL table file, with memo
                  0xF5  FoxPro 2.x (or earlier), with memo
                  0xFB  FoxBASE
1-3             Date of last update (YYMMDD)
4-7             Number of records in the file
8-9             Position of the first data record
10-11           Length of one data record (including the delete flag)
12-27           Reserved
28              Table flags
                  0x01  file has a structural .CDX
                  0x02  file has a memo field
                  0x04  file is a database (.DBC)
                Note that this byte can contain the sum of any of the values above. For example, 0x03 indicates that the table has a structural .CDX and a memo field.
29              Code page mark
30-31           Reserved, contains 0x00
32-n            Field subrecords
                The number of fields determines the number of field subrecords. Each field in the table has one field subrecord.
n+1             Header record terminator (0x0D)
n+2 to n+264    A 263-byte range containing the backlink information (the associated database (.DBC) file). If the first byte is 0x00, the file is not associated with a database. Therefore, database files themselves always contain 0x00 here.
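
As a rough illustration of the header layout above, the following sketch decodes the first 32 bytes of a DBF file that have already been read into memory. The names `dbf_header_info` and `dbf_parse_header` are hypothetical and not part of the original article. The multi-byte fields are assembled byte by byte, so the code does not depend on the CPU's byte order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical container for the header fields described in the table. */
struct dbf_header_info {
    uint8_t  file_type;        /* byte 0 */
    uint8_t  yy, mm, dd;       /* bytes 1-3: last update (YYMMDD) */
    uint32_t record_count;     /* bytes 4-7 */
    uint16_t first_record_pos; /* bytes 8-9 */
    uint16_t record_length;    /* bytes 10-11, includes the delete flag */
    uint8_t  table_flags;      /* byte 28 */
    uint8_t  code_page;        /* byte 29 */
};

/* Decode the fixed 32-byte part of the header from buffer h. */
static void dbf_parse_header(const unsigned char *h, struct dbf_header_info *out)
{
    assert(h != NULL && out != NULL);
    out->file_type = h[0];
    out->yy = h[1];
    out->mm = h[2];
    out->dd = h[3];
    out->record_count = (uint32_t)h[4] | ((uint32_t)h[5] << 8)
                      | ((uint32_t)h[6] << 16) | ((uint32_t)h[7] << 24);
    out->first_record_pos = (uint16_t)(h[8] | (h[9] << 8));
    out->record_length = (uint16_t)(h[10] | (h[11] << 8));
    out->table_flags = h[28];
    out->code_page = h[29];
}
```

A caller would typically read 32 bytes from file position 0 and pass them to `dbf_parse_header`; testing `table_flags & 0x02` then tells whether the table has a memo field.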
Bytes 8-9 of the header record give the starting position of the data records in the file. Each data record begins with a delete flag byte. If this byte is an ASCII space (0x20), the record is not deleted; if it is an asterisk (0x2A), the record is marked as deleted. The flag byte is followed by the data of the fields named in the field subrecords.
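
The record layout described above can be sketched as follows. The function names are hypothetical; `first_record_pos` and `record_length` are assumed to be the values taken from bytes 8-9 and 10-11 of the header, and records are assumed to be numbered from 0.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* File offset of the record with the given 0-based index. */
static uint32_t dbf_record_offset(uint16_t first_record_pos,
                                  uint16_t record_length,
                                  uint32_t index)
{
    return (uint32_t)first_record_pos + index * (uint32_t)record_length;
}

/* Returns 1 if the record is marked deleted ('*', 0x2A),
 * 0 if it is active (space, 0x20), and -1 for any other flag byte. */
static int dbf_record_is_deleted(const unsigned char *record)
{
    assert(record != NULL);
    if (record[0] == 0x2A)
        return 1;
    if (record[0] == 0x20)
        return 0;
    return -1;
}
```

With a first record position of 328 and a record length of 21, the third record (index 2) starts at offset 370.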
2. Field subrecord structure:
Byte offset     Description
0-10            Field name (up to 10 characters; shorter names are padded with null characters (0x00))
11              Field type
                  C  Character
                  Y  Currency
                  N  Numeric
                  F  Float
                  D  Date
                  T  DateTime
                  B  Double
                  I  Integer
                  L  Logical
                  M  Memo
                  G  General
                  C  Character (binary)
                  M  Memo (binary)
                  P  Picture
12-15           Offset of the field within the record
16              Field length (in bytes)
17              Number of decimal places
18              Field flags
                  0x01  system column (not visible to users)
                  0x02  column can store null values
                  0x04  binary column (for character and memo fields only)
19-31           Reserved
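
A minimal sketch of decoding one 32-byte field subrecord according to the layout above. The names `dbf_field_info` and `dbf_parse_field` are hypothetical. The field name is copied into an 11-byte buffer and explicitly terminated, so it is a valid C string even if all 10 name characters are used.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical container for one decoded field subrecord. */
struct dbf_field_info {
    char     name[11];   /* bytes 0-10, null-terminated */
    char     type;       /* byte 11, e.g. 'C', 'N', 'D' */
    uint32_t offset;     /* bytes 12-15, offset of the field in the record */
    uint8_t  length;     /* byte 16 */
    uint8_t  decimals;   /* byte 17 */
    uint8_t  flags;      /* byte 18 */
};

/* Decode the 32-byte field subrecord at f. */
static void dbf_parse_field(const unsigned char *f, struct dbf_field_info *out)
{
    assert(f != NULL && out != NULL);
    memcpy(out->name, f, 10);
    out->name[10] = '\0';
    out->type = (char)f[11];
    out->offset = (uint32_t)f[12] | ((uint32_t)f[13] << 8)
                | ((uint32_t)f[14] << 16) | ((uint32_t)f[15] << 24);
    out->length = f[16];
    out->decimals = f[17];
    out->flags = f[18];
}
```

Calling `dbf_parse_field` once per subrecord, starting at byte 32 of the file and stopping at the 0x0D terminator, would yield the full field list.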
Compared with earlier versions, the Visual FoxPro table format adds the following:
Support for null values
DateTime, Currency, and Double data types
Character and memo fields that can be marked as binary
Adding a table to a database (.DBC) file
The number of fields in a table file can be obtained with the formula (x - 296) / 32, where x is the position of the first record (bytes 8-9 of the header record), 296 is 263 (backlink information) + 1 (header record terminator) + 32 (the fixed part of the header that precedes the field subrecords), and 32 is the length of one field subrecord.
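
The formula can be checked with a short sketch. The function names are hypothetical, and the code assumes a file that actually carries the 263-byte backlink area; for files without that area the constant would differ.

```c
#include <assert.h>
#include <stdint.h>

/* Number of fields derived from the first-record position (header bytes 8-9).
 * Returns -1 if the position does not fit the 32 + 32*n + 1 + 263 layout. */
static int dbf_field_count_vfp(uint16_t first_record_pos)
{
    if (first_record_pos < 296 || (first_record_pos - 296) % 32 != 0)
        return -1;
    return (first_record_pos - 296) / 32;
}

/* Inverse: expected first-record position for a given number of fields. */
static uint16_t dbf_first_record_pos_vfp(int field_count)
{
    assert(field_count >= 0 && 296L + 32L * field_count <= 65535L);
    return (uint16_t)(296 + 32 * field_count);
}
```

For a table with 3 fields the first record starts at 296 + 3 * 32 = 392, and (392 - 296) / 32 gives 3 again.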

File header data structure description (C and Pascal)

C description

Because the records of a DBF file are stored as ASCII text in the data part of the file, once you have read the file header and the field descriptions you can read every record of the DBF file directly. The DBF file header structure and the field description structure can be expressed in C as follows:
struct dbf_head {                       /* DBF file header structure */
    char vers;                          /* version mark */
    unsigned char yy, mm, dd;           /* year, month, day of last update */
    unsigned long no_recs;              /* total number of records (unsigned long assumed to be 32 bits) */
    unsigned short head_len, rec_len;   /* file header length, record length */
    char reserved[20];                  /* reserved */
};
struct field_element {                  /* field description structure */
    char field_name[11];                /* field name */
    char field_type;                    /* field type */
    unsigned long offset;               /* offset of the field within the record */
    unsigned char field_length;         /* field length */
    unsigned char field_decimal;        /* number of decimal places */
    char reserved1[2];                  /* reserved */
    char dbaseiv_id;                    /* dBASE IV work area ID */
    char reserved2[10];                 /* reserved */
    char production_index;              /* production index flag */
};
Note that the input DBF file was produced by FoxPro 2.5 for DOS/Windows, so the unsigned long and unsigned short fields in the file header (the number of records and so on) are stored low byte first, as on Intel x86 CPUs. The C program, however, is compiled on HP-UX, and the CPU of the HP server uses the opposite byte order (high byte first). The unsigned long and unsigned short values therefore have to be byte-swapped after reading, which can be done with bit operations:
void revert_unsigned_short(unsigned short *a)
{
    unsigned short left, right;
    left = right = *a;
    /* swap the two bytes */
    *a = ((left & 0x00ff) << 8) | ((right & 0xff00) >> 8);
}
void revert_unsigned_long(unsigned long *a)
{
    unsigned long first, second, third, forth;
    first = second = third = forth = *a;
    /* reverse the order of the four low bytes */
    *a = ((first & 0x000000ff) << 24) |
         ((second & 0x0000ff00) << 8) |
         ((third & 0x00ff0000) >> 8) |
         ((forth & 0xff000000) >> 24);
}

Here is my description of the same structures in Pascal:

Tdbf_head = packed record              // DBF file header structure
  vers: Char;                          // version mark
  yy, mm, dd: Byte;                    // year, month, day of last update
  no_recs: LongWord;                   // total number of records in the file
  head_len, rec_len: Word;             // file header length, record length
  reserved: array [0..19] of Char;     // reserved
end;
Tfield_element = packed record         // field description structure
  field_name: array [0..10] of Char;   // field name
  field_type: Char;                    // field type
  offset: LongWord;                    // offset of the field within the record
  field_length: Byte;                  // field length
  field_decimal: Byte;                 // number of decimal places
  reserved1: array [0..1] of Char;     // reserved
  dbaseiv_id: Char;                    // dBASE IV work area ID
  reserved2: array [0..9] of Char;     // reserved
  production_index: Char;              // production index flag
end;
