The following articles describe how to correctly understand the external file formats supported by the DB2 database, one of the tasks in DB2 maintenance may often need to be handled by all of us. For example, before the database is put into production and use, a large amount of information needs to be created in the database.
This step is done by importing data directly from a file to the database system. However, prior to data import, the DB2 database system often checks the format of external files carefully.
If the format of the external file does not meet the mandatory requirements of the DB2 database, the database will refuse to import data. This article will introduce the external file formats supported by the DB2 database. We hope this analysis can help the database administrator reduce the workload during data import and export.
1. ASCII file.
It is widely used in the database field. Basically, any database supports external files in this format. However, each database has different requirements on the delimiter used to define ASCII files. Improve the efficiency of DB2 data backup and recovery in DB2 databases) The Delimiter is easy to use to identify the start position and end position of data elements. Generally, DB2 databases support the following separators. The first is the character separator.
This character Delimiter is used to define the starting position of the character data type. By default, double quotation marks are used as character separators in DB2 databases. This is different from other databases. For example, in Oracle databases, when using a bounded ASCII file to import data, it will automatically determine whether the data is a balanced data type, that is, this character separator is not required. The second is the column separator, which is used to define the start position of a field.
In most cases, comma (in English) is used as the column separator. However, you can set different column separators, such as colons or semicolons, as needed. No matter which symbol is used as the column separator, a prerequisite must be met, that is, the symbol in the English input stage. If it is a comma or semicolon in Chinese, it is treated as a common character. The third is the line separator.
The row delimiter is a special delimiter. Because the operating system platform is different, it uses different row delimiters. For example, in Windows, the line separator is a normal line break. In Linux and other operating systems, although it is also a line break, the line breaks are different on the two operating systems. If you open the ASCII file created on a Windows operating system on a Linux operating system, the format will be messy.
If all records are displayed in the same row, it seems that there are no branches. Therefore, when importing data, if the demarcation file is created on the Windwos operating system, and the DB2 database is not on the Linux operating system, the data cannot be imported to the DB2 database. Instead, you must first convert the format to the line break that can be recognized by the Linux operating system. The database administrator must pay enough attention to this issue. How to update the execution plan in the DB2 database)
According to the above rules, the following records are the formats supported by the DB2 database:
- 1001,”office”,2323,”BeiJing”,-233
Note: If the value of a string in an external file contains column delimiters (for example, comma-column delimiters), this string Delimiter is required. That is to say, if the data to be imported is generic data, it must be enclosed in double quotation marks. If there is no column separator, this string separator is optional. For example, if the string Bei and Jing are not enclosed in double quotation marks, the DB2 database system considers it to be two fields rather than a string.
In this case, you need to use the character separator to reference it and tell the database system that it is a string. Although this bounded ASCII file can be used to convert most data types, such as numeric data, numeric data, and date data. However, if an external file contains binary data, it is not a good choice to use this ASCII file.
In addition, if the platform created by the file is different from the operating system platform used by the platform where the DB2 database is located, the ASCII demarcation file is not created. In this case, the author establishes an external file using the PC integrated exchange format. Note that there is no limit on the maximum length of the Data row in the bounded ASCII file. However, the maximum length of values in a separate column cannot exceed 32 KB. It is only from the perspective of ASCII files. When importing files to the database, you must also note whether the field size in the DB2 database table can accommodate the field values in this external file.
2. Fixed-record-length ASCII files.
DB2 database also supports fixed-length ASCII external file formats. Sometimes, files in this format are called location ASCII files. The ASCII file format at the fixed record length position uses a fixed byte offset to replace the record separator to determine the boundary of each field. In short, in this file format, each record in the file (each field in the record) has a fixed length. For example, the starting position of the N record is determined by the fixed length of the record multiplied by the product of N.
The calculated value is the byte offset from the start point of the file. Not only is the length of the record row fixed, but the start position and end position of each column value in each record are also the same. That is, the length of each field is the same. However, if the actual content of a field does not reach this fixed width, it is often replaced by a space character.
However, this file format is much different from the previously mentioned ASCII file. Such as making this file is very troublesome and prone to errors. Inputting one or more spaces leads to Database Import errors. For example, Chinese characters and English characters occupy different positions. If the data file contains both Chinese and English characters, it is difficult to control the fixed length.
Therefore, like many database administrators, I do not set up the fixed-length ASCII file format. If you must use it, you must use professional tools to convert the file. For example, in Linux, you can use the redirection symbol to create a fixed-width ASCII file. In short, if you create this file manually, there is a high possibility of errors. In the current DB2 database system, some import tools also support this format of external files. However, this format has been abandoned during data export. It can be seen that files in this format are exiting the historical stage.
3. binary files.
These two types of files share a common feature, that is, they are not stored in binary format. For this reason, they will encounter a compatibility file during the application. For example, a table is exported as an ASCII file on the Linux operating system. On the Windows operating system platform, the file cannot be imported to the DB2 database system. It is mainly because ASCII files are stored in different operating systems (such as Linux or Windows) in different formats.
Therefore, it is impossible to export data on different operating systems through ASCII files. In fact, not only does the DB2 database have such troubles, but also does the Oracle database.
To solve the problem, an external file in binary format is involved in the DB2 database. There are two types of data files in the binary format: PC integrated exchange format and work form format. PC integrated exchange format files are mainly used for data transmission between DB2 database tables on different operating systems, such as Linux and Windows operating systems.
That is, the binary file in this format can be used to import data from the database on the Linux operating system, and then import the data to the DB2 database on the Windows operating system. This file stores not only recorded content, but also table structure definitions. In addition, files in binary format can be easily transferred on different operating systems without worrying about compatibility issues.
Another binary file format is the work form format. This is also a special format file for DB2 database products. The purpose of this work form format file is different from that of the PC integrated Interchange Format File. The PC integrated exchange format file is mainly used to solve the compatibility issues during data migration on different operating system platforms.
This work form format file is mainly used to migrate data between different DB2 products, such as data migration between DB2 database and Lotus products. Although they are also binary files, they are used differently. For this reason, the database administrator needs to select the appropriate file format based on different purposes.