How can I solve special-character errors when importing data into infobright?
Currently we use the Community Edition (ICE) of infobright. DML is not supported, so data can only be imported with LOAD DATA.
If the data being loaded contains special control characters, the import frequently reports errors, which is annoying. There are a few ways to cope with this:
1. Set a reject file before importing. Set @BH_REJECT_FILE_PATH and @BH_ABORT_ON_COUNT so that a limited number of failed records are skipped and saved to the specified file.
The code is as follows:
/** when the number of rejected rows reaches 10, abort the load **/
set @BH_REJECT_FILE_PATH = '/tmp/reject_file';
set @BH_ABORT_ON_COUNT = 10;
If @BH_ABORT_ON_COUNT is set to -1, the load is never aborted, no matter how many rows are rejected.
You can also set @BH_ABORT_ON_THRESHOLD, which specifies the maximum proportion of rows allowed to be rejected. The value is a decimal fraction, for example @BH_ABORT_ON_THRESHOLD = 0.03 (3%).
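Putting these options together, a minimal sketch of a load that tolerates up to 3% bad rows might look like the following (the file paths and the table name bad_rows_demo are made-up placeholders):
set @BH_REJECT_FILE_PATH = '/tmp/reject_file';
set @BH_ABORT_ON_THRESHOLD = 0.03;
-- rows that fail to parse are written to the reject file instead of failing the whole load
load data infile '/tmp/data.csv' into table bad_rows_demo
fields terminated by '|' enclosed by '"' escaped by '\\';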
2. Specify the field terminator and the escape character when exporting the data, so that special characters (such as \, " and ') are escaped rather than breaking the load. For example:
select fields_list... into outfile '/tmp/outfile.csv'
fields terminated by '|' escaped by '\\'
lines terminated by '\r\n'
from mytable;
3. Alternatively, set the line terminator to a more distinctive marker, for example:
select fields_list... into outfile '/tmp/outfile.csv'
fields terminated by '|' escaped by '\\'
lines terminated by '$\r\n'
from mytable;
Of course, in this case the sequence $\r\n must not appear anywhere in the actual data rows, otherwise it will be treated as a line break.
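On the import side the same separators must be specified when loading the file back. A minimal sketch, assuming the file exported above, the table mytable, and the MySQL-compatible loader (which, as an assumption here, accepts the full LOAD DATA syntax including LINES TERMINATED BY):
set @bh_dataformat = 'mysql';
load data infile '/tmp/outfile.csv' into table mytable
fields terminated by '|' escaped by '\\'
lines terminated by '$\r\n';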
How to import a table from oracle to infobright
For DW systems, large data migrations are costly, so import/export speed and completeness are important criteria when evaluating a data warehouse product. Infobright is based on MySQL, so the data formats are already well established, and the IB vendor has optimized the loader for speed. The DLP distributed import option, introduced in the 4.0 Enterprise Edition, greatly reduces migration time; JDSU, the world's largest optical communication provider, has also chosen IB with DLP as a main option. This article, however, mainly covers basic import and export in IB, along with particular problems encountered in a project and their solutions.
1. Introduction
IB provides a dedicated high-performance loader that differs from the traditional MySQL one. The IB loader is designed for import speed, so it supports only its own subset of the MySQL loader syntax and can only import formatted variable-length text source files.
IEE (the Enterprise Edition) additionally supports the MySQL loader and INSERT statements.
2. Default Loader
ICE (the Community Edition) supports only the IB loader.
By default, IEE uses the MySQL loader, which is more fault-tolerant but slower. For the fastest import, switch to the IB loader by setting the following variable:
mysql> set @bh_dataformat = 'txt_variable';
-- use the IB loader to import variable-length text in CSV format
mysql> set @bh_dataformat = 'binary';
-- binary files
mysql> set @bh_dataformat = 'mysql';
-- use the MySQL loader
3. IB loader syntax
IB only supports load data infile. Other mysql import methods are not supported.
load data infile '/full_path/file_name'
into table tbl_name
[fields
  [terminated by 'char']
  [enclosed by 'char']
  [escaped by 'char']
];
Disable autocommit before the import:
set AUTOCOMMIT = 0;
After the import completes:
COMMIT;
set AUTOCOMMIT = 1;
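A complete fast-load session, combining the format setting, the autocommit handling and the loader syntax above, might look like this sketch (/tmp/outfile.csv and mytable are placeholder names):
set AUTOCOMMIT = 0;
set @bh_dataformat = 'txt_variable';
-- load a pipe-delimited file with the fast IB loader
load data infile '/tmp/outfile.csv'
into table mytable
fields terminated by '|' enclosed by '"' escaped by '\\';
COMMIT;
set AUTOCOMMIT = 1;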
4. Field Separators
The field separator clauses are optional; the defaults are:
CLAUSE                  DEFAULT VALUE
fields terminated by    ';' (semicolon)
fields enclosed by      '"' (double quote)
fields escaped by       '' (none)
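In other words, a load that gives no FIELDS clause at all behaves as if these defaults were spelled out; a sketch with placeholder file and table names:
load data infile '/tmp/data.csv' into table mytable;
-- equivalent to spelling out the defaults:
load data infile '/tmp/data.csv' into table mytable
fields terminated by ';' enclosed by '"' escaped by '';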
5. Import experience
A. Increase LoaderMainHeapSize in brighthouse.ini when the imported table has many columns.
B. Concurrent Import
C. In terms of fault tolerance, the data formats rank txt_variable < binary < mysql.
D. bh_loader does not support multi-character separators.
E. DLP is the necessary choice when the data volume is very large.
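Returning to the question in the heading, the Oracle-side export is not shown in the original; one common approach (an assumption here, not taken from the source) is to spool a pipe-delimited flat file from SQL*Plus and then load it with the IB loader as described above:
-- run in SQL*Plus; mytable and col1..col3 are placeholder names
set heading off
set feedback off
set pagesize 0
set linesize 32767
set trimspool on
spool /tmp/outfile.csv
select col1 || '|' || col2 || '|' || col3 from mytable;
spool off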
How to Use the utf8 character set in infobright
If you need the brighthouse engine to support the utf8 character set, you must:
1. Create the database objects with the utf8 character set. This is essential; otherwise utf8 will not be supported.
2. Create the data table objects with the utf8 character set as well.
3. Convert the import file to utf8 in advance.
4. Execute set names utf8;
5. After importing the file, check that the character set is correct.
In another scenario, a MyISAM table may also need to support utf8, which is more troublesome:
1. The character set used when creating the database object does not matter; it is not required to be utf8.
2. The data table object must be created with the utf8 character set;
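A minimal sketch of the brighthouse-side steps, assuming a database named demo_db, a table t and an import file already converted to utf8 (all names and paths here are placeholders):
create database demo_db default character set utf8;
use demo_db;
create table t (
  id int,
  name varchar(100)
) engine=brighthouse default charset=utf8;
set names utf8;
-- the input file must already be utf8-encoded (converted beforehand, e.g. with iconv)
load data infile '/tmp/t_utf8.csv' into table t fields terminated by ';';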