Today I downloaded some CSV data from Kaggle and wanted to load it into MySQL for statistical analysis. Unfortunately the CSV file is fairly large, so it can only be imported with MySQL's LOAD DATA command. I'm noting it down here for future reference:
1. Import:
Basic syntax: load data [low_priority] [local] infile 'file_name.txt' [replace | ignore] into table tbl_name [character set gbk] [fields [terminated by '\t'] [[optionally] enclosed by ''] [escaped by '\\']] [lines terminated by '\n'] [ignore number lines] [(col_name, ...)]
Example import command:
load data infile 'csv_file_path\\test.csv' replace into table table_name fields terminated by ',' optionally enclosed by '"' lines terminated by '\n' ignore 1 lines (Id,@dummy,DayOfWeek,PdDistrict,Address,X,Y);
Do not underestimate this simple example; it packs in a lot:
The first line specifies the file to import.
In the second line of the syntax you will find two keywords: replace and ignore. They control how rows that duplicate an existing unique key value are handled. If you specify replace, the new row overwrites the existing row that has the same unique key value. If you specify ignore, input rows that duplicate an existing unique key are skipped. If you specify neither option, an error occurs when a duplicate key is found, and the remainder of the text file is ignored.
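To make the replace/ignore difference concrete, here is a minimal sketch; the table and file names are illustrative, not from the original post:

```sql
-- Hypothetical table with a primary (unique) key on id
CREATE TABLE crimes (
    id INT PRIMARY KEY,
    district VARCHAR(64)
);

-- REPLACE: an incoming row whose id already exists overwrites the old row
LOAD DATA INFILE '/tmp/crimes.csv'
REPLACE INTO TABLE crimes
FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n';

-- IGNORE: rows in the file that duplicate an existing id are skipped
LOAD DATA INFILE '/tmp/crimes.csv'
IGNORE INTO TABLE crimes
FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n';
```

With neither keyword, the first duplicate key aborts the load with an error.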
So think carefully about the primary key when designing your table; ideally pick a field whose values never repeat.
The third and fourth lines simply say that the individual fields are separated by commas.
terminated by describes the field delimiter, which by default is the tab character (\t).
enclosed by describes the character that wraps a field, meaning that if a field is wrapped in quotation marks, everything inside the quotes (including any delimiter characters) is treated as part of the field.
There is also escaped by, which describes the escape character; the default is a backslash (\).
The fifth line, lines terminated by, defines the line separator. Watch out here: if the CSV file was generated on Windows, the separator is '\r\n'; on Linux it is '\n'.
ignore 1 lines on line six skips the first line, since the first line is usually the header row. The column list in parentheses at the end contains a very special name, @dummy: if there is a column in the CSV file you do not want to insert, map its position to @dummy.
PS: If you also want to record the import time, append set Update_time=current_timestamp; at the end.
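Putting the pieces together, a full command combining @dummy and the timestamp tip might look like this (paths, table name, and the Windows-style line ending are assumptions for illustration):

```sql
-- Sketch: skip the CSV's second column via @dummy and stamp each row
-- with the import time; assumes the table has an Update_time column
LOAD DATA INFILE '/path/to/test.csv'
REPLACE INTO TABLE crimes
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
LINES TERMINATED BY '\r\n'   -- use '\r\n' for a Windows-generated file
IGNORE 1 LINES               -- skip the header row
(Id, @dummy, DayOfWeek, PdDistrict, Address, X, Y)
SET Update_time = CURRENT_TIMESTAMP;
```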
2. Export:
There is a matching export command:
select * from table_name into outfile 'export_path\\test.csv' fields terminated by ',' optionally enclosed by '"' escaped by '"' lines terminated by '\n';
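One practical note, not from the original post: on many MySQL installations both LOAD DATA INFILE and SELECT ... INTO OUTFILE are restricted to a single server-side directory by the secure_file_priv variable. You can check it like this:

```sql
-- If either command fails with a "--secure-file-priv" error, see which
-- directory the server allows for file import/export (NULL disables both,
-- an empty string allows any path)
SHOW VARIABLES LIKE 'secure_file_priv';
```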
I hope we can exchange ideas!
Copyright notice: this is the blogger's original article; do not reproduce it without the blogger's permission.