Today I imported CSV data downloaded from Kaggle into MySQL. The CSV file was a bit large for doing statistical analysis directly, so the practical way was to import the data with a MySQL command. I'm noting it down here for future reference:
1. Import:
Basic syntax: LOAD DATA [LOW_PRIORITY] [LOCAL] INFILE 'file_name.txt' [REPLACE | IGNORE] INTO TABLE tbl_name [CHARACTER SET gbk] [FIELDS [TERMINATED BY '\t'] [[OPTIONALLY] ENCLOSED BY ''] [ESCAPED BY '\\']] [LINES TERMINATED BY '\n'] [IGNORE number LINES] [(col_name, ...)]
Example import command:
LOAD DATA INFILE 'csv file path\test.csv' REPLACE INTO TABLE table_name FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"' LINES TERMINATED BY '\n' IGNORE 1 LINES (Id, @dummy, DayOfWeek, PdDistrict, Address, X, Y);
Don't underestimate this simple example; it contains quite a few things.
The first clause, LOAD DATA INFILE, names the file to import;
In the second clause you will notice two keywords: REPLACE and IGNORE. These keywords control how rows that duplicate an existing unique key are handled. If you specify REPLACE, the new row replaces the existing row with the same unique key value. If you specify IGNORE, input rows that duplicate an existing row on a unique key are skipped. If you specify neither, an error occurs when a duplicate key is found, and the rest of the text file is ignored.
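To make the difference concrete, here is a minimal sketch; the table name `crimes`, its columns, and the file path are made up for illustration:

```sql
-- Hypothetical table with a unique key on Id
CREATE TABLE crimes (
    Id INT PRIMARY KEY,
    DayOfWeek VARCHAR(16)
);

-- REPLACE: a CSV row whose Id already exists overwrites the stored row
LOAD DATA INFILE '/tmp/crimes.csv' REPLACE INTO TABLE crimes
FIELDS TERMINATED BY ',' LINES TERMINATED BY '\n';

-- IGNORE: a CSV row whose Id already exists is silently skipped
LOAD DATA INFILE '/tmp/crimes.csv' IGNORE INTO TABLE crimes
FIELDS TERMINATED BY ',' LINES TERMINATED BY '\n';
```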
So I think that in practice, it is better to design your tables with primary keys that imported rows will not duplicate;
The third and fourth clauses handle field separation. FIELDS TERMINATED BY specifies the field delimiter; the default is a tab character (\t).
ENCLOSED BY describes the character that wraps each field: with ENCLOSED BY '"', a delimiter that appears inside the quotation marks is treated as part of the field rather than as a field separator.
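For example, given a CSV file like the one sketched in the comments below (contents invented for illustration), the quoted comma stays inside the second field:

```sql
-- CSV contents (for illustration):
--   1,"100 Block of MAIN ST, SUITE 2",Monday
-- With OPTIONALLY ENCLOSED BY '"', the comma inside the quotes is kept
-- as part of the second field instead of starting a new field.
LOAD DATA INFILE '/tmp/sample.csv' INTO TABLE some_table
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
LINES TERMINATED BY '\n';
```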
The syntax also offers ESCAPED BY, which specifies the escape character; the default is the backslash (\).
LINES TERMINATED BY in the fifth clause splits the rows. Note one pitfall here: if the CSV file was generated on Windows, use '\r\n' to separate lines; on Linux, use '\n'.
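For a Windows-generated CSV, the same import would use the '\r\n' terminator with the other clauses unchanged (table name and path are placeholders):

```sql
LOAD DATA INFILE '/tmp/windows_export.csv' INTO TABLE some_table
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
LINES TERMINATED BY '\r\n'  -- Windows line endings
IGNORE 1 LINES;
```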
The sixth clause, IGNORE 1 LINES, skips the first line, because the first line is usually the header row. In the column list in parentheses there is a special name, @dummy: if the CSV file contains a column I don't want to insert, I replace that column's name with @dummy.
PS: To record the import time as well, append SET update_time = CURRENT_TIMESTAMP at the end;
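Putting the pieces above together, a full import that also stamps each row might look like this (assuming the target table has an update_time column; the path is a placeholder):

```sql
LOAD DATA INFILE 'csv file path\test.csv' REPLACE INTO TABLE table_name
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
LINES TERMINATED BY '\n'
IGNORE 1 LINES
(Id, @dummy, DayOfWeek, PdDistrict, Address, X, Y)
SET update_time = CURRENT_TIMESTAMP;
```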
2. Export:
For completeness, here is the export command:
SELECT * FROM table_name INTO OUTFILE 'export path\test.csv' FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"' ESCAPED BY '"' LINES TERMINATED BY '\n';
I hope we can exchange more ideas!