Text: Ttt.txt Total 7,774,865 Records
Display some of the text content as follows:
"object_id", "object_name", "CREATED"
"A", "icol$", "2013/10/9 18:23:42"
"$", "I_user1", "2013/10/9 18:23:42"
"con$", "2013/10/9 18:23:42"
"All", "undo$", "2013/10/9 18:23:42"
"C_cobj#", "2013/10/9 18:23:42"
"3", "i_obj#", "2013/10/9 18:23:42"
"+", "proxy_role_data$", "2013/10/9 18:23:42"
"A", "I_ind1", "2013/10/9 18:23:42"
"WU", "I_cdef2", "2013/10/9 18:23:42"
"Max", "I_obj5", "2013/10/9 18:23:42"
"A", "i_proxy_role_data$_1", "2013/10/9 18:23:42"
"+", "file$", "2013/10/9 18:23:42"
"A", "uet$", "2013/10/9 18:23:42"
"9", "i_file#_block#", "2013/10/9 18:23:42"
"A", "I_file1", "2013/10/9 18:23:42"
"Wuyi", "I_con1", "2013/10/9 18:23:42"
"A", "i_obj3", "2013/10/9 18:23:42"
"7", "i_ts#", "2013/10/9 18:23:42"
"I_cdef4", "2013/10/9 18:23:42"
......
......
To facilitate testing:
We select the first 10001 rows, a total of 10,000 data (the first is the field name)
[11:05:12 [email protected] desktop]$ head-10001 ttt.txt > Test.txt
The following is a script that I have written in Python and Shell, which does not mean that the method of execution of the script is optimal, compared with the individual subjective opinion
Version 1.Shell
#!/bin/sh#mysql_import.sh# reads Ttt.txt files, writes them to the database i=0 ################### #设定新分隔符 ################ # # # #SAVEDIFS = $IFSIFS = ', ' #mysql连接函数mysql_conn () {mysql -e $1;} echo "begin time: ' Date '" #显示开始时间 ############### #创建数据库myimport ################ #mysql -e "drop database if exists myimport;" #mysql_conn "drop database if exists myimport;" mysql -e "create database myimport;" #mysql_conn "create database myimport;" #mysql -e "show databases;" ############## #对ttt. txt for processing ################## #sed -e ' s/\ '//g ' -e ' s#/#-#g ' ttt.txt >t.txt #i = 0 o'clock, read the first line of text, create table import_obj, set I=1#i=1, read other lines of text, insert data into Import_obj # Read data line by row, and pass mysql -e Command INSERT INTO table while read id name createddoif [ $i -eq 0 ]; then #echo $ID, $NAME, $CREATEDid = $IDname = $NAMEcreated = $CREATEDmysql -e "Create table if not exists myimport.import_obj (id int unsigned auto_increment primary key, $ID int unsigned unique not null, $NAME varchar ( not null), $CREATED datetime not null); " i=1#mysql -e "desc myimport.import_obj;" else #echo $CREATED #mysql -e select str_to_date ($CREATED, '%y-%m-%d %h:%i:%s ') ;" mysql -e "Insert into myimport.import_obj ($id, $name, $created) value ($ID, ' $NAME ', ' $ CREATED '); " fidone <t.txtecho "end time: ' Date '" #显示结束时间 #mysql -e "Select * from myimport.import_obj; " rm -f t.txt #删除临时文件IFS = $SAVEDIFS #还原分隔符
2.python version (Python 2.7.5 version)
#!/usr/bin/python# mysql_import.py# uses MYSQLDB to implement Import mysqldb,os,timeos.system ("sed -e " s/ \ "//g ' -e ' s#/#-#g ' test.txt >t.txt");try: print Time.strftime ("%y-%m-%d %h:%m:%s", Time.localtime (Time.time ())) #记录开始时间 Conn=mysqldb.connect (host= ' 127.0.0.1 ', user= ' root ', passwd= ' zhang1992 ') print "conn success "except: print " conn error! " exit () Cur=conn.cursor () Cur.execute (' drop database if exists Myimport ') cur.execute (' Create database myimport ') i=0count=0with open (' T.txt ', ' R ') as Ft: for row in ft.readlines (): id,name,created=list (Row.strip ("\ n"). Split (",")) if I==0: cid=id cname=name created=CREATED cur.execute ("Create table if not exists myimport.import_obj (Id int unsigned auto_increment primary key,%s int unsigned unique not null,%s varchar ( not null,%s datetime not) null) "% (id,name,created)) i=1 else: sql= "Insert into myimport.import_obj (%s,%s,%s) value (%d, '%s ', '%s ')"% (CID, Cname,created,int (ID), name,created) Cur.execute (SQL) Os.system ("RM&NBSP;-F&NBsp;t.txt ") Cur.close () Conn.commit () Conn.close () print time.strftime ("%y-%m-%d %h:%m:%s ", Time.localtime (Time.time ())) #记录结束始时间
3. Testing
10,000 Data Tests
[11:08:32 [email protected] desktop]$ bash mysql_import.sh Begin time:tue Jul 11:08:35 CST 2016End time:tue Jul 11:11:50 CST 2016[11:11:50 [email protected] desktop]$ python mysql_import.py 2016-07-19 11:1 6:46conn success2016-07-19 11:16:52
Python handles full text testing
[13:06:30 [email protected] desktop]$ python mysql_import.py 2016-07-19 13:06:35conn success2016-07-19 13:14:57[ 13:14:57 [email protected] desktop]$ mysql-e "SELECT COUNT (*) from myimport.import_obj"; +----------+| COUNT (*) |+----------+| 774864 |+----------+
As you can see from the above test results, Python is significantly more efficient than the shell.
The shell cannot save the MySQL connection state, resulting in the need to reconnect MySQL and disconnect each execution of an INSERT statement. Here's the MySQLdb in Python is submitted through the transaction, all executed, one commit completes that all inserts, saves the connection and the disconnection time.
This article is from the "Big Waste" blog, please be sure to keep this source http://bigtrash.blog.51cto.com/8966424/1827697
Shell/python implementing MySQL read txt text