Oracle + Python: Copy data from table A to table B

Last Update:2017-12-09 Source: Internet

Author: User

Tags prepare

While studying Python recently, I saw that with Python and Oracle (cx_Oracle) you can not only fetch more than one row at a time but also insert more than one row at a time. I wanted to write a program that copies the data of table A into table B to see whether this actually improves efficiency. The result was surprising: the copy ran nearly twice as fast! Of course, you might think this is pointless.

There are many ways to copy data from table A to table B. The usual one is a direct INSERT:

INSERT INTO TableB SELECT * FROM TableA;

But when the amount of data is very large, reaching hundreds of millions of rows, this becomes frustrating: it runs very slowly, you cannot see any progress, and occasionally it fails because the database runs out of rollback segment space.

So, at such times, I usually do it with cursors:

DECLARE
  v_num NUMBER;
BEGIN
  v_num := 0;
  FOR v_cur IN (SELECT t.prod_inst_id, t.acc_num, t.user_name
                  FROM cust30.prod_inst t
                 WHERE rownum < 50000) LOOP
    INSERT INTO test_prod_inst
    VALUES (v_cur.prod_inst_id, v_cur.acc_num, v_cur.user_name);
    v_num := v_num + 1;
    -- commit once every 50000 rows
    IF MOD(v_num, 50000) = 0 THEN
      COMMIT;
    END IF;
  END LOOP;
  -- commit the rows left over after the last full batch
  COMMIT;
END;

(You can also fetch several rows at a time with a bulk fetch, but in my tests it was not much faster.) The idea now is to do the same job in Python instead. The actual code is as follows:

#!/home/orashell/python27/bin/python
# -*- coding: utf-8 -*-
import os
import cx_Oracle

# This must be set, otherwise inserted Chinese text will be garbled
os.environ['NLS_LANG'] = 'SIMPLIFIED CHINESE_CHINA.UTF8'

# Destination database
trans_to_db = cx_Oracle.connect('user/pass#@servicename')
# Source database
trans_from_db = cx_Oracle.connect('user/pass#@servicename')
# Open the query cursor
curselect = trans_from_db.cursor()
# Open the insert cursor
curinsert = trans_to_db.cursor()


# Generate the insert statement from a cursor. The output looks like:
#   insert into test_prod_inst (prod_inst_id,acc_num,user_name) values (:1,:2,:3)
# Input  fromcur: a cursor object that has already executed a query
# Input  totable: the destination table name
# Output returnstr: the generated SQL
def getinsertsql(fromcur, totable):
    # My habit :) build a string template first, then replace the placeholders
    returnstr = 'insert into ' + totable + ' (SELECTSTR) values (INSERTSTR)'
    # cx_Oracle's cursor description is a list of tuples, one per column;
    # the first element of each tuple is the column name, for example
    # ('PROD_INST_ID', <type 'cx_Oracle.NUMBER'>, ...)
    curdesc = fromcur.description
    selectstr = ''
    insertstr = ''
    num = 0
    # Build the SELECTSTR and INSERTSTR parts of the template
    for i in curdesc:
        num = num + 1
        selectstr = selectstr + i[0] + ','
        insertstr = insertstr + ':' + str(num) + ','
    # Strip the trailing ','
    selectstr = selectstr[0:len(selectstr) - 1]
    insertstr = insertstr[0:len(insertstr) - 1]
    # Replace the placeholders
    returnstr = returnstr.replace('SELECTSTR', selectstr)
    returnstr = returnstr.replace('INSERTSTR', insertstr)
    return returnstr


# The function that does the actual work
def runmain():
    # Open the source cursor with a SQL query
    curselect.execute('select t.prod_inst_id, t.acc_num, t.user_name '
                      'from cust30.prod_inst t where rownum<10000')
    # Get the statement for the insert cursor
    manyinserstr = getinsertsql(curselect, 'test_prod_inst')
    # Prepare the insert cursor
    curinsert.prepare(manyinserstr)
    while True:
        # fetchone returns one row as a tuple, while fetchmany returns a list
        # of row tuples, so a fetchone result cannot be passed to executemany
        # without conversion.
        # The batch size argument is missing in the source; with no argument,
        # fetchmany fetches cursor.arraysize rows.
        x = curselect.fetchmany()
        # Exit when there is nothing left to copy
        if len(x) == 0:
            break
        # Insert (None reuses the prepared statement)
        curinsert.executemany(None, x)
        # Commit
        trans_to_db.commit()


# Execute
if __name__ == '__main__':
    runmain()
    trans_from_db.close()
    trans_to_db.close()
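The statement-building step can be tried without a database connection. The following is only a sketch: build_insert_sql and fake_description are hypothetical names that do not appear in the script above, and the fake description fills in only the first element of each tuple (the column name), which is the only part this step reads.

```python
def build_insert_sql(description, table):
    """Build 'insert into T (c1,c2) values (:1,:2)' from a cursor description."""
    # The first element of each description tuple is the column name.
    columns = [col[0] for col in description]
    # Positional bind variables :1, :2, ... one per column.
    binds = [':%d' % (i + 1) for i in range(len(columns))]
    return 'insert into %s (%s) values (%s)' % (
        table, ','.join(columns), ','.join(binds))


# Hypothetical stand-in for cursor.description; the remaining six
# fields of each tuple are not used here, so they are left as None.
fake_description = [
    ('PROD_INST_ID', None, None, None, None, None, None),
    ('ACC_NUM', None, None, None, None, None, None),
    ('USER_NAME', None, None, None, None, None, None),
]

sql = build_insert_sql(fake_description, 'test_prod_inst')
print(sql)
# insert into test_prod_inst (PROD_INST_ID,ACC_NUM,USER_NAME) values (:1,:2,:3)
```

Joining the lists with ',' avoids having to strip a trailing comma afterwards. The table name is concatenated into the SQL text, so it should only ever come from trusted input.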

I thought this would be slower, because the data actually travels over the network in batches (database to local machine to database), while PL/SQL does not use the network at all. But after inserting 50 million rows this way, the result was the opposite: it took 64 seconds, while the approach shown earlier took 113 seconds, so it was nearly twice as fast. This was a copy between two tables in the same database. If two databases were involved, with the copy going across a DBLink, the difference would be even more obvious.
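The fetch-many and insert-many loop can be tried without an Oracle instance. The sketch below uses Python's built-in sqlite3 module purely as a stand-in, so it says nothing about Oracle timings. The function copy_in_batches, the batch size of 1000, and the 2500 generated rows are all assumptions made for illustration. Note that sqlite3 uses ? placeholders where cx_Oracle uses :1, :2.

```python
import sqlite3


def copy_in_batches(src_conn, dst_conn, select_sql, insert_sql, batch_size):
    """Copy rows from src to dst, batch_size rows per round trip."""
    src_cur = src_conn.cursor()
    dst_cur = dst_conn.cursor()
    src_cur.execute(select_sql)
    copied = 0
    while True:
        rows = src_cur.fetchmany(batch_size)  # list of row tuples
        if not rows:                          # empty list: source exhausted
            break
        dst_cur.executemany(insert_sql, rows)
        dst_conn.commit()                     # commit once per batch
        copied += len(rows)
    return copied


# Two in-memory databases stand in for the source and destination.
src = sqlite3.connect(':memory:')
dst = sqlite3.connect(':memory:')
src.execute('create table prod_inst '
            '(prod_inst_id integer, acc_num text, user_name text)')
src.executemany('insert into prod_inst values (?, ?, ?)',
                [(i, 'acc%d' % i, 'user%d' % i) for i in range(2500)])
src.commit()
dst.execute('create table test_prod_inst '
            '(prod_inst_id integer, acc_num text, user_name text)')

copied = copy_in_batches(
    src, dst,
    'select prod_inst_id, acc_num, user_name from prod_inst',
    'insert into test_prod_inst values (?, ?, ?)',
    1000)
total = dst.execute('select count(*) from test_prod_inst').fetchone()[0]
print(copied, total)
```

Checking for an empty batch before calling executemany means the insert is never attempted with no rows, and committing per batch keeps each transaction small.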

I can guess at two reasons:

A: When actually inserting, cx_Oracle may split the work across multiple threads. If the data volume is really large, the PL/SQL side could also be split into several parts by taking a modulus, and the efficiency may then end up limited by I/O.

B: Oracle's memory management is more complex and may consume more resources than the equivalent manual management in Python.

I hope an expert can clear up my doubts.

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
