[Django] batch data import, django Data Import

Source: Internet
Author: User


After a month of revision, my exams are finally over. During that time I looked into how to import data into the database in batches when building a website with Django.

The process was rough, and I made many basic mistakes, which are mentioned below. The data is imported with a py script. For the script content, refer to Ziqiang Emy-intermediate tutorial-data import.

Note: This article mainly records my own learning experience. It is not a tutorial!

Body: Django's bulk_create() function is used to implement the batch import. Why choose it?

1 bulk_create() stores multiple records by executing a single SQL statement, which makes the import faster;

2 bulk_create() reduces the number of SQL statements that are executed.
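To make the "one SQL statement for many rows" idea concrete, here is a minimal sketch that uses the standard-library sqlite3 module as a stand-in for Django's database layer. The table name lorder, the helper multi_row_insert, and the sample rows are made up for illustration. The SQL that bulk_create() actually generates depends on the database backend, so treat this only as a picture of the general shape.

```python
import sqlite3


def multi_row_insert(conn, table, columns, rows):
    """Insert all rows with a single multi-row INSERT statement.

    Hypothetical helper for illustration only; it is not part of Django.
    """
    if not rows:
        return 0
    one_row = "(" + ", ".join("?" for _ in columns) + ")"
    sql = "INSERT INTO {} ({}) VALUES {}".format(
        table, ", ".join(columns), ", ".join(one_row for _ in rows)
    )
    # Flatten [(a, b), (c, d)] into [a, b, c, d] to match the placeholders.
    params = [value for row in rows for value in row]
    conn.execute(sql, params)
    conn.commit()
    return len(rows)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE lorder (serv_id TEXT, acct_code TEXT)")
rows = [("1001", "A01"), ("1002", "A02"), ("1003", "A03")]
inserted = multi_row_insert(conn, "lorder", ["serv_id", "acct_code"], rows)
stored = conn.execute("SELECT COUNT(*) FROM lorder").fetchone()[0]
print(inserted, stored)
```

Saving each object separately would send one INSERT per row; here the three rows travel in one statement.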

Then prepare the data source to be imported. It can be an xls, csv, or txt file, or another text document.

Finally, write the py script and run it!

The py script is as follows:

# coding: utf-8
import os

os.environ.setdefault("DJANGO_SETTINGS_MODULE", "www.settings")

'''
When the Django version is 1.7 or later, the following two statements are required:
    import django
    django.setup()
Otherwise this error is raised:
    django.core.exceptions.AppRegistryNotReady: Models aren't loaded yet.
'''
import django
import datetime

if django.VERSION >= (1, 7):  # check the Django version automatically
    django.setup()

from keywork.models import LOrder

f = open('cs.csv')
WorkList = []
next(f)  # move the file pointer past the header line
for line in f:
    # drop the trailing newline and remove the double quotes
    parts = line.strip().replace('"', '')
    parts = parts.split(';')  # split the line on ';'
    WorkList.append(LOrder(
        serv_id=parts[0],
        serv_state_name=parts[1],
        acct_code=parts[2],
        acct_name=parts[3],
        acc_nbr=parts[4],
        user_name=parts[5],
        frod_addr=parts[6],
        mkt_chnl_name=parts[7],
        mkt_grid_name=parts[8],
        com_chnl_name=parts[9],
        com_grid_name=parts[10],
        product_name=parts[11],
        access_name=parts[12],
        completed_time=parts[13],
        remove_data=parts[14],
        service_offer_name=parts[15],
        org_name=parts[16],
        staff_name=parts[17],
        staff_code=parts[18],
        handle_time=parts[19],
        finish_time=parts[20],
        prod_offer_name=parts[21],
        eff_date=parts[22],
        exp_date=parts[23],
        main_flag=parts[24],
        party_name=parts[25],
    ))
f.close()
LOrder.objects.bulk_create(WorkList)
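The script above strips quotes and splits on ';' by hand. As an alternative sketch, the standard csv module can do both, and it also copes with a ';' that appears inside a quoted value. The helper read_rows, the shortened FIELD_NAMES list, and the sample data are assumptions for illustration, not part of the original script.

```python
import csv
import io

# Hypothetical subset of the column names used in the script above.
FIELD_NAMES = ["serv_id", "serv_state_name", "acct_code"]


def read_rows(fileobj, field_names, delimiter=";"):
    """Yield one dict of keyword arguments per data row, skipping the header."""
    reader = csv.reader(fileobj, delimiter=delimiter, quotechar='"')
    next(reader, None)  # skip the header row; no error if the file is empty
    for row in reader:
        if not row:
            continue  # ignore blank lines
        yield dict(zip(field_names, row))


sample = io.StringIO(
    'serv_id;serv_state_name;acct_code\n'
    '"1001";"active";"A01"\n'
    '"1002";"closed; archived";"A02"\n'
)
records = list(read_rows(sample, FIELD_NAMES))
print(records)
```

Each dict could then be turned into a model instance with LOrder(**record) before calling bulk_create().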

With reference to the script above, here are the main problems I ran into while learning.

Question 1: The first line of the data source is usually the field names, and the data starts on the second line. The script therefore calls next(f) to move the file pointer to the second line before processing. Without it, the header row is imported as data. The field names are plain English strings, so when one of them lands in a date field the script fails with a ValidationError saying the value must be in YYYY-MM-DD HH:MM[:ss[.uuuuuu]][TZ] format. In other words, the imported data does not match the format the model expects!
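The same mismatch can happen with real data rows if the date strings in the file are not in the format Django expects. A minimal sketch of converting them before building the model instances follows. The input format, the helper parse_datetime, and the sample lines are assumptions; adjust the format string to the actual data source.

```python
from datetime import datetime

# Assumed input format; adjust it to match the actual data source.
INPUT_FORMAT = "%Y/%m/%d %H:%M:%S"


def parse_datetime(text, fmt=INPUT_FORMAT):
    """Return a datetime for a non-empty string, or None for an empty one."""
    text = text.strip()
    if not text:
        return None
    return datetime.strptime(text, fmt)


lines = [
    "serv_id;completed_time",       # header row
    "1001;2015/03/01 08:30:00",
    "1002;",                        # empty value
]
rows = iter(lines)
next(rows)  # skip the header, as the script does with next(f)
parsed = [parse_datetime(line.split(";")[1]) for line in rows]
print(parsed)
```

Storing None only works if the model field is declared with null=True.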

Question 2: Note the line parts = parts.split(';'), which splits the string on ';'. Every row of the imported data has a delimiter between its columns, such as the comma in csv files or whitespace in text exported from xls, so split() must be given the delimiter that the data source actually uses.

The following example shows how to use the split() function:

#!/usr/bin/python
text = "Line1-abcdef \nLine2-abc \nLine4-abcd"
print(text.split())        # split on any run of whitespace
print(text.split(' ', 1))  # split on the first space only

The output of the example above is as follows:

['Line1-abcdef', 'Line2-abc', 'Line4-abcd']
['Line1-abcdef', '\nLine2-abc \nLine4-abcd']

Question 3: If the imported data source is larger than the maximum packet size the database allows (10 MB in my case), the script above will not run successfully. Take MySQL as an example: if the imported data exceeds that setting, error 2006 (MySQL server has gone away) is reported. To fix it, add the following statements under [mysqld] in my.ini:

max_allowed_packet = 300M      # maximum allowed packet size: 300 MB
wait_timeout = 200000          # seconds of inactivity before a non-interactive connection is closed
interactive_timeout = 200000   # seconds of inactivity before an interactive connection is closed
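Besides raising the server limits, the import can be split into smaller batches so that no single statement gets too large. Django's bulk_create() accepts a batch_size argument for this, for example LOrder.objects.bulk_create(WorkList, batch_size=1000), where the value 1000 is only an illustrative choice. The sketch below shows the underlying idea in plain Python; the helper chunked and the stand-in list are assumptions for illustration.

```python
from itertools import islice


def chunked(iterable, size):
    """Yield successive lists of at most `size` items from `iterable`."""
    if size < 1:
        raise ValueError("size must be at least 1")
    iterator = iter(iterable)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


work_list = list(range(10))  # stand-in for the list of LOrder instances
batches = list(chunked(work_list, 4))
print(batches)
```

Each batch would then be passed to bulk_create() in turn, which keeps every INSERT well below the packet limit.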

Note: If there are any errors in this article, please point them out. Thank you!
