How to import massive TXT data into a database with PHP

Source: Internet
Author: User
Tags: import, database
    Column 1   Column 2   Column 3   Column 4   Column 5
    A          00003131   0          0          adductive#1 adducting#1 adducent#1
    A          00003356   0          0          nascent#1
    A          00003553   0          0          emerging#2 emergent#2
    A          00003700   0.25       0          dissilient#1
    -- about 100,000 records in total --

The requirement is to import this data into a database. The target table structure and conversion rules are:

    1. word_id: auto-increment primary key
    2. word: a TXT record such as "adductive#1 adducting#1 adducent#1" must be converted into 3 SQL records
    3. value: column 3 minus column 4; if the result is 0, the record is skipped and not inserted into the table

The code is as follows:

    <?php
    /**
     * Import massive TXT data in batches
     * http://bbs.it-home.org
     */
    ini_set('memory_limit', '-1'); // lift the memory limit, otherwise the script errors out
    $file = 'words.txt';           // source TXT file with ~100,000 records
    $lines = file_get_contents($file);
    $line = explode("\n", $lines);
    $i = 0;
    $sql = "INSERT INTO words_sentiment (word, senti_type, senti_value, word_type) VALUES ";
    foreach ($line as $key => $li) {
        $arr = explode("\t", $li); // split the line into columns (assuming tab-separated)
        $senti_value = $arr[2] - $arr[3];
        if ($senti_value != 0) {
            if ($i >= 20000 && $i < 25000) { // import in sub-batches to avoid failure
                // a TXT record like "adductive#1 adducting#1 adducent#1"
                // is converted into 3 SQL records
                $mm = explode(" ", $arr[4]);
                foreach ($mm as $m) {
                    $nn = explode("#", $m);
                    $word = $nn[0];
                    // word may contain a single quote (e.g. Jack's), so wrap it
                    // in double quotes (note the escaping)
                    $sql .= "(\"$word\", 1, $senti_value, 2),";
                }
            }
            $i++;
        }
    }
    echo $i;
    $sql = substr($sql, 0, -1); // remove the trailing comma
    // echo $sql;
    file_put_contents('20000-25000.txt', $sql);
    // Import about 5,000 rows per batch; each batch takes roughly 40 seconds.
    // Importing too many at once exhausts max_execution_time and the import fails.

Notes:

    1. When importing massive data, watch out for PHP's runtime limits; raise them temporarily or the script will fail with an error such as "Allowed memory size of 33554432 bytes exhausted (tried to allocate ... bytes)".
    2. PHP reads and writes the TXT files with file_get_contents() and file_put_contents().
    3. For a mass import, import in batches; it lowers the chance of failure.
    4. Before the full import, test the script repeatedly on a small sample, e.g. 100 records.
    5. If memory_limit is still too low, the script will not run; it is better to raise memory_limit in php.ini than to rely on a temporary ini_set() statement.
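Note 3 can be sketched as a helper that splits the generated VALUES rows into fixed-size batches and writes one SQL file per batch, matching the "20000-25000.txt" naming used above. This is an illustrative sketch; the function name `writeBatches` is hypothetical.

```php
<?php
// Sketch of note 3 above: write the VALUES rows out in fixed-size batches,
// one SQL file per batch (illustrative function name).
function writeBatches(array $rows, int $batchSize = 5000): array
{
    $files = [];
    foreach (array_chunk($rows, $batchSize) as $n => $chunk) {
        $start = $n * $batchSize;
        // file name covers the row range, e.g. "20000-25000.txt"
        $file = sprintf('%d-%d.txt', $start, $start + count($chunk));
        file_put_contents($file,
            "INSERT INTO words_sentiment (word, senti_type, senti_value, word_type) VALUES "
            . implode(',', $chunk) . ';');
        $files[] = $file;
    }
    return $files;
}
```

Each resulting file is a single multi-row INSERT statement, small enough to run within max_execution_time; the last file simply holds the remainder.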

That's all. If you feel up to it, grab a large TXT data set and test it yourself.
