PHP: Processing a TXT File and Importing Massive Data into the Database

There is a TXT file containing 100,000 records, in the following format:

Column 1  Column 2  Column 3  Column 4  Column 5
A  00003131  0     0  adductive#1 adducting#1 adducent#1
A  00003356  0     0  nascent#1
A  00003553  0     0  emerging#2 emergent#2
A  00003700  0.25  0  dissilient#1

... (about 100,000 more records) ...


The requirement is to import the data into the database. The data table structure is:

word_id: automatic increment

word: a column-5 field such as [adductive#1 adducting#1 adducent#1] must be converted into three SQL records, one per word

senti_value: column 3 minus column 4; if the result is 0, the record is omitted from the data table
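The conversion rule can be sketched for a single record. The tab separator between columns and the `word#sense` token format in column 5 are assumptions based on the sample data; the record here is made up for illustration:

```php
<?php
// Sketch of the per-record rule (assumes tab-separated columns and
// space-separated word#sense tokens in column 5).
$line = "A\t00003700\t0.25\t0\tadductive#1 adducting#1 adducent#1";
$cols = explode("\t", $line);
$senti_value = $cols[2] - $cols[3];      // column 3 minus column 4
$rows = [];
if ($senti_value != 0) {                 // a value of 0 means the record is skipped
    foreach (explode(" ", $cols[4]) as $term) {
        $word = explode("#", $term)[0];  // strip the #sense suffix
        $rows[] = [$word, $senti_value]; // one SQL record per word
    }
}
print_r($rows); // three rows: adductive, adducting, adducent, each with value 0.25
?>
```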

[Php]
<?php
ini_set('memory_limit', '-1'); // lift the memory limit, otherwise an "Allowed memory size exhausted" error is reported
$file  = 'words.txt';          // TXT source file with 100,000 records
$lines = file_get_contents($file);
$line  = explode("\n", $lines);
$i     = 0;
$sql   = "INSERT INTO words_sentiment (word, senti_type, senti_value, word_type) VALUES ";

foreach ($line as $key => $li) {
    $arr = explode("\t", $li);           // the five columns are tab-separated
    $senti_value = $arr[2] - $arr[3];    // column 3 minus column 4
    if ($senti_value != 0) {
        if ($i >= 20000 && $i < 25000) { // import in batches to avoid failure
            $mm = explode(" ", $arr[4]); // e.g. [adductive#1 adducting#1 adducent#1]
            foreach ($mm as $m) {        // this one TXT record becomes three SQL records
                $nn = explode("#", $m);
                $word = $nn[0];
                // word may contain single quotes (such as jack's), so the value
                // is wrapped in double quotation marks
                $sql .= "(\"$word\", 1, $senti_value, 2),";
            }
        }
        $i++;
    }
}
// echo $i;
$sql = substr($sql, 0, -1); // remove the trailing comma
// echo $sql;
// Importing 5000 entries takes about 40 seconds; importing too many at once
// may exceed max_execution_time and fail.
file_put_contents('20000-25000.txt', $sql);
?>


1. When importing massive amounts of data, pay attention to PHP's resource limits. You can raise them temporarily; otherwise an error such as the following is reported:

Allowed memory size of 33554432 bytes exhausted (tried to allocate 16 bytes)
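If raising memory_limit is undesirable, the file can also be streamed line by line with fgets() instead of loading it all with file_get_contents(). This is a minimal sketch, not the article's method; the tiny demo file is written first only to keep the sketch self-contained:

```php
<?php
// Stream the file line by line so memory use stays constant regardless of
// file size, making ini_set('memory_limit', '-1') unnecessary.
// demo.txt is a two-record stand-in for words.txt (tab-separated columns).
file_put_contents('demo.txt',
    "A\t00003131\t0\t0\tadductive#1\nA\t00003700\t0.25\t0\tdissilient#1\n");

$fh = fopen('demo.txt', 'r');
$kept = 0;
while (($li = fgets($fh)) !== false) {      // one line in memory at a time
    $arr = explode("\t", rtrim($li, "\r\n"));
    if (count($arr) < 5) continue;          // skip malformed or empty lines
    $senti_value = $arr[2] - $arr[3];
    if ($senti_value != 0) {
        $kept++;                            // process as in the main script
    }
}
fclose($fh);
echo $kept; // 1 (only the record with a non-zero value survives)
?>
```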

2. Use PHP to operate on TXT files:

file_get_contents()

file_put_contents()

3. For massive imports, it is best to import the data in batches; the chance of failure is lower.

4. Before importing a large volume of data, test the script several times with a small sample, for example 100 records.

5. If PHP's memory_limit is still insufficient after these adjustments, the program will still fail to run.

(We recommend raising memory_limit in php.ini instead of using a temporary ini_set() statement.)
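As a safer alternative to assembling quoted values by hand, a prepared statement handles the escaping of words like jack's automatically. This sketch uses an in-memory SQLite database (requires the pdo_sqlite extension) purely so it runs anywhere; the article's table presumably lives in MySQL, where only the DSN would differ:

```php
<?php
// Prepared-statement sketch: the driver quotes values, so no manual
// double-quote wrapping is needed for words containing apostrophes.
$pdo = new PDO('sqlite::memory:');
$pdo->exec('CREATE TABLE words_sentiment (
    word_id INTEGER PRIMARY KEY AUTOINCREMENT,
    word TEXT, senti_type INTEGER, senti_value REAL, word_type INTEGER)');

$stmt = $pdo->prepare(
    'INSERT INTO words_sentiment (word, senti_type, senti_value, word_type)
     VALUES (?, 1, ?, 2)');
$stmt->execute(["jack's", 0.25]);  // no manual escaping needed

echo $pdo->query('SELECT word FROM words_sentiment')->fetchColumn(); // jack's
?>
```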

