PHP processing of large amounts of data

Source: Internet
Author: User
For example, suppose we query millions of records from two tables and need to assemble the data and insert it into another table. What can we do besides assembling the data in arrays? And if arrays are used, how can we make sure the memory limit is not exceeded?

Reply content:

The mysql_query function fetches the entire result set and caches it in memory, and that is what pushes memory usage past the limit. Another function, mysql_unbuffered_query, solves this problem: it does not cache the result set, but lets you work on the rows as soon as the query starts returning them, that is, rows are returned while the query is still running, so the memory limit is not exceeded. There are costs, however. With mysql_unbuffered_query, functions that need the complete result set (for example, to count its rows) cannot be used, and before sending a new SQL query to MySQL you must fetch all the result rows produced by the unbuffered query. Note that the mysql_* functions were removed in PHP 7; the PDO code below uses the equivalent setting. For example:

Code that uses a buffered (cached) result set:

    function selecttest()
    {
        try {
            $pdo = new PDO("mysql:host=localhost;dbname=test", 'root', '123');
            // Uncomment the next line to stop buffering (caching) the result set
            // $pdo->setAttribute(PDO::MYSQL_ATTR_USE_BUFFERED_QUERY, false);
            $sth = $pdo->prepare('select * from test');
            $sth->execute();
            echo 'memory size initially occupied: ' . memory_get_usage() . "\n";
            $i = 0;
            // Print at most the first 10 rows, one per second
            while ($result = $sth->fetch(PDO::FETCH_ASSOC)) {
                $i += 1;
                if ($i > 10) {
                    break;
                }
                sleep(1);
                print_r($result);
                echo 'memory usage: ' . memory_get_usage() . "\n";
            }
        } catch (Exception $e) {
            echo $e->getMessage();
        }
    }

Running it reports an out-of-memory error:

    Fatal error: Allowed memory size of 134217728 bytes exhausted (tried to allocate 204800000 bytes) in E:\ProgramDevelopment\RuntimeEnvironment\xampp\htdocs\test\test.php on line 56
    Call Stack:
        0.0005     135392   1. {main}() E:\ProgramDevelopment\RuntimeEnvironment\xampp\htdocs\test\test.php:0
        0.0005     135568   2. test->selecttest() E:\ProgramDevelopment\RuntimeEnvironment\xampp\htdocs\test\test.php:85
        0.0050     142528   3. PDOStatement->execute() E:\ProgramDevelopment\RuntimeEnvironment\xampp\htdocs\test\test.php:56

Now uncomment the line $pdo->setAttribute(PDO::MYSQL_ATTR_USE_BUFFERED_QUERY, false); in the code above, so that the result set is no longer buffered. Running the function then gives the following result:

    memory size initially occupied: 144808
    Array ([id] => 1 [a] => v [B] => w [c] => I)
    memory usage: 145544
    Array ([id] => 2 [a] => B [B] => l [c] => q)
    memory usage: 145544
    Array ([id] => 3 [a] => m [B] => p [c] => h)
    memory usage: 145536
    Array ([id] => 4 [a] => j [B] => I [c] => B)
    memory usage: 145536

As you can see, fetching the rows adds very little memory, only a little over 700 bytes, so the memory limit is not exceeded.
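
Applied to the original question, the same idea could be sketched as follows. This is only a sketch under assumptions: the table and column names (table_a, table_b, table_c, a_id, name, amount) and the function names are hypothetical, and a second connection is used for writing because, as noted above, no new query can be sent on a connection while an unbuffered result still has unread rows.

```php
<?php
// Hypothetical schema: table_a(id, name), table_b(a_id, amount), table_c(a_id, name, amount).

// Streams the joined rows one at a time instead of loading them all into an array.
function streamJoinedRows(PDO $reader): Generator
{
    $reader->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
    // Unbuffered mode: rows stay on the server until they are fetched.
    $reader->setAttribute(PDO::MYSQL_ATTR_USE_BUFFERED_QUERY, false);

    $stmt = $reader->query(
        'SELECT a.id, a.name, b.amount FROM table_a a JOIN table_b b ON b.a_id = a.id'
    );
    while ($row = $stmt->fetch(PDO::FETCH_ASSOC)) {
        yield $row;
    }
    $stmt->closeCursor();
}

// Copies every streamed row into the target table through a separate connection.
function copyRows(PDO $reader, PDO $writer): int
{
    $writer->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
    $insert = $writer->prepare(
        'INSERT INTO table_c (a_id, name, amount) VALUES (?, ?, ?)'
    );

    $count = 0;
    foreach (streamJoinedRows($reader) as $row) {
        $insert->execute([$row['id'], $row['name'], $row['amount']]);
        $count++;
    }
    return $count;
}
```

Calling copyRows() with two PDO connections to the same database would then move the data while holding only one row in PHP memory at a time.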

Using arrays... loading everything into memory... haha... that is bound to hurt ~

The solution is a newer PHP feature... in fact, it is not all that new: generators.

A generator is also a function. The difference is that it returns its values one after another, in sequence, rather than returning a single value. In other words, a generator lets you implement the Iterator interface more conveniently. The following shows how to implement an xrange function:
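
A minimal sketch of such an xrange() generator, assuming PHP 7 or later. The parameter names and the optional step argument are assumptions made for illustration, not the original author's listing:

```php
<?php
// Yields $start, $start + $step, ... for as long as the value does not exceed $end.
// Only the current value is held in memory; no array is ever built.
function xrange(int $start, int $end, int $step = 1): Generator
{
    // A generator body runs lazily, so this check fires when iteration starts.
    if ($step <= 0) {
        throw new InvalidArgumentException('Step must be positive');
    }
    for ($i = $start; $i <= $end; $i += $step) {
        yield $i;
    }
}

// Usage: add up one million numbers without creating a million-element array.
$sum = 0;
foreach (xrange(1, 1000000) as $number) {
    $sum += $number;
}
echo $sum, "\n";
```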


  

The xrange() function above does the same job as PHP's built-in range() function. The difference is that range() returns an array containing every value from 1 to 1,000,000 (see the manual), whereas xrange() returns an iterator that produces these values one at a time, without ever building them into an array.

The advantage of this approach is obvious: it lets you process large data sets without loading them into memory all at once. You can even process infinite data streams.
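
To illustrate the point about infinite streams, here is a small sketch. The helper names countUp() and take() are hypothetical and chosen only for this example:

```php
<?php
// An endless stream: yields $start, $start + 1, $start + 2, ... and never finishes.
function countUp(int $start): Generator
{
    for ($i = $start; ; $i++) {
        yield $i;
    }
}

// Reads at most $limit values from any iterable and then stops pulling from it.
function take(iterable $source, int $limit): Generator
{
    if ($limit <= 0) {
        return;
    }
    foreach ($source as $value) {
        yield $value;
        if (--$limit === 0) {
            return;
        }
    }
}

// Only the three requested values are ever produced.
foreach (take(countUp(1), 3) as $n) {
    echo $n, "\n";
}
```

Because values are produced only when the consumer asks for them, the endless countUp() loop costs nothing beyond the values actually read.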

Of course, the same thing can also be done without a generator, by implementing the Iterator interface directly. However, using a generator is more convenient, because there is no need to write the five methods of the Iterator interface.
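
For comparison, a hand-written version might look like the sketch below. The class name XRangeIterator is hypothetical; the five methods are the ones the Iterator interface requires (rewind, valid, current, key, next):

```php
<?php
// The same 1..N sequence, written by hand against the Iterator interface.
final class XRangeIterator implements Iterator
{
    private $start;
    private $end;
    private $value;
    private $index = 0;

    public function __construct(int $start, int $end)
    {
        $this->start = $start;
        $this->end = $end;
        $this->value = $start;
    }

    // Called once at the start of every foreach loop.
    public function rewind(): void
    {
        $this->value = $this->start;
        $this->index = 0;
    }

    // Tells foreach whether there is still a value to read.
    public function valid(): bool
    {
        return $this->value <= $this->end;
    }

    public function current(): int
    {
        return $this->value;
    }

    public function key(): int
    {
        return $this->index;
    }

    // Advances to the next value.
    public function next(): void
    {
        $this->value++;
        $this->index++;
    }
}

foreach (new XRangeIterator(1, 3) as $key => $value) {
    echo $key, ' => ', $value, "\n";
}
```

All of this bookkeeping is what a generator with a single yield statement takes care of automatically.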

I recommend reading an article by laruence: using coroutines in PHP to implement multi-task scheduling

Avoiding memory overflow is simple: optimize the algorithm and do the work in batches, so that less data is read each time.
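
One way to sketch the batching idea is a helper that groups any stream of rows into fixed-size chunks. The function name inBatches() and the batch size are assumptions made for illustration:

```php
<?php
// Groups rows from any iterable into arrays of at most $size rows,
// so that only one batch is held in memory at a time.
function inBatches(iterable $rows, int $size): Generator
{
    // A generator body runs lazily, so this check fires when iteration starts.
    if ($size < 1) {
        throw new InvalidArgumentException('Batch size must be at least 1');
    }
    $batch = [];
    foreach ($rows as $row) {
        $batch[] = $row;
        if (count($batch) === $size) {
            yield $batch;
            $batch = [];
        }
    }
    // Hand over the final, possibly smaller, batch.
    if ($batch !== []) {
        yield $batch;
    }
}

foreach (inBatches([1, 2, 3, 4, 5], 2) as $batch) {
    echo implode(',', $batch), "\n";
}
```

In the scenario from the question, each batch could be written to the target table before the next one is read, which keeps memory usage bounded by the batch size rather than by the total number of rows.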

Written on 1.1
