Implementing batch processing in PHP

What if a feature of your Web application takes more than a second or two to complete? Some kind of offline processing solution is required. Learn several ways to move long-running jobs in PHP applications into offline processing.
Large retail chains have a big problem. Every day, thousands of transactions occur in each store. Company executives want to dig into the data. Which products sell well? Which sell badly? Where do organic products sell well? How are ice cream sales doing?
To capture this data, the organization must load all of its transactional data into a single data model that is better suited to generating the kinds of reports the company requires. However, this takes time, and as the chain grows, it can take more than a day to process one day's worth of data. That is a big problem.
Your Web application may not need to process that much data, but any site can have operations that take longer than the customer is willing to wait. In general, customers are willing to wait about 200 milliseconds; beyond that, they feel the process is "slow." This number is based on desktop applications, and the Web has made us more patient. Even so, customers should not have to wait longer than a few seconds. Fortunately, there are several strategies for handling batch jobs in PHP.
Keeping it simple with cron
On UNIX® machines, the core program that executes batch processes is the cron daemon. The daemon reads a configuration file that tells it which command lines to run and how often to run them, and then executes them as configured. When a command produces error output, cron can even send it to a specified e-mail address to help debug the problem.
I know that some engineers strongly advocate threading instead: "Threads! Threads are the real way to do background processing. The cron daemon is too outdated."
I don't think so.
I have used both methods, and I think cron has the advantage of following the "Keep It Simple, Stupid" (KISS) principle. It keeps background processing simple. Instead of writing a multithreaded job-processing application that runs forever (and therefore must never leak memory), you let cron start a simple batch script. The script determines whether there is a job to process, executes the job, and then exits. There is no need to worry about memory leaks, or about threads stalling or falling into an infinite loop.
So, how does cron work? That depends on your system environment. I cover only the old-fashioned, simple UNIX command-line version of cron; you can ask your system administrator how to use it with your own Web application.
Here's a simple cron configuration that runs a PHP script every night at 11 o'clock:
0 23 * * * jack /usr/bin/php /users/home/jack/myscript.php
The first five fields define when the script should start. Next comes the user name under which the script should run. The rest of the line is the command line to execute. The time fields are minute, hour, day of the month, month, and day of the week, respectively. Here are a few examples.
Command:
15 * * * * jack /usr/bin/php /users/home/jack/myscript.php
Run the script at the 15th minute of every hour.
Command:
15,45 * * * * jack /usr/bin/php /users/home/jack/myscript.php
Run the script at the 15th and 45th minutes of each hour.
Command:
*/1 3-23 * * * jack /usr/bin/php /users/home/jack/myscript.php
Run the script every minute from 3 a.m. through the end of the 11 p.m. hour (hours 3 to 23).
Command:
30 23 * * 6 jack /usr/bin/php /users/home/jack/myscript.php
Run the script at 11:30 every Saturday night (Saturday is specified by 6).
As you can see, the number of combinations is practically unlimited. You can control exactly when the script runs. You can also schedule multiple scripts, so that some run every minute while others (such as backup scripts) run only once a day.
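To make the field rules concrete, here is a minimal PHP sketch that checks whether a crontab line of the form used above would fire at a given time. The function names cron_field_matches() and cron_line_matches() are hypothetical helpers written for this illustration; they are not part of cron or PHP. The sketch handles only the field forms used in the examples (a number, a comma list, a range, * and */n), and it simply requires all five fields to match, which is a simplification of how real cron combines the two day fields.

```php
<?php
// Hypothetical helper, not part of cron or PHP: does one crontab time field
// ("*", "15", "15,45", "3-23", "*/1") match a concrete value?
// $min is the smallest legal value for the field (0 for minutes, 1 for months).
function cron_field_matches(string $field, int $value, int $min = 0): bool
{
    foreach (explode(',', $field) as $part) {
        $step = 1;
        if (strpos($part, '/') !== false) {
            list($part, $stepText) = explode('/', $part, 2);
            $step = max(1, (int) $stepText);
        }
        if ($part === '*') {
            $low = $min;
            $high = PHP_INT_MAX;
        } elseif (strpos($part, '-') !== false) {
            list($low, $high) = array_map('intval', explode('-', $part, 2));
        } else {
            $low = $high = (int) $part;
        }
        if ($value >= $low && $value <= $high && ($value - $low) % $step === 0) {
            return true;
        }
    }
    return false;
}

// Hypothetical helper: would this system-crontab line run at time $when?
// Simplification: every one of the five time fields must match.
function cron_line_matches(string $line, DateTimeInterface $when): bool
{
    // Fields: minute, hour, day of month, month, day of week, user, command.
    $fields = preg_split('/\s+/', trim($line), 7);
    if (count($fields) < 7) {
        return false; // not a complete line in the format shown above
    }
    $actual = array(
        (int) $when->format('i'), // minute
        (int) $when->format('G'), // hour, 0-23
        (int) $when->format('j'), // day of month
        (int) $when->format('n'), // month
        (int) $when->format('w'), // day of week, 0 = Sunday
    );
    $minimums = array(0, 0, 1, 1, 0);
    for ($i = 0; $i < 5; $i++) {
        if (!cron_field_matches($fields[$i], $actual[$i], $minimums[$i])) {
            return false;
        }
    }
    return true;
}
```

For instance, cron_line_matches('30 23 * * 6 jack /usr/bin/php /users/home/jack/myscript.php', $when) is true only when $when falls on a Saturday at 11:30 p.m.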
To specify which e-mail address error output should be sent to, use the MAILTO directive as follows:
MAILTO=jherr@pobox.com
Note: For Microsoft® Windows® users, there is an equivalent Scheduled Tasks system that can be used to start command-line processes (such as PHP scripts) on a regular schedule.
Basics of Batch Architecture
Batch processing is fairly straightforward. In most cases, one of two workflows is used. The first is for reporting: a script runs once a day, generates the reports, and sends them to a group of users. The second is a batch job created in response to some kind of request. For example, I log in to the Web application and ask it to send a message to every user registered in the system announcing a new feature. This operation must be batched because there are 10,000 users in the system. PHP takes a while to complete such a task, so it must be performed by a job outside of the browser.
In the second workflow, the Web application simply places the information in a location that it shares with the batch application. This information specifies the nature of the job (for example, "Send this e-mail to all the people on the system"). The batch program runs the job and then deletes it. Alternatively, the handler marks the job as completed. Either way, the job must be recognized as completed so that it does not run again.
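The "mark as completed" alternative can be sketched in a few lines of PHP. This is only an illustration under stated assumptions: JobList is a hypothetical in-memory class, not something from the article's listings, and a real system would keep the flag in shared storage such as a database column.

```php
<?php
// Hypothetical in-memory job list, used only to illustrate the idea of
// marking a job as done instead of deleting it.
class JobList
{
    private $jobs = array();

    // Queue a job and return its id.
    public function add($description)
    {
        $this->jobs[] = array('description' => $description, 'done' => false);
        return count($this->jobs) - 1;
    }

    // Return only the jobs that have not been marked as completed.
    public function pending()
    {
        $pending = array();
        foreach ($this->jobs as $id => $job) {
            if (!$job['done']) {
                $pending[$id] = $job['description'];
            }
        }
        return $pending;
    }

    // Record that a job has run so that it is never picked up again.
    public function markDone($id)
    {
        $this->jobs[$id]['done'] = true;
    }
}
```

Keeping completed jobs around costs some storage, but it leaves a record of what ran, which deleting the job does not.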
The remainder of this article demonstrates various ways to share data between the front-end of the Web application and the batch backend.
Mail queue
The first approach is to use a dedicated mail queuing system. In this model, a table in the database contains the e-mail messages that should be sent to individual users. The Web interface uses the Mailouts class to add e-mail messages to the queue. The e-mail handler uses the Mailouts class to retrieve the unsent messages, and then uses it again to remove each message from the queue once it has been sent.
This model first requires a MySQL schema.
Listing 1. Mailout.sql
DROP TABLE IF EXISTS mailouts;
CREATE TABLE mailouts (
  id MEDIUMINT NOT NULL AUTO_INCREMENT,
  from_address TEXT NOT NULL,
  to_address TEXT NOT NULL,
  subject TEXT NOT NULL,
  content TEXT NOT NULL,
  PRIMARY KEY ( id )
);
This schema is very simple. Each row holds a from address and a to address, as well as the subject and content of the e-mail.
The mailouts table in the database is handled by the PHP Mailouts class.
Listing 2. mailouts.php
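<?php
require_once('DB.php');

class Mailouts
{
  // Connect through PEAR::DB. The DSN is a placeholder: substitute your own
  // user, password, and database name.
  public static function get_db()
  {
    $dsn = 'mysql://USER:PASSWORD@localhost/DATABASE';
    $db = DB::connect( $dsn, array() );
    if ( PEAR::isError( $db ) ) { die( $db->getMessage() ); }
    return $db;
  }

  // Remove one message from the queue.
  public static function delete( $id )
  {
    $db = Mailouts::get_db();
    $sth = $db->prepare( 'DELETE FROM mailouts WHERE id=?' );
    $db->execute( $sth, $id );
    return true;
  }

  // Add one message to the queue (used by the front end).
  public static function add( $from, $to, $subject, $content )
  {
    $db = Mailouts::get_db();
    $sth = $db->prepare( 'INSERT INTO mailouts VALUES (null,?,?,?,?)' );
    $db->execute( $sth, array( $from, $to, $subject, $content ) );
    return true;
  }

  // Return every queued message as a numerically indexed row.
  public static function get_all()
  {
    $db = Mailouts::get_db();
    $res = $db->query( "SELECT * FROM mailouts" );
    $rows = array();
    while( $res->fetchInto( $row ) ) { $rows []= $row; }
    return $rows;
  }
}
?>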
This script includes the PEAR::DB database access class. It then defines the Mailouts class, which contains three main static functions: add(), delete(), and get_all(). The add() method adds an e-mail message to the queue and is used by the front end. The get_all() method returns all the data in the table. The delete() method removes a single e-mail message.
You might ask why I don't just call a delete_all() method at the end of the script. There are two reasons. First, if each message is deleted right after it is sent, it cannot be sent twice, even if the script is rerun after a problem occurs. Second, new messages may be added between the start and the finish of the batch job.
The next step is to write a simple test script that adds an entry to the queue.
Listing 3. mailout_test_add.php

In this example, I add a mailout addressed to Molly at some company, with the subject "Test Subject" and an e-mail body. You can run this script on the command line: php mailout_test_add.php.
In order to send an e-mail message, another script is required, which acts as a job handler.
Listing 4. mailout_send.php

This script uses the get_all() method to retrieve all the e-mail messages, and then uses PHP's mail() function to send the messages one at a time. After each e-mail message is sent successfully, the delete() method is called to remove the corresponding record from the queue.
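The add-then-send flow just described can be sketched so that it runs without MySQL or a mail server. Everything here is an assumption made for illustration: InMemoryMailouts is a hypothetical stand-in that mimics the add(), get_all(), and delete() interface with an array, send_pending() is a made-up function name, and the example.com addresses are placeholders. The row layout (id, from, to, subject, content) follows the column order of the mailouts table.

```php
<?php
// Hypothetical in-memory stand-in for the mailouts table, so that the
// sketch runs without a database.
class InMemoryMailouts
{
    private static $rows = array();
    private static $nextId = 1;

    public static function add($from, $to, $subject, $content)
    {
        $id = self::$nextId++;
        // Same column order as the table: id, from, to, subject, content.
        self::$rows[$id] = array($id, $from, $to, $subject, $content);
        return true;
    }

    public static function get_all()
    {
        return array_values(self::$rows);
    }

    public static function delete($id)
    {
        unset(self::$rows[$id]);
        return true;
    }
}

// Send every queued message. $send is any callable that accepts the same
// first four parameters as PHP's mail() and returns true on success.
// A message is deleted only after it was sent, so failures stay queued.
function send_pending($send)
{
    $sent = 0;
    foreach (InMemoryMailouts::get_all() as $row) {
        list($id, $from, $to, $subject, $content) = $row;
        if (call_user_func($send, $to, $subject, $content, "From: $from")) {
            InMemoryMailouts::delete($id);
            $sent++;
        }
    }
    return $sent;
}
```

In a real handler you would presumably call send_pending('mail') against the database-backed class; passing the sender in as a callable is just a convenience that makes the loop easy to try out with a fake sender.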
Run this script periodically using the cron daemon. The frequency of running this script depends on the needs of your application.
Note: The PHP Extension and Application Repository (PEAR) contains an excellent mail queuing system implementation that can be downloaded for free.
A more general approach
The solution dedicated to sending e-mail is good, but is there a more general approach? We need to be able to send e-mails, generate reports, or perform other time-consuming processing without having to wait for processing to complete in the browser.
To do this, you can take advantage of the fact that PHP is an interpreted language. You can store your PHP code in a queue in the database and then execute it later. This requires two tables, as shown in Listing 5.
Listing 5. Generic.sql
DROP TABLE IF EXISTS processing_items;
CREATE TABLE processing_items (
  id MEDIUMINT NOT NULL AUTO_INCREMENT,
  -- Quoted with backticks because FUNCTION is a reserved word in newer MySQL releases.
  `function` TEXT NOT NULL,
  PRIMARY KEY ( id )
);

DROP TABLE IF EXISTS processing_args;
CREATE TABLE processing_args (
  id MEDIUMINT NOT NULL AUTO_INCREMENT,
  item_id MEDIUMINT NOT NULL,
  key_name TEXT NOT NULL,
  value TEXT NOT NULL,
  PRIMARY KEY ( id )
);
The first table, processing_items, contains the functions that the job handler calls. The second table, processing_args, contains the arguments to pass to each function, stored as a hash table of key/value pairs.
As with the mailouts table, these two tables are wrapped by a PHP class, which is called ProcessingItems.
Listing 6. generic.php
<?php
require_once('DB.php');

class ProcessingItems
{
  // Connect through PEAR::DB. The DSN is a placeholder: substitute your own
  // user, password, and database name.
  public static function get_db()
  {
    $dsn = 'mysql://USER:PASSWORD@localhost/DATABASE';
    $db = DB::connect( $dsn, array() );
    if ( PEAR::isError( $db ) ) { die( $db->getMessage() ); }
    return $db;
  }

  // Remove a job and all of its arguments.
  public static function delete( $id )
  {
    $db = ProcessingItems::get_db();
    $sth = $db->prepare( 'DELETE FROM processing_args WHERE item_id=?' );
    $db->execute( $sth, $id );
    $sth = $db->prepare( 'DELETE FROM processing_items WHERE id=?' );
    $db->execute( $sth, $id );
    return true;
  }

  // Queue a call to $function with the key/value pairs in $args.
  public static function add( $function, $args )
  {
    $db = ProcessingItems::get_db();
    $sth = $db->prepare( 'INSERT INTO processing_items VALUES (null,?)' );
    $db->execute( $sth, array( $function ) );

    // Look up the id of the row just inserted.
    $res = $db->query( "SELECT last_insert_id()" );
    $id = null;
    while( $res->fetchInto( $row ) ) { $id = $row[0]; }

    foreach( $args as $key => $value )
    {
      $sth = $db->prepare( 'INSERT INTO processing_args VALUES (null,?,?,?)' );
      $db->execute( $sth, array( $id, $key, $value ) );
    }
    return true;
  }

  // Return every queued job together with its arguments.
  public static function get_all()
  {
    $db = ProcessingItems::get_db();
    $res = $db->query( "SELECT * FROM processing_items" );
    $rows = array();
    while( $res->fetchInto( $row ) )
    {
      $item = array();
      $item['id'] = $row[0];
      $item['function'] = $row[1];
      $item['args'] = array();

      $ares = $db->query( "SELECT key_name, value FROM processing_args WHERE item_id=?", $item['id'] );
      while( $ares->fetchInto( $arow ) )
        $item['args'][ $arow[0] ] = $arow[1];

      $rows []= $item;
    }
    return $rows;
  }
}
?>
This class contains three important methods: add(), get_all(), and delete(). As in the mailouts system, the front end uses add(), and the processing engine uses get_all() and delete().
The test script shown in Listing 7 adds an entry to the processing queue.
Listing 7. generic_test_add.php
<?php
require_once 'generic.php';
ProcessingItems::add( 'printvalue', array( 'value' => 'foo' ) );
?>
In this example, a call to the printvalue function is added, with the value argument set to foo. I ran the script using the PHP command-line interpreter, which put the function call into the queue. The following processing script then runs the function.
Listing 8. generic_process.php

This script is very simple. It takes the processing entries returned by get_all(), and then uses call_user_func_array() (a built-in PHP function) to invoke each function dynamically with the given arguments. In this example, the local printvalue function is called.
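The dispatch step described above can be sketched as follows. This is a minimal illustration rather than the original listing: the jobs are passed in as a plain array shaped like the entries that get_all() returns, process_items() is a hypothetical function name, and the deletion step is left out because the sketch has no database behind it.

```php
<?php
// The function a queued job names. It receives the job's arguments as one hash.
function printvalue($args)
{
    printf("Printing: %s\n", $args['value']);
}

// Run every job in $items and return the ids of the jobs that were run.
// Each item is expected to have 'id', 'function', and 'args' keys.
function process_items(array $items)
{
    $done = array();
    foreach ($items as $item) {
        // The whole args hash is passed as the single positional argument.
        call_user_func_array($item['function'], array($item['args']));
        $done[] = $item['id'];
    }
    return $done;
}

process_items(array(
    array('id' => 1, 'function' => 'printvalue', 'args' => array('value' => 'foo')),
));
```

In the database-backed version, each id returned here is what you would hand to delete() once the job has run.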
To demonstrate this functionality, let's look at what happened on the command line:
% php generic_test_add.php
% php generic_process.php
Printing: foo
%
The output is not much, but you can see the point. With this mechanism, you can defer the processing of any PHP function.
Now, if you don't like putting PHP function names and arguments into a database, the alternative is to create a mapping in the PHP code between a "processing job type" name and the actual PHP processing function. That way, if you later decide to change the PHP back end, the system still works as long as the "processing job type" strings match.
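Such a mapping might look like the sketch below. The job type names, the handler functions, and run_job() itself are all hypothetical and exist only to show the shape of the idea: the queue stores a short type string, and only the PHP code knows which function that string stands for.

```php
<?php
// Hypothetical handlers. Each returns a string so the effect is easy to see.
function send_newsletter(array $args)
{
    return 'newsletter to ' . $args['list'];
}

function build_report(array $args)
{
    return 'report for ' . $args['day'];
}

// Translate a stored job type into a call to the matching PHP function.
// Unknown types are rejected instead of being executed.
function run_job($type, array $args)
{
    static $handlers = array(
        'newsletter'   => 'send_newsletter',
        'daily_report' => 'build_report',
    );
    if (!isset($handlers[$type])) {
        throw new InvalidArgumentException("Unknown job type: $type");
    }
    return call_user_func($handlers[$type], $args);
}
```

A side benefit of this design is that the queue can no longer name arbitrary PHP functions; only the types listed in the map can run.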
Doing without the database
Finally, I show a slightly different solution that stores batch jobs as files in a directory instead of using a database. I am not suggesting that you should use this rather than a database; it is simply another option, and whether to adopt it is up to you.
Obviously, there is no schema in this solution because no database is involved. So I write a class that contains add(), get_all(), and delete() methods similar to those in the previous examples.
Listing 9. batch_by_file.php
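<?php
define( 'BATCH_DIRECTORY', 'batch_items/' );

class BatchFiles
{
  // A job's id is the path of its file, so deleting a job removes the file.
  public static function delete( $id )
  {
    unlink( $id );
    return true;
  }

  // Write one job file: the function name on the first line, followed by
  // one key:value line per argument.
  public static function add( $function, $args )
  {
    // The file naming scheme here is an assumption; any unique name works.
    $path = BATCH_DIRECTORY.uniqid( 'job_', true );

    $fh = fopen( $path, 'w' );
    fwrite( $fh, $function."\n" );
    foreach( $args as $k => $v )
    {
      fwrite( $fh, $k.":".$v."\n" );
    }
    fclose( $fh );
    return true;
  }

  // Read every job file in the batch directory.
  public static function get_all()
  {
    $rows = array();
    if ( is_dir( BATCH_DIRECTORY ) ) {
      if ( $dh = opendir( BATCH_DIRECTORY ) ) {
        while ( ( $file = readdir( $dh ) ) !== false ) {
          $path = BATCH_DIRECTORY.$file;
          if ( is_dir( $path ) == false ) {
            $item = array();
            $item['id'] = $path;
            $fh = fopen( $path, 'r' );
            if ( $fh ) {
              $item['function'] = trim( fgets( $fh ) );
              $item['args'] = array();
              // fgets() returns false at end of file.
              while ( ( $line = fgets( $fh ) ) !== false ) {
                $args = explode( ':', trim( $line ), 2 );
                $item['args'][ $args[0] ] = $args[1];
              }
              $rows []= $item;
              fclose( $fh );
            }
          }
        }
        closedir( $dh );
      }
    }
    return $rows;
  }
}
?>
The BatchFiles class has three main methods: add(), get_all(), and delete(). Instead of accessing a database, the class reads and writes files in the batch_items directory.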
Use the following test code to add a new batch entry.
Listing 10. batch_by_file_test_add.php
<?php
require_once 'batch_by_file.php';
BatchFiles::add( 'printvalue', array( 'value' => 'foo' ) );
?>
One thing to note: apart from the class name (BatchFiles), there is virtually no indication of how the job is stored. It would therefore be easy to switch to database-backed storage without changing the interface.
Finally, here is the code for the processor.
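One way to make that storage independence explicit is a shared PHP interface, sketched below. JobStore, ArrayJobStore, and queue_test_job() are hypothetical names introduced for this illustration, and the sketch uses instance methods, whereas the classes in this article use static methods.

```php
<?php
// Hypothetical interface capturing the three operations used throughout.
interface JobStore
{
    public function add($function, array $args);
    public function get_all();
    public function delete($id);
}

// One possible implementation that keeps jobs in memory. A database-backed
// or file-backed class could implement the same interface.
class ArrayJobStore implements JobStore
{
    private $items = array();
    private $nextId = 1;

    public function add($function, array $args)
    {
        $id = $this->nextId++;
        $this->items[$id] = array('id' => $id, 'function' => $function, 'args' => $args);
        return true;
    }

    public function get_all()
    {
        return array_values($this->items);
    }

    public function delete($id)
    {
        unset($this->items[$id]);
        return true;
    }
}

// Calling code depends only on the interface, not on how jobs are stored.
function queue_test_job(JobStore $store)
{
    return $store->add('printvalue', array('value' => 'foo'));
}
```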
Listing 11. batch_by_file_processor.php

This code is almost identical to the database version, except for the changed file name and class name.
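A self-contained sketch of a file-based processor is shown below. It assumes the job file format used in this section (the function name on the first line, then one key:value line per argument); read_job_file() and process_directory() are hypothetical helper names, and the directory is passed in as a parameter rather than taken from a constant.

```php
<?php
// The function that queued jobs name in this sketch.
function printvalue($args)
{
    printf("Printing: %s\n", $args['value']);
}

// Parse one job file. Returns null when the file is unreadable or empty.
function read_job_file($path)
{
    $lines = file($path, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
    if ($lines === false || count($lines) === 0) {
        return null;
    }
    $item = array(
        'id'       => $path,
        'function' => trim(array_shift($lines)),
        'args'     => array(),
    );
    foreach ($lines as $line) {
        $pair = explode(':', $line, 2);
        if (count($pair) === 2) {
            $item['args'][$pair[0]] = $pair[1];
        }
    }
    return $item;
}

// Run every job file found in $dir, delete each file once its job has run,
// and return the number of jobs processed.
function process_directory($dir)
{
    $count = 0;
    $paths = glob(rtrim($dir, '/') . '/*');
    foreach ($paths ? $paths : array() as $path) {
        if (is_dir($path)) {
            continue;
        }
        $item = read_job_file($path);
        if ($item === null) {
            continue; // leave unreadable files in place for inspection
        }
        call_user_func_array($item['function'], array($item['args']));
        unlink($path);
        $count++;
    }
    return $count;
}
```

Run from cron, a script like this would end with a call such as process_directory('batch_items').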
Conclusion
As mentioned earlier, there is plenty of support for using threads to do batch processing in the background. In some cases, it is certainly easier to use worker threads to handle small jobs. However, you can also use traditional tools (cron, MySQL, standard object-oriented PHP, and PEAR::DB) to create batch jobs in a PHP application that are easy to implement, deploy, and maintain.
Resources
Learn
The original English version of this article is available on the developerWorks global site.
Read more about PHP in the IBM developerWorks PHP Project Resource Center.
PHP.net is a great resource for PHP developers.
The PEAR Mail_Queue package is a robust mail queue implementation that includes a database back end.
The crontab manual provides details of cron configuration, although it is not easy reading.
The section on using PHP from the command line in the PHP manual can help you learn how to run scripts from cron.
Access to products and technologies
Check out PEAR (the PHP Extension and Application Repository), which contains PEAR::DB.
About the author
Jack D. Herrington is a senior software engineer with more than 20 years of experience. He has authored three books (Code Generation in Action, Podcasting Hacks, and PHP Hacks) and more than 30 articles.
