Scraping information from a web page (PHP regular expressions, PHP Excel operations)

Source: Internet
Author: User
Tags php excel

1. problem description

Extract the required information from a fixed web page and store it as a table. I practiced on a ranking list from wustoj. Address: wustoj

2. ideas

I had just learned a little PHP and wanted to use it for something practical. My approach is as follows:

(1) View the source code of the web page and save it to a file.

(2) Write regular expressions for the required information, read the file, and extract the information with those expressions. It is best to use groups when writing the regular expressions, because this makes extraction much easier.

(3) Use an Excel library to write the extracted information out as an Excel file.

A good open-source PHP library for Excel processing is PHPExcel, which the code in section 4 uses.
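Steps (1) and (2) can be sketched roughly as follows. This is only a minimal illustration, not the author's script: the function names savePageSource and extractGroup, the sample markup, and the pattern are all assumptions made for the example, and a small temporary file stands in for the real ranking page.

```php
<?php
// Step (1): copy the page source into a local file.
// $source may be a URL or a local path; file_get_contents() accepts both
// (reading a URL requires allow_url_fopen to be enabled).
function savePageSource(string $source, string $target): bool
{
    $html = file_get_contents($source);
    if ($html === false) {
        return false;
    }
    return file_put_contents($target, $html) !== false;
}

// Step (2): read the saved file line by line and collect one capture group.
function extractGroup(string $path, string $pattern, int $group): array
{
    $found = [];
    $file = fopen($path, 'r');
    if ($file === false) {
        return $found;
    }
    while (($line = fgets($file)) !== false) {
        if (preg_match($pattern, $line, $match)) {
            $found[] = $match[$group];
        }
    }
    fclose($file);
    return $found;
}

// Hypothetical usage: a temporary file plays the role of the web page.
$page = tempnam(sys_get_temp_dir(), 'page');
$copy = tempnam(sys_get_temp_dir(), 'copy');
file_put_contents(
    $page,
    "<a href=\"#\">team1_x</a>\n<p>other</p>\n<a href=\"#\">team22_y</a>\n"
);

savePageSource($page, $copy);

// Group 1 is the opening tag, group 2 the team id, group 3 the underscore.
$teams = extractGroup($copy, '/(<a[^>]*>)(team[0-9]+)(_)/', 2);
print_r($teams);

unlink($page);
unlink($copy);
```

Because each part of the pattern sits in its own group, the wanted piece is picked out by index and the surrounding markup does not need to be trimmed afterwards.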

3. experience

^ matches the start of the subject string, and $ matches its end.
Whitespace characters are not necessarily spaces.
Grouping with () is a good approach, for example preg_match_all("/$pattern/", $subject, $matches).
With preg_match_all, $matches is a two-dimensional array. Without _all (plain preg_match), only the first match is returned and $matches is a one-dimensional array.
$matches[0] holds every full-pattern match. $matches[1] holds the text captured by the first group in each match.
To match Chinese strings, use the byte range $patt_ch = chr(0x80) . "-" . chr(0xff).
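The notes above can be checked with a few small calls. This is a sketch with made-up subject strings. The byte range works without the u modifier because the pattern then operates on bytes; note that it accepts any non-ASCII byte, not only Chinese text.

```php
<?php
// ^ and $ anchor the pattern to the start and end of the subject.
$anchored = preg_match('/^team[0-9]+$/', 'team30');        // 1: whole string matches
$rejected = preg_match('/^team[0-9]+$/', 'xx team30 yy');  // 0: extra text around it

// Whitespace is not always a space: \s also covers tabs and newlines.
$spaceOnly = preg_match('/^ +$/', "\t \n");   // 0: the subject starts with a tab
$anyBlank  = preg_match('/^\s+$/', "\t \n");  // 1

// preg_match() versus preg_match_all() with groups.
$subject = '<a>team1_aa</a><a>team2_bb</a>';
$pattern = '/(team[0-9]+)(_)([a-z]+)/';

preg_match($pattern, $subject, $first);
// $first is one-dimensional and describes the first match only:
// ['team1_aa', 'team1', '_', 'aa']

preg_match_all($pattern, $subject, $all);
// $all is two-dimensional (default order PREG_PATTERN_ORDER):
// $all[0] = ['team1_aa', 'team2_bb']  full matches
// $all[1] = ['team1', 'team2']        first group of every match

// Byte range for Chinese text: every byte of a multi-byte UTF-8
// character is 0x80 or higher.
$patt_ch = chr(0x80) . '-' . chr(0xff);
$name = "team30_\xe4\xb8\xad\xe6\x96\x87";  // "team30_" followed by two Chinese characters in UTF-8
preg_match("/(team[0-9]+)(_)([$patt_ch]+)/", $name, $cn);
// $cn[1] is 'team30' and $cn[3] holds the six bytes of the Chinese part.
```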

4. code

<?php
// Byte range used to match Chinese characters (see section 3).
$patt_ch = chr(0x80) . "-" . chr(0xff);

// Example of the text being matched: team30_NAME
// The first group originally held the opening tag of the link; its content is not preserved here.
$namepatt = "()(\*{0,1}team[0-9]+)(_)([$patt_ch]+)(<\/a>)"; // groups 2 and 4 are used
// $namepatt = "(team[0-9]+)(_)([$patt_ch]+)"; // this also matches "team_name" directly

// Example of the text being matched: 7
$problempatt = "()([0-9]+)(<\/a>)";

// $rankpatt and $file are used below, but their definitions are not preserved in the source.
// Assumption: $rankpatt captures the rank in group 2, and $file is the saved page source.
$rankpatt = "()([0-9]+)(<\/td>)";   // hypothetical placeholder pattern
$file = fopen('rank.html', 'r');    // hypothetical file name

// Include classes
require_once('Classes/PHPExcel.php');
require_once('Classes/PHPExcel/Writer/Excel2007.php');

$objPHPExcel = new PHPExcel();

// Set the file properties
$objPHPExcel->getProperties()->setCreator("Maarten Balliauw");
$objPHPExcel->getProperties()->setLastModifiedBy("Maarten Balliauw");
$objPHPExcel->getProperties()->setTitle("Office 2007 XLSX Test Document");
$objPHPExcel->getProperties()->setSubject("Office 2007 XLSX Test Document");
$objPHPExcel->getProperties()->setDescription("Test document for Office 2007 XLSX, generated using PHP classes.");
$objPHPExcel->getProperties()->setKeywords("office 2007 openxml php");
$objPHPExcel->getProperties()->setCategory("Test result file");

// Header row
$row = 1;
$objPHPExcel->getActiveSheet()->setCellValue('A' . $row, 'rank');
$objPHPExcel->getActiveSheet()->setCellValue('B' . $row, 'team');
$objPHPExcel->getActiveSheet()->setCellValue('C' . $row, 'solved');

// Read the saved page line by line; a rank match starts a new row.
while (($line = fgets($file)) !== false) {
    if (preg_match("/$rankpatt/", $line, $match)) {
        $row++;
        // print_r($match);
        $objPHPExcel->getActiveSheet()->setCellValue('A' . $row, $match[2]);
        $objPHPExcel->getActiveSheet()->getStyle('A' . $row)->getAlignment()->setHorizontal(PHPExcel_Style_Alignment::HORIZONTAL_LEFT);
    }
    if (preg_match("/$namepatt/", $line, $match)) {
        // print_r($match);
        $objPHPExcel->getActiveSheet()->setCellValue('B' . $row, $match[2] . $match[4]);
    }
    if (preg_match("/$problempatt/", $line, $match)) {
        // print_r($match);
        $objPHPExcel->getActiveSheet()->setCellValue('C' . $row, $match[2]);
        $objPHPExcel->getActiveSheet()->getStyle('C' . $row)->getAlignment()->setHorizontal(PHPExcel_Style_Alignment::HORIZONTAL_LEFT);
    }
}
fclose($file);

// Write the workbook once, after all rows are filled, next to this script as .xlsx.
$objWriter = new PHPExcel_Writer_Excel2007($objPHPExcel);
$objWriter->save(str_replace('.php', '.xlsx', __FILE__));

echo "well done :)";
?>
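The row-assembly logic of the script can be tried on its own, without PHPExcel, by collecting the rows into an array first. The sketch below mirrors the loop above under stated assumptions: the function name extractRows, the sample markup, and all three patterns are hypothetical stand-ins for the real page, and the team name part is matched with [^<]+ so that the ASCII sample data works.

```php
<?php
// Hypothetical patterns; group 2 (and group 4 for the name) carry the data,
// following the same group layout as the script.
function extractRows(array $lines): array
{
    $rankpatt    = '/(<td class="rank">)([0-9]+)(<\/td>)/';
    $namepatt    = '/(<a[^>]*>)(\*?team[0-9]+)(_)([^<]+)(<\/a>)/';
    $problempatt = '/(<td class="solved">)([0-9]+)(<\/td>)/';

    $rows = [];
    $row = -1;
    foreach ($lines as $line) {
        // A rank match starts a new row, as in the script.
        if (preg_match($rankpatt, $line, $match)) {
            $row++;
            $rows[$row] = ['rank' => $match[2], 'team' => '', 'solved' => ''];
        }
        if ($row < 0) {
            continue; // nothing to fill before the first rank is seen
        }
        if (preg_match($namepatt, $line, $match)) {
            // The script joins groups 2 and 4, dropping the underscore.
            $rows[$row]['team'] = $match[2] . $match[4];
        }
        if (preg_match($problempatt, $line, $match)) {
            $rows[$row]['solved'] = $match[2];
        }
    }
    return $rows;
}

// Made-up sample lines standing in for the saved page source.
$lines = [
    '<td class="rank">1</td>',
    '<td><a href="#">team30_NAME</a></td>',
    '<td class="solved">7</td>',
    '<td class="rank">2</td>',
    '<td><a href="#">*team12_OTHER</a></td>',
    '<td class="solved">5</td>',
];

$rows = extractRows($lines);
print_r($rows);
```

Each entry of $rows could then be written to columns A, B, and C with setCellValue, which keeps the parsing testable separately from the Excel output.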


5. running result
