
A short note on processing data collected with preg_match_all (encoding conversion and regular-expression matching) - PHP tutorial

Last Update:2016-07-13 Source: Internet

Author: User


1. Use cURL to fetch the remote page

For details, see my previous note: http://www.jb51.net/article/46432.htm
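Since the earlier note is not reproduced here, the following is only a minimal sketch of what fetching a page with PHP's cURL extension might look like. The helper name fetch_page, the timeout value, and the example URL are assumptions for illustration, not taken from that note.

```php
<?php
// Hypothetical helper: fetch a remote page and return its body as a string.
function fetch_page(string $url, int $timeout = 10): string
{
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true,     // return the body instead of printing it
        CURLOPT_FOLLOWLOCATION => true,     // follow HTTP redirects
        CURLOPT_TIMEOUT        => $timeout, // give up after this many seconds
    ]);
    $body = curl_exec($ch);
    if ($body === false) {
        throw new RuntimeException('cURL request failed: ' . curl_error($ch));
    }
    return $body;
}

// Usage (placeholder URL, not from the original article):
// $contents = fetch_page('http://example.com/');
```

The returned string plays the role of $str / $contents in the steps that follow.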

2. Encoding conversion
First, view the page source to find out which encoding the site uses, then transcode with the mb_convert_encoding function.

Usage:

The code is as follows:
// The source string is $str

// The source encoding is known to be GBK; convert it to UTF-8
$str = mb_convert_encoding($str, "UTF-8", "GBK");

// The source encoding is unknown; let "auto" detect it, then convert to UTF-8
$str = mb_convert_encoding($str, "UTF-8", "auto");
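The list of encodings that "auto" expands to depends on the mbstring configuration and may not include GBK, so detection can be unreliable for Chinese pages. As a hedged alternative, the sketch below detects the encoding from an explicit candidate list before converting. The helper name to_utf8 and the candidate list are assumptions for illustration.

```php
<?php
// Hypothetical helper: convert $str to UTF-8, choosing the source
// encoding from a caller-supplied list of candidates.
function to_utf8(string $str, array $candidates = ['UTF-8', 'GBK']): string
{
    // Strict mode (third argument) rejects encodings in which $str is invalid.
    $from = mb_detect_encoding($str, $candidates, true);
    if ($from === false) {
        throw new InvalidArgumentException('Could not detect the source encoding');
    }
    // Nothing to do if the text is already UTF-8.
    return $from === 'UTF-8' ? $str : mb_convert_encoding($str, 'UTF-8', $from);
}

// Simulate a GBK page by converting two Chinese characters from UTF-8 to GBK.
$original = "\u{4E2D}\u{6587}";
$gbk      = mb_convert_encoding($original, 'GBK', 'UTF-8');
$utf8     = to_utf8($gbk);
var_dump($utf8 === $original);
```

Listing UTF-8 first means text that is already valid UTF-8 is passed through untouched; only text that fails the UTF-8 check is treated as GBK.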

3. To keep line breaks and spaces from getting in the way of matching, first strip line breaks, spaces, and tabs from the collected source

The code is as follows:
// Method 1: replace with str_replace
$contents = str_replace("\r\n", "", $contents); // remove Windows line breaks
$contents = str_replace("\n", "", $contents); // remove line breaks
$contents = str_replace("\t", "", $contents); // remove tabs
$contents = str_replace(" ", "", $contents); // remove spaces

// Method 2: replace with a regular expression
$contents = preg_replace("/[\r\n\t ]+/", "", $contents);
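Removing every space also removes the spaces between words and between a tag name and its attributes. The sketch below runs the regular-expression method on a small sample and, as an assumption that goes beyond the original note, shows a gentler variant that collapses each run of whitespace to a single space. The sample string is invented for illustration.

```php
<?php
// Invented sample: a paragraph containing CRLF, a tab, and repeated spaces.
$contents = "<p>\r\n\tHello   world\n</p>";

// The approach from the text: delete every run of CR, LF, tab, or space.
$stripped = preg_replace('/[\r\n\t ]+/', '', $contents);

// Gentler variant (an assumption, not from the original note):
// collapse each run of whitespace into one space so words stay separated.
$collapsed = preg_replace('/\s+/', ' ', $contents);

var_dump($stripped);
var_dump($collapsed);
```

Which variant fits depends on the pattern used later: the first makes patterns simpler, the second keeps the extracted text readable.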

4. Use a regular expression to locate the snippet you want, and run the match with preg_match_all

The function is explained below:
int preg_match_all(string $pattern, string $subject, array &$matches [, int $flags])
$pattern is the regular expression
$subject is the text to be searched
$matches is the array that receives the results
$flags sets how the results are arranged. In the descriptions below, "boundary" means the literal text in the pattern that surrounds the parenthesized group. The flags include:
PREG_PATTERN_ORDER (the default): the result is a two-dimensional array. $arr1[0] is the array of full matches, boundary included; $arr1[1] is the array of strings matched by the first group, boundary removed.
PREG_SET_ORDER: the result is a two-dimensional array. $arr2[0][0] is the first full match, boundary included; $arr2[0][1] is the first match with the boundary removed, and so on.
PREG_OFFSET_CAPTURE: the result is a three-dimensional array. $arr3[0][0][0] is the first full match, boundary included, and $arr3[0][0][1] is the byte offset at which that match starts (the offset of the boundary itself); $arr3[1][0][0] is the first match with the boundary removed, and $arr3[1][0][1] is the byte offset at which it starts (after the leading boundary), and so on.
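To make the three layouts concrete, the sketch below runs the same pattern over a small invented subject with each flag. The subject string and the variable names $arr1, $arr2, $arr3 are chosen to mirror the descriptions above and are otherwise illustrative.

```php
<?php
// Invented subject with two matches; <b> and </b> play the role of the boundary.
$subject = '<b>one</b><b>two</b>';
$pattern = '/<b>(.*?)<\/b>/';

// Grouped by pattern part: [0] holds full matches, [1] holds group 1.
preg_match_all($pattern, $subject, $arr1, PREG_PATTERN_ORDER);

// Grouped by match: each element holds one full match plus its groups.
preg_match_all($pattern, $subject, $arr2, PREG_SET_ORDER);

// Same layout as PREG_PATTERN_ORDER, but every entry is [string, byte offset].
preg_match_all($pattern, $subject, $arr3, PREG_OFFSET_CAPTURE);

print_r($arr1);
print_r($arr2);
print_r($arr3);
```

PREG_SET_ORDER tends to be the most convenient when each match is processed as one record, which is why the next step uses it.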

Practical application
preg_match_all('/<p>(.*?)<\/p>/', $contents, $out, PREG_SET_ORDER);
$out receives all the matched elements
$out[0][0] contains the full matched string, including <p> and </p>
$out[0][1] contains only the text matched by (.*?) in parentheses

By analogy, the nth match can be obtained as follows
$out[n-1][1]

If the regular expression contains several pairs of parentheses, the mth group within the nth match is obtained with
$out[n-1][m]
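The indexing rule $out[n-1][m] can be sketched with a pattern that has two groups. The sample HTML, the pattern, and the values of $n and $m are invented for illustration.

```php
<?php
// Invented sample: two paragraphs, each wrapping one link.
$contents = '<p><a href="/a">First</a></p><p><a href="/b">Second</a></p>';

// Group 1 captures the href value, group 2 captures the link text.
$pattern = '/<p><a href="(.*?)">(.*?)<\/a><\/p>/';
preg_match_all($pattern, $contents, $out, PREG_SET_ORDER);

$n = 2; // second match
$m = 1; // first group (the href value)

$href = $out[$n - 1][$m];
$text = $out[$n - 1][2];

var_dump(count($out), $href, $text);
```

If line breaks are not stripped beforehand, adding the s modifier (for example '/<p>(.*?)<\/p>/s') lets the dot match across lines.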

5. Once the text has been extracted, remove any HTML tags from it; PHP's built-in strip_tags function makes this easy

The code is as follows:
// Example
$result = strip_tags($out[0][1]);
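As a small illustration of strip_tags on a matched fragment, the sketch below removes all tags and then, using the function's optional second argument, keeps a chosen tag. The sample fragment is invented.

```php
<?php
// Invented fragment, as it might come out of a capture group.
$fragment = '<a href="/x">Hello <b>world</b></a>';

// Remove every tag and keep only the text.
$plain = strip_tags($fragment);

// Keep <b> tags and remove everything else.
$keepBold = strip_tags($fragment, '<b>');

var_dump($plain);
var_dump($keepBold);
```

strip_tags removes tags only; HTML entities such as &amp;amp; stay as they are and can be decoded separately with html_entity_decode if needed.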

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
