Regular expressions: using PHP to match a p tag that contains specific Chinese text

Source: Internet
Author: User

第一章 什么什么

I would like to use PHP to match a p tag together with its text content.

Some details about the situation:
1. The p tag may contain carriage returns and spaces;
2. The Chinese characters vary: the "一" ("one") will change, and the "什么什么" ("what") will also change.

Reply content:

Let me put it plainly. Leaving the theory aside, a few simple cases are enough to trip you up:

This is your text.

This is a valid paragraph too, since HTML

paragraph don't have to contain an explicit ending tag. < p id = "sample" > This is another paragraph.

In theory, a regular grammar cannot represent the nesting relationship between tags. In formal language theory, regular grammars are a strict subset of context-free grammars, and describing nested HTML tags takes at least a context-free grammar. In other words, regular expressions are not expressive enough, in theory, to capture the syntactic structure of HTML. For background, see the material covered in the courses "Compiler Principles" and "Finite Automata and Formal Languages".

In practice, regular expressions also cannot express (or can express only with great difficulty):

    • Spaces and line breaks inside a tag

    • Attributes on the tag

    • Closing tags that are not written explicitly

    • The effect of comments and scripts

This question comes up again and again: do not parse HTML with regular expressions; use a proper parser. In some cases, for one particular simple use case, a regular expression is good enough. But do not write very complex regular expressions, and do not try to match HTML with a regex that is "bug-free and works everywhere", because sooner or later it will fail.
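As a rough illustration of the cases listed above, the sketch below runs one deliberately naive pattern against a few inputs. The inputs and the pattern are assumptions made up for this example, not taken from the original question.

```php
<?php
// Hypothetical inputs modeled on the cases listed above.
$cases = [
    'plain'      => '<p>This is your text.</p>',
    'attribute'  => '<p id="sample">This is another paragraph.</p>',
    'line break' => "<p>\nThis is your text.\n</p>",
    'unclosed'   => '<p>This is a valid paragraph too',
];

// A deliberately naive pattern of the kind the answer warns against.
$naive = '/<p>(.*)<\/p>/';

// Record, per case, whether the naive pattern finds a match.
$results = [];
foreach ($cases as $name => $html) {
    $results[$name] = preg_match($naive, $html) === 1;
}

var_dump($results);
```

Only the plain case should match; each of the other inputs would need its own patch to the pattern, which is how such regexes grow out of control.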

HTML parsing in PHP can be done with PHP's native DOM extension (which may need to be installed in some server environments) or with a third-party HTML parsing library.
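A minimal sketch of the DOM approach, assuming the DOM extension is available. The sample markup and the helper name `findChapterParagraphs` are hypothetical; the regular expression is applied only to the already extracted text, not to the HTML itself.

```php
<?php
// Sample input is an assumption, modeled on the question.
$html = "<div><p>\n  第一章 什么什么\n</p><p id=\"sample\">Not a chapter heading</p></div>";

/**
 * Hypothetical helper: return the trimmed text of every <p> element whose
 * text starts with a "第…章" chapter marker.
 */
function findChapterParagraphs(string $html): array
{
    $doc = new DOMDocument();
    // Collect libxml warnings about imperfect markup instead of printing them.
    libxml_use_internal_errors(true);
    // The XML prolog is a common trick to make libxml read the input as UTF-8.
    $doc->loadHTML('<?xml encoding="UTF-8">' . $html);
    libxml_clear_errors();

    $matches = [];
    foreach ($doc->getElementsByTagName('p') as $p) {
        $text = trim($p->textContent);
        // The "u" modifier makes "." count UTF-8 characters, not bytes.
        if (preg_match('/^第.{1,8}章\s+\S/u', $text)) {
            $matches[] = $text;
        }
    }
    return $matches;
}

print_r(findChapterParagraphs($html));
```

Because the parser deals with attributes, whitespace, and omitted closing tags, the pattern only has to describe the chapter text.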

I now have 1 character-parsing problem.

You could try a regular expression.

Great... now I have 2 problems.

If a regular expression is awkward to write, matching with strpos is simpler.
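One way the strpos suggestion could look, as a sketch only: it assumes a bare `<p>` with no attributes, and the helper name `firstParagraphText` is hypothetical.

```php
<?php
/**
 * Hypothetical helper: return the trimmed text between the first "<p>" and
 * the next "</p>", or null if either delimiter is missing.
 * Assumes a bare "<p>" with no attributes.
 */
function firstParagraphText(string $html): ?string
{
    $open = strpos($html, '<p>');
    if ($open === false) {
        return null;
    }
    $start = $open + strlen('<p>');
    $close = strpos($html, '</p>', $start);
    if ($close === false) {
        return null;
    }
    // strpos and substr work on bytes; that is safe here because the
    // offsets come from ASCII delimiters.
    return trim(substr($html, $start, $close - $start));
}

$text = firstParagraphText("<p>\n 第一章 什么什么 \n</p>");

// Treat the text as a chapter heading if it starts with "第" and has a "章" later on.
$isChapter = $text !== null
    && strpos($text, '第') === 0
    && strpos($text, '章') !== false;

var_dump($text, $isChapter);
```

This stays readable for a single fixed layout, but it shares the limits described earlier: attributes or an omitted closing tag make it return null.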

I am only going by the question as asked, but this is how I would do it:

 // "随意" is an arbitrary replacement prefix; $1 is the captured chapter text.
 $s = preg_replace('/<p>.*(第.{0,8}章\s+[^<]*).*<\/p>/s', "随意$1", $s);

Presumably, the key is the s modifier, which lets "." match line breaks as well; the Chinese characters in the question are not the problem.
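The effect of the s modifier can be checked with a small sketch. It assumes the pattern starts with a literal `<p>` and uses a made-up multi-line input; the `u` modifier is an addition here so that `.` counts UTF-8 characters rather than bytes.

```php
<?php
// Sample input is an assumption: a <p> whose content spans several lines.
$s = "<p>\n  第一章 什么什么\n</p>";

// Same pattern twice; only the "s" modifier differs.
$withoutS = preg_match('/<p>.*(第.{0,8}章\s+[^<]*).*<\/p>/u', $s);
$withS    = preg_match('/<p>.*(第.{0,8}章\s+[^<]*).*<\/p>/su', $s, $m);

// Without "s" the dot stops at the first line break, so nothing matches.
var_dump($withoutS, $withS, trim($m[1]));
```

Without s, the match fails at the line break right after `<p>`; with s, the capture group holds the chapter text plus trailing whitespace, which is why it is trimmed.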
