<p>第一章 什么什么</p>
I would like to use PHP to match the p tag and its text content.
Some details:
1. The p tag may contain line breaks and spaces.
2. The Chinese characters vary: the "一" ("one") in "第一章" ("Chapter One") changes, and so does the "什么什么" ("such-and-such") part.
Reply content:
Let me put it plainly. Leaving the theory aside, a few simple cases are enough to trip you up badly:
<p>This is your text.</p>
<p>This is a valid paragraph too, since HTML
paragraphs don't have to contain an explicit ending tag.
<p id = "sample" >This is another paragraph.
A regular grammar is, in theory, not expressive enough to represent the nesting of tags. In formal language theory, regular grammars are a proper subset of context-free grammars, and arbitrarily nested HTML tags need at least the latter. In other words, the logic of regular expressions is theoretically insufficient to express the grammatical structure of HTML. For background, see the material covered in two courses, "Compiler Principles" and "Finite Automata and Formal Languages".
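To make the nesting argument concrete, here is a minimal sketch. The `div` samples and variable names are made up for illustration; the point is only that neither a lazy nor a greedy quantifier can pair an opening tag with its own closing tag.

```php
<?php
// Nested tags: the lazy pattern stops at the FIRST closing tag it sees.
$nested = '<div>a<div>b</div>c</div>';
preg_match('/<div>(.*?)<\/div>/s', $nested, $lazy);
echo $lazy[1], PHP_EOL; // the inner <div> is cut off from its closing tag

// Sibling tags: the greedy pattern runs on to the LAST closing tag.
$siblings = '<div>a</div><div>b</div>';
preg_match('/<div>(.*)<\/div>/s', $siblings, $greedy);
echo $greedy[1], PHP_EOL; // both siblings are swallowed into one capture
```

Whichever quantifier is chosen, one of the two inputs is captured wrongly, because the pattern has no way of counting how many tags are currently open.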
In practice, regular expressions are even less adequate (or at least extremely awkward) for handling:
Spaces and line breaks inside a tag
Attributes on the tag
Closing tags that are not written explicitly
The effect of comments and scripts
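A quick sketch of how the first three items defeat a naive pattern. The pattern and the sample strings are illustrative assumptions, not taken from the question.

```php
<?php
// A deliberately naive pattern: literal <p>, anything on one line, literal </p>.
$naive = '/<p>(.*)<\/p>/';

$samples = [
    'plain'      => '<p>第一章 什么什么</p>',
    'attribute'  => '<p id="sample">第一章 什么什么</p>',
    'line break' => "<p>\n第一章 什么什么\n</p>",
    'no end tag' => '<p>第一章 什么什么',
];

$results = [];
foreach ($samples as $label => $html) {
    // preg_match returns 1 on a match and 0 when nothing matches.
    $results[$label] = preg_match($naive, $html) === 1;
    echo $label, ': ', $results[$label] ? 'matched' : 'missed', PHP_EOL;
}
```

Only the plain sample matches. Each miss can be patched one at a time, but the pattern grows with every patch and still never covers comments or scripts.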
This question comes up over and over again: do not parse HTML with regular expressions, use a proper parser. In some cases, for one specific and simple use case, a regular expression is perfectly fine. But do not write very complex regular expressions, and do not try to make a regex match HTML "bug-free and in every situation", because sooner or later you will fail.
HTML parsing in PHP can be done with PHP's native DOM extension (which may need to be installed separately in some server environments) or with a third-party HTML parsing library.
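A minimal sketch of the DOM route, assuming the DOM extension is available and the input is UTF-8. The sample markup, the variable names, and the chapter pattern are assumptions for illustration. Prepending an XML encoding declaration is a commonly used workaround to make `loadHTML` read the input as UTF-8.

```php
<?php
// Hypothetical input: an attribute, line breaks inside a tag, and a missing </p>.
$html = <<<'HTML'
<p id = "sample" >
  第一章 什么什么
</p>
<p>第二章 另一个标题
<p>Not a chapter heading.</p>
HTML;

libxml_use_internal_errors(true); // collect parser warnings instead of printing them
$doc = new DOMDocument();
$doc->loadHTML('<?xml encoding="UTF-8">' . $html);
libxml_clear_errors();

$titles = [];
foreach ($doc->getElementsByTagName('p') as $p) {
    $text = trim($p->textContent);
    // The regex now only has to describe plain text, not markup.
    // The u modifier makes . count characters rather than bytes.
    if (preg_match('/^第.{1,8}章\s+\S/u', $text)) {
        $titles[] = $text;
    }
}
print_r($titles);
```

The parser deals with the attribute, the whitespace, and the unclosed paragraph, so the regular expression is left with the one job it does well: checking the shape of a short string.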
I have run into 1 string-parsing problem.
You could try a regular expression.
Great... now I have 2 problems.
Regular expressions are awkward for this; matching with strpos is simpler.
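A rough sketch of what the strpos approach could look like. The helper name `firstParagraphText` is hypothetical, and the code assumes a single, explicitly closed paragraph. Searching for `<p` would also hit tags such as `<pre>`, so this only suits input you control.

```php
<?php
// Hypothetical helper: trimmed text of the first <p ...>...</p>, or null if absent.
function firstParagraphText(string $html): ?string
{
    $open = stripos($html, '<p');           // start of the opening tag
    if ($open === false) {
        return null;
    }
    $start = strpos($html, '>', $open);     // end of the opening tag
    if ($start === false) {
        return null;
    }
    $end = stripos($html, '</p>', $start);  // start of the closing tag
    if ($end === false) {
        return null;
    }
    return trim(substr($html, $start + 1, $end - $start - 1));
}

$text = firstParagraphText("<p id=\"sample\" >\n  第一章 什么什么\n</p>");

// Plain string checks: starts with 第 and contains 章 somewhere after it.
$isChapter = $text !== null
    && strpos($text, '第') === 0
    && strpos($text, '章') !== false;

var_dump($text, $isChapter);
```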
I came here from the question, and I think something like this would do:
$s = preg_replace('/<p>.*(第.{0,8}章\s+[^<]*).*<\/p>/s', "随意$1", $s);
Presumably the key is the s modifier, which lets . match line breaks as well; the Chinese characters in the question are not the problem.
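The effect of the s modifier can be checked with a small sketch. It assumes the pattern is meant to begin with a literal `<p>`; the sample string and variable names are made up for illustration.

```php
<?php
// Sample with line breaks between the tags and the text.
$html = "<p>\n第一章 什么什么\n</p>";

$withoutS = '/<p>.*(第.{0,8}章\s+[^<]*).*<\/p>/';
$withS    = '/<p>.*(第.{0,8}章\s+[^<]*).*<\/p>/s';

// Without s, . refuses to cross the line break right after <p>.
$matchedWithoutS = preg_match($withoutS, $html);

// With s, . matches line breaks too, so the chapter text is captured.
$matchedWithS = preg_match($withS, $html, $m);
$captured = trim($m[1]);

var_dump($matchedWithoutS, $matchedWithS, $captured);
```

Note that without the u modifier `.{0,8}` counts bytes rather than characters, so on UTF-8 input it covers at most two Chinese characters. Adding u would make it count characters, if longer chapter numbers need to match.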