While studying a Perl script today, I found several of its regular expressions very confusing:
$text =~ s/([?!]) +([\'\"\(\[\¿\¡\p{IsPi}]*[\p{IsUpper}])/$1\n$2/g;
# Multi-dots followed by sentence starters
$text =~ s/(\.[\.]+) +([\'\"\(\[\¿\¡\p{IsPi}]*[\p{IsUpper}])/$1\n$2/g;
# Add breaks for sentences that end with some sort of punctuation inside a quote or parenthetical and are followed by a possible sentence starter punctuation and upper case
$text =~ s/([?!\.][\ ]*[\'\"\)\]\p{IsPf}]+) +([\'\"\(\[\¿\¡\p{IsPi}]*[\ ]*[\p{IsUpper}])/$1\n$2/g;
# Add breaks for sentences that end with some sort of punctuation and are followed by a sentence starter punctuation and upper case
$text =~ s/([?!\.]) +([\'\"\(\[\¿\¡\p{IsPi}]+[\ ]*[\p{IsUpper}])/$1\n$2/g;
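To see what these four substitutions do in practice, here is a minimal sketch that wraps them in a function and runs them on a sample string. The function name split_sentences and the sample text are my own, not from the script I was reading. The sketch also writes the two characters in front of \p{IsPi} as ¿ and ¡ (the Spanish inverted marks), which is an assumption about what the script intended.

```perl
use strict;
use warnings;
use utf8;                                  # this file contains literal ¿ ¡ “ ”
binmode(STDOUT, ':encoding(UTF-8)');

# Hypothetical wrapper: apply the four sentence-break rules in order
# and return the text with "\n" inserted at each detected boundary.
sub split_sentences {
    my ($text) = @_;

    # ? or ! followed by optional opening punctuation and an uppercase letter
    $text =~ s/([?!]) +([\'\"\(\[\¿\¡\p{IsPi}]*[\p{IsUpper}])/$1\n$2/g;

    # two or more dots followed by a sentence starter
    $text =~ s/(\.[\.]+) +([\'\"\(\[\¿\¡\p{IsPi}]*[\p{IsUpper}])/$1\n$2/g;

    # end punctuation, then a closing quote or bracket, then a sentence starter
    $text =~ s/([?!\.][\ ]*[\'\"\)\]\p{IsPf}]+) +([\'\"\(\[\¿\¡\p{IsPi}]*[\ ]*[\p{IsUpper}])/$1\n$2/g;

    # end punctuation, then at least one opening mark and an uppercase letter
    $text =~ s/([?!\.]) +([\'\"\(\[\¿\¡\p{IsPi}]+[\ ]*[\p{IsUpper}])/$1\n$2/g;

    return $text;
}

my $sample = 'Really? Yes! Wait... Then “Stop.” He left. “Go on.”';
print split_sentences($sample), "\n";
```

With this sample, each rule fires at least once: the first splits after "Really?" and "Yes!", the second after "Wait...", the third after the closing quote in “Stop.”, and the fourth before the opening quote of “Go on.”, so the output has six lines. A plain period followed directly by an uppercase letter is not handled by any of these four rules.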
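In these patterns, the name in braces after \p denotes a Unicode property. In Perl, every Unicode character carries a set of properties (general category, case, script, and so on), and a regex can match characters by those properties: \p{IsUpper} matches an uppercase letter, \p{IsPi} matches an initial (opening) quotation mark such as “ or «, and \p{IsPf} matches a final (closing) quotation mark such as ” or ».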
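A quick way to check which property a character has is to test it against each \p{...} class directly. The sketch below does this for the three properties used in the regexes above; the helper name classify and the list of test characters are my own choices for illustration.

```perl
use strict;
use warnings;
use utf8;                                  # this file contains literal non-ASCII characters
binmode(STDOUT, ':encoding(UTF-8)');

# Hypothetical helper: return the names of the properties (out of the
# three used in the sentence-splitting regexes) that one character has.
sub classify {
    my ($ch) = @_;
    my @hits;
    push @hits, 'IsUpper' if $ch =~ /\A\p{IsUpper}\z/;   # uppercase letter
    push @hits, 'IsPi'    if $ch =~ /\A\p{IsPi}\z/;      # initial quote punctuation
    push @hits, 'IsPf'    if $ch =~ /\A\p{IsPf}\z/;      # final quote punctuation
    return @hits;
}

for my $ch ('A', 'É', 'a', '“', '”', '«', '»', '"') {
    my @hits = classify($ch);
    printf "%s (U+%04X): %s\n", $ch, ord($ch),
        @hits ? join(', ', @hits) : 'none of the three';
}
```

Note that the ASCII double quote " is neither IsPi nor IsPf, which is presumably why the regexes list \' and \" explicitly next to the property classes.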
The following pages describe the Unicode properties:
http://shouce.jb51.net/perl/PatternMatching.html
http://blog.csdn.net/wushuai1346/article/details/7206749
http://perldoc.perl.org/perluniprops.html
Copyright notice: This is an original article by the blogger and may not be reproduced without the blogger's permission.