PHP shtmlspecialchars function details

Last Update:2013-11-22 Source: Internet

Author: User


I am new to coding and have not yet written code for a large project, so a senior colleague gave me the source of a large PHP project from last year to read first. This afternoon I ran into the shtmlspecialchars() function.

Many people on the Internet use shtmlspecialchars(). It is not built into PHP; it was written by some third party, and I do not know who. The regular expression inside it had me puzzled for a while. Enough preamble; let's get to the point.

[Php]
function shtmlspecialchars($string) {
    if (is_array($string)) {
        // Arrays are escaped element by element, recursively.
        foreach ($string as $key => $val) {
            $string[$key] = shtmlspecialchars($val);
        }
    } else {
        // Escape & " < >, then turn "&amp;" back into "&" wherever it starts an entity.
        $string = preg_replace('/&amp;((#(\d{3,5}|x[a-fA-F0-9]{4})|[a-zA-Z][a-z0-9]{2,5});)/', '&\1',
            str_replace(array('&', '"', '<', '>'), array('&amp;', '&quot;', '&lt;', '&gt;'), $string));
    }
    return $string;
}

The above is the definition of the shtmlspecialchars() function. Needless to say, the part that puzzles many people is this statement:

[Php]
$string = preg_replace('/&amp;((#(\d{3,5}|x[a-fA-F0-9]{4})|[a-zA-Z][a-z0-9]{2,5});)/', '&\1',
    str_replace(array('&', '"', '<', '>'), array('&amp;', '&quot;', '&lt;', '&gt;'), $string));

First, what the function does:

It escapes the four special characters that may appear in HTML:

& becomes &amp;

" becomes &quot;

< becomes &lt;

> becomes &gt; (Note: the trailing semicolon is part of each entity, not punctuation added by the author.)

These are the same four conversions that PHP's built-in htmlspecialchars() performs.
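For comparison, here is a minimal sketch of how the built-in htmlspecialchars() treats the same four characters. The ENT_COMPAT flag, the UTF-8 encoding and the sample string are choices made for this illustration, not something taken from the project code. The fourth parameter, double_encode, controls whether entities already present in the input are escaped a second time.

```php
<?php
// Sample input (made up for this illustration).
$input = 'Tom & Jerry &#8226; "<b>"';

// double_encode defaults to true: every & is escaped,
// including the one that starts the existing &#8226; reference.
$double = htmlspecialchars($input, ENT_COMPAT, 'UTF-8');

// double_encode = false: entities already in the input are left alone.
$single = htmlspecialchars($input, ENT_COMPAT, 'UTF-8', false);

echo $double, "\n"; // Tom &amp; Jerry &amp;#8226; &quot;&lt;b&gt;&quot;
echo $single, "\n"; // Tom &amp; Jerry &#8226; &quot;&lt;b&gt;&quot;
```

The second call behaves much like shtmlspecialchars(): the bare & is escaped, while the existing reference survives.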

Normally the following code would be enough to do this:

[Php]
str_replace(array('&', '"', '<', '>'), array('&amp;', '&quot;', '&lt;', '&gt;'), $string);

But wait!

Q: Wait for what? Isn't the function finished?

A: No, and that would be a big mistake. This replacement takes the "better to kill three thousand by mistake than let one escape" approach: it escapes every &, including the ones that belong to valid entities.

Q: What exactly goes wrong?

A: See below.

If we use only the replacement above, HTML entities and Unicode character references that are already in the string get broken, which is not the result we want. The appendix at the end of this article lists the entities in question.
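The damage is easy to reproduce. The sketch below wraps the plain replacement in a helper; the name naive_escape() and the sample input are made up for this illustration.

```php
<?php
// Hypothetical helper: the plain four-character replacement with no entity handling.
function naive_escape($string) {
    return str_replace(
        array('&', '"', '<', '>'),
        array('&amp;', '&quot;', '&lt;', '&gt;'),
        $string
    );
}

// The input already contains a named entity and a numeric reference.
$input = '&copy; 2013 &#8226; <b>';

echo naive_escape($input), "\n"; // &amp;copy; 2013 &amp;#8226; &lt;b&gt;
```

A browser would now display the literal text "&copy;" and "&#8226;" instead of the copyright sign and the bullet.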

Someone went through the whole entity table and came to the following conclusions:

1. An HTML entity starts with & and is followed either by # and 3 to 5 digits, or by one letter and then 2 to 5 letters or digits, and it ends with a semicolon.
2. A Unicode character reference starts with &#x and is followed by 4 hexadecimal digits, and it ends with a semicolon.

From the first conclusion we can write the regular expression: &(#\d{3,5}|[a-zA-Z][a-zA-Z0-9]{2,5}); (Note: the semicolon is again part of the pattern.)

From the second we get &#x[a-fA-F0-9]{4}; (Note: hexadecimal digits run from 0 to f.)

Because the previous step has already replaced & with &amp;, combining the two gives:

/&amp;((#(\d{3,5}|x[a-fA-F0-9]{4})|[a-zA-Z][a-z0-9]{2,5});)/
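To see which escaped strings the combined pattern accepts, here is a small sketch. The helper name is_escaped_entity() is hypothetical, and the ^ and $ anchors are an addition for this test only, so that each sample must match as a whole; the pattern used in shtmlspecialchars() has no anchors.

```php
<?php
// Hypothetical helper: true if $text is exactly "&amp;" followed by an entity body
// that the article's pattern accepts.
function is_escaped_entity($text) {
    // ^ and $ are added here for whole-string testing only.
    $pattern = '/^&amp;((#(\d{3,5}|x[a-fA-F0-9]{4})|[a-zA-Z][a-z0-9]{2,5});)$/';
    return preg_match($pattern, $text) === 1;
}

$samples = array('&amp;#913;', '&amp;#x4e2d;', '&amp;nbsp;', '&amp;#34;', '&amp;lt;', '&amp; ');

foreach ($samples as $sample) {
    echo $sample, ' => ', is_escaped_entity($sample) ? 'match' : 'no match', "\n";
}
```

The first three samples match. "&amp;#34;" does not, because it has only two digits; "&amp;lt;" does not, because the name is only two characters long; and a bare "&amp; " is not an entity at all.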

Question 1:
Someone asked: can it be written like this instead, with the # moved out of the group?

/&amp;#(((\d{3,5}|x[a-fA-F0-9]{4})|[a-zA-Z][a-z0-9]{2,5});)/

Yes, but writing it this way means the replacement string has to change as well. This is discussed further down.

Step 1
[Php]
$string = str_replace(array('&', '"', '<', '>'), array('&amp;', '&quot;', '&lt;', '&gt;'), $string);

The result is assigned back to $string.

Step 2
The second step simply reverses the replacement wherever &amp; starts an entity:

[Php]
preg_replace('/&amp;((#(\d{3,5}|x[a-fA-F0-9]{4})|[a-zA-Z][a-z0-9]{2,5});)/', '&\1', $string)

The regular expression itself is clear by now, but I was puzzled by the replacement string '&\1'. What does it mean?

It turns out that \1 is a backreference: it stands for the text matched by the first pair of parentheses in the regular expression.

I wrote a test myself.

[Php]
<?php

$string = 'x10p';
// Two groups: \1 refers to (x).
$string1 = preg_replace('/(x)([0-9]+)p/', '&\1', $string);
// One group: \1 refers to ([0-9]+).
$string2 = preg_replace('/x([0-9]+)p/', '&\1', $string);
echo $string1;
echo '<br/>';
echo $string2;
?>

The output is as follows:

&x (in the first pattern, the first pair of parentheses contains x)
&10 (in the second pattern, the first pair of parentheses contains 10)

[Php]
preg_replace('/&amp;((#(\d{3,5}|x[a-fA-F0-9]{4})|[a-zA-Z][a-z0-9]{2,5});)/', '&\1', $string)

So the result here is that &amp; is replaced with &, while the rest of the entity, captured by \1, stays unchanged.
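Putting the two steps together, the following sketch shows the intermediate and final strings for one input. The function name escape_keep_entities() and the sample string are made up for this illustration; the two calls inside are the ones discussed in the article.

```php
<?php
// Hypothetical wrapper around the two steps discussed in the article.
function escape_keep_entities($string) {
    // Step 1: escape the four special characters, including every &.
    $escaped = str_replace(
        array('&', '"', '<', '>'),
        array('&amp;', '&quot;', '&lt;', '&gt;'),
        $string
    );
    // Step 2: where "&amp;" is followed by an entity body, restore the single "&".
    return preg_replace(
        '/&amp;((#(\d{3,5}|x[a-fA-F0-9]{4})|[a-zA-Z][a-z0-9]{2,5});)/',
        '&\1',
        $escaped
    );
}

$input = '<a href="x">&copy; Tom & Jerry &#8226;</a>';

echo escape_keep_entities($input), "\n";
// &lt;a href=&quot;x&quot;&gt;&copy; Tom &amp; Jerry &#8226;&lt;/a&gt;
```

The bare & between "Tom" and "Jerry" stays escaped because no entity body follows it, while &copy; and &#8226; come back intact.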

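This also answers Question 1: can the # be taken out of the group? If you take it out, the pattern consumes "&amp;#", so the replacement has to be written as '&#\1'. That works for numeric references, but named entities such as &copy; contain no #, so they would need a pattern of their own. Doesn't the original form feel simpler?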
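A minimal sketch of that variant, under the assumption that the # sits in front of the group and only the numeric alternatives are kept; the function name restore_numeric_only() is hypothetical.

```php
<?php
// Hypothetical variant: "#" is matched outside the group, so the replacement
// string must put it back explicitly with '&#\1'.
function restore_numeric_only($escaped) {
    return preg_replace('/&amp;#((\d{3,5}|x[a-fA-F0-9]{4});)/', '&#\1', $escaped);
}

echo restore_numeric_only('&amp;#8226; &amp;#x4e2d; &amp;copy;'), "\n";
// &#8226; &#x4e2d; &amp;copy;
```

The numeric and hexadecimal references are restored, but "&amp;copy;" is left escaped because a named entity has no # for this pattern to match.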

So, is the problem solved? Yes!

Appendix:

HTML entity table

Symbol  Named entity  Decimal code    Symbol  Named entity  Decimal code    Symbol  Named entity  Decimal code
Α  &Alpha;  &#913;      Β  &Beta;  &#914;      Γ  &Gamma;  &#915;
Δ  &Delta;  &#916;      Ε  &Epsilon;  &#917;      Ζ  &Zeta;  &#918;
Η  &Eta;  &#919;      Θ  &Theta;  &#920;      Ι  &Iota;  &#921;
Κ  &Kappa;  &#922;      Λ  &Lambda;  &#923;      Μ  &Mu;  &#924;
Ν  &Nu;  &#925;      Ξ  &Xi;  &#926;      Ο  &Omicron;  &#927;
Π  &Pi;  &#928;      Ρ  &Rho;  &#929;      Σ  &Sigma;  &#931;
Τ  &Tau;  &#932;      Υ  &Upsilon;  &#933;      Φ  &Phi;  &#934;
Χ  &Chi;  &#935;      Ψ  &Psi;  &#936;      Ω  &Omega;  &#937;
α  &alpha;  &#945;      β  &beta;  &#946;      γ  &gamma;  &#947;
δ  &delta;  &#948;      ε  &epsilon;  &#949;      ζ  &zeta;  &#950;
η  &eta;  &#951;      θ  &theta;  &#952;      ι  &iota;  &#953;
κ  &kappa;  &#954;      λ  &lambda;  &#955;      μ  &mu;  &#956;
ν  &nu;  &#957;      ξ  &xi;  &#958;      ο  &omicron;  &#959;
π  &pi;  &#960;      ρ  &rho;  &#961;      ς  &sigmaf;  &#962;
σ  &sigma;  &#963;      τ  &tau;  &#964;      υ  &upsilon;  &#965;
φ  &phi;  &#966;      χ  &chi;  &#967;      ψ  &psi;  &#968;
ω  &omega;  &#969;      ϑ  &thetasym;  &#977;      ϒ  &upsih;  &#978;
ϖ  &piv;  &#982;      •  &bull;  &#8226;      …  &hellip;  &#8230;
′  &prime;  &#8242;      ″  &Prime;  &#8243;      ‾  &oline;  &#8254;
⁄  &frasl;  &#8260;      ℘  &weierp;  &#8472;      ℑ  &image;  &#8465;
ℜ  &real;  &#8476;      ™  &trade;  &#8482;      ℵ  &alefsym;  &#8501;
←  &larr;  &#8592;      ↑  &uarr;  &#8593;      →  &rarr;  &#8594;
↓  &darr;  &#8595;      ↔  &harr;  &#8596;      ↵  &crarr;  &#8629;
⇐  &lArr;  &#8656;      ⇑  &uArr;  &#8657;      ⇒  &rArr;  &#8658;
⇓  &dArr;  &#8659;      ⇔  &hArr;  &#8660;      ∀  &forall;  &#8704;
∂  &part;  &#8706;      ∃  &exist;  &#8707;      ∅  &empty;  &#8709;
∇  &nabla;  &#8711;      ∈  &isin;  &#8712;      ∉  &notin;  &#8713;
∋  &ni;  &#8715;      ∏  &prod;  &#8719;      ∑  &sum;  &#8721;
−  &minus;  &#8722;      ∗  &lowast;  &#8727;      √  &radic;  &#8730;
∝  &prop;  &#8733;      ∞  &infin;  &#8734;      ∠  &ang;  &#8736;
∧  &and;  &#8743;      ∨  &or;  &#8744;      ∩  &cap;  &#8745;
∪  &cup;  &#8746;      ∫  &int;  &#8747;      ∴  &there4;  &#8756;
∼  &sim;  &#8764;      ≅  &cong;  &#8773;      ≈  &asymp;  &#8776;
≠  &ne;  &#8800;      ≡  &equiv;  &#8801;      ≤  &le;  &#8804;
≥  &ge;  &#8805;      ⊂  &sub;  &#8834;      ⊃  &sup;  &#8835;
⊄  &nsub;  &#8836;      ⊆  &sube;  &#8838;      ⊇  &supe;  &#8839;
⊕  &oplus;  &#8853;      ⊗  &otimes;  &#8855;      ⊥  &perp;  &#8869;
⋅  &sdot;  &#8901;      ⌈  &lceil;  &#8968;      ⌉  &rceil;  &#8969;
⌊  &lfloor;  &#8970;      ⌋  &rfloor;  &#8971;      ◊  &loz;  &#9674;
♠  &spades;  &#9824;      ♣  &clubs;  &#9827;      ♥  &hearts;  &#9829;
♦  &diams;  &#9830;      (non-breaking space)  &nbsp;  &#160;      ¡  &iexcl;  &#161;
¢  &cent;  &#162;      £  &pound;  &#163;      ¤  &curren;  &#164;
¥  &yen;  &#165;      ¦  &brvbar;  &#166;      §  &sect;  &#167;
¨  &uml;  &#168;      ©  &copy;  &#169;      ª  &ordf;  &#170;
«  &laquo;  &#171;      ¬  &not;  &#172;      (soft hyphen)  &shy;  &#173;
®  &reg;  &#174;      ¯  &macr;  &#175;      °  &deg;  &#176;
±  &plusmn;  &#177;      ²  &sup2;  &#178;      ³  &sup3;  &#179;
´  &acute;  &#180;      µ  &micro;  &#181;      "  &quot;  &#34;
<  &lt;  &#60;      >  &gt;  &#62;      '  (none)  &#39;

Author: wolinxuebin

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
