PHP code for repairing HTML tags that are not properly closed (nested closing and nearest closing are supported)

Source: Internet
Author: User

fixHtmlTag
Version 0.2:
This version solves the problems left over from the previous release, namely nearest closing and nested closing. For details, see the comments in the code. The code is as follows:

<?php

/**
 * fixHtmlTag
 *
 * HTML tag repair function. It fixes HTML tags that are not properly closed.
 *
 * Because malformed HTML has too many uncertain cases, the two modes "nested closing" and
 * "nearest closing" should be enough.
 *
 * These two terms are ones I coined to explain how this function works;
 * you only need to understand what they mean.
 * 1. Nested closing mode, NEST, is the default mode. HTML such as "<body><div>Hello"
 *    is changed to "<body><div>Hello</div></body>".
 * 2. Nearest closing mode, CLOSE. HTML such as "<p>Hello<p>Why is it not
 *    closed" is changed to "<p>Hello</p><p>Why is it not closed</p>".
 *
 * In nested closing mode (the default, no special parameter required), you can pass in the names
 * of the tags that should be closed in nearest mode. In this way, HTML such as "<body><p>Hello</p><p>me"
 * is converted to "<body><p>Hello</p><p>me</p></body>".
 * Parameters must be passed with the index names shown below. Settings that do not need to change can be omitted.
 *
 * $param = array(
 *     'html' => '', // required
 *     'options' => array(
 *         'tagArray' => array(),
 *         'type' => 'NEST',
 *         'length' => null,
 *         'lowerTag' => TRUE,
 *         'XHtmlFix' => TRUE,
 *     )
 * );
 * fixHtmlTag($param);
 *
 * The index values have the following meanings:
 * string $html     The HTML code to be repaired
 * array  $tagArray In nested mode, the names of the tags that should be closed in nearest mode
 * string $type     Mode name. NEST and CLOSE are currently supported. If set to CLOSE, $tagArray is ignored and all tags are closed in nearest mode
 * int    $length   To truncate the output at a certain length, assign a value here. The length counts only the text, not the tags
 * bool   $lowerTag Whether to convert all tag names in the code to lowercase. The default is TRUE
 * bool   $XHtmlFix Whether to fix tags that do not comply with the XHTML specification, that is, convert <br> to <br/>
 *
 * @author IT tumbler <itbudaoweng@gmail.com>
 * @version 0.2
 * @link http://yungbo.com IT tumbler
 * @link http://enenba.com/?post=19 XX
 * @param array $param Array parameter; values must be assigned to the specific indexes
 * @return string $result The HTML code after processing
 * @since 2012-04-14
 */
function fixHtmlTag($param = array()) {
    // Default parameter values
    $html = '';
    $tagArray = array();
    $type = 'NEST';
    $length = null;
    $lowerTag = TRUE;
    $XHtmlFix = TRUE;

    // First extract the top-level entries, that is, $html and $options (if provided)
    extract($param);

    // If options exist, extract the related variables
    if (isset($options)) {
        extract($options);
    }

    $result = ''; // the HTML code to be returned
    $tagStack = array(); // tag stack, simulated with array_push() and array_pop()
    $contents = array(); // holds the HTML tags and the text between them
    $len = 0; // initial string length

    // Closure flag $isClosed, true by default. When nearest closing applies, it becomes false
    // after a start tag is matched and true again after that tag is closed.
    $isClosed = true;

    // Convert all tag names to be processed to lowercase
    $tagArray = array_map('strtolower', $tagArray);

    // "Valid" single tags (tags that take no end tag)
    $singleTagArray = array(
        '<meta',
        '<link',
        '<base',
        '<br',
        '<hr',
        '<input',
    );

    // Validate the matching mode $type; the default is NEST
    $type = strtoupper($type);
    if (!in_array($type, array('NEST', 'CLOSE'))) {
        $type = 'NEST';
    }

    // Using each pair of < and > as the delimiter, split the original HTML into an array of tags and the text between them
    $contents = preg_split("/(<[^>]+?>)/si", $html, -1, PREG_SPLIT_NO_EMPTY | PREG_SPLIT_DELIM_CAPTURE);

    foreach ($contents as $tag) {
        if ('' == trim($tag)) {
            $result .= $tag;
            continue;
        }

        // Match a standard self-closed tag, such as <br/>
        if (preg_match("/<(\w+)[^\/>]*?\/>/si", $tag)) {
            $result .= $tag;
            continue;
        }

        // Match a start tag. If it is a single tag, it is popped off the stack again below.
        else if (preg_match("/<(\w+)[^\/>]*?>/si", $tag, $match)) {
            // If the previous tag is still open and must be closed in nearest mode,
            // close it and pop it off the stack.
            // The !empty() guard avoids emitting "</>" when nothing is left to close.
            if (false === $isClosed && !empty($tagStack)) {
                // CLOSE mode: every tag is closed in nearest mode
                if ('CLOSE' == $type) {
                    $result .= '</' . end($tagStack) . '>';
                    array_pop($tagStack);
                }
                // Default NEST mode: only the tags listed in $tagArray are closed in nearest mode
                else {
                    if (in_array(end($tagStack), $tagArray)) {
                        $result .= '</' . end($tagStack) . '>';
                        array_pop($tagStack);
                    }
                }
            }

            // If $lowerTag is TRUE, convert the tag name to lowercase
            $matchLower = $lowerTag == TRUE ? strtolower($match[1]) : $match[1];

            $tag = str_replace('<' . $match[1], '<' . $matchLower, $tag);
            // Start a new tag pair
            array_push($tagStack, $matchLower);

            // If it is one of the agreed single tags, close it and pop it off the stack.
            // stripos() returns 0 for a match at the start, so compare with !== false.
            foreach ($singleTagArray as $singleTag) {
                if (stripos($tag, $singleTag) !== false) {
                    if ($XHtmlFix == TRUE) {
                        $tag = str_replace('>', '/>', $tag);
                    }
                    array_pop($tagStack);
                    break;
                }
            }

            // Append only after the XHTML fix, so that the fix reaches the output
            $result .= $tag;

            // CLOSE mode: the state becomes "not closed"
            if ('CLOSE' == $type) {
                $isClosed = false;
            }
            // Default NEST mode: the state becomes "not closed" only if the tag is in $tagArray
            else {
                if (in_array($matchLower, $tagArray)) {
                    $isClosed = false;
                }
            }
            unset($matchLower);
        }

        // Match an end tag. If it matches the top of the stack, pop the stack.
        else if (preg_match("/<\/(\w+)[^\/>]*?>/si", $tag, $match)) {

            // If $lowerTag is TRUE, convert the tag name to lowercase
            $matchLower = $lowerTag == TRUE ? strtolower($match[1]) : $match[1];

            if (end($tagStack) == $matchLower) {
                $isClosed = true; // matched, so the tag is closed
                $tag = str_replace('</' . $match[1], '</' . $matchLower, $tag);
                $result .= $tag;
                array_pop($tagStack);
            }
            unset($matchLower);
        }

        // Match a comment and append it directly to $result
        else if (preg_match("/<!--.*?-->/si", $tag)) {
            $result .= $tag;
        }

        // Append text to $result, truncating it if required
        else {
            if (is_null($length) || $len + mb_strlen($tag) < $length) {
                $result .= $tag;
                $len += mb_strlen($tag);
            } else {
                $str = mb_substr($tag, 0, $length - $len + 1);
                $result .= $str;
                break;
            }
        }
    }

    // If unclosed tags remain on the stack, append their end tags to $result
    while (!empty($tagStack)) {
        $result .= '</' . array_pop($tagStack) . '>';
    }
    return $result;
}
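
The stack idea behind nested (NEST) closing can also be looked at in isolation. The following is a minimal, simplified sketch and not the author's implementation: the function name closeOpenTagsSketch and the $voidTags list are assumptions made for illustration, and CLOSE mode, truncation, and lowercasing of tag names are left out.

```php
<?php
// Hypothetical helper for illustration only; it is not part of fixHtmlTag.
function closeOpenTagsSketch(string $html): string
{
    // Assumed list of tags that never take an end tag.
    $voidTags = array('meta', 'link', 'base', 'br', 'hr', 'input');
    $stack = array();
    $result = '';

    // Split into tags and text, keeping each tag as its own chunk.
    $chunks = preg_split('/(<[^>]+>)/', $html, -1, PREG_SPLIT_NO_EMPTY | PREG_SPLIT_DELIM_CAPTURE);

    foreach ($chunks as $chunk) {
        if (preg_match('/^<\/(\w+)/', $chunk, $m)) {
            // End tag: keep it only if it matches the innermost open tag.
            if (end($stack) === strtolower($m[1])) {
                array_pop($stack);
                $result .= $chunk;
            }
        } elseif (preg_match('/^<(\w+)[^>]*>$/', $chunk, $m)) {
            // Start tag: remember it unless it is void or already self-closed.
            $name = strtolower($m[1]);
            if (!in_array($name, $voidTags, true) && substr($chunk, -2) !== '/>') {
                $stack[] = $name;
            }
            $result .= $chunk;
        } else {
            // Text, comments, and anything else pass through unchanged.
            $result .= $chunk;
        }
    }

    // Close whatever is still open, innermost first.
    while (!empty($stack)) {
        $result .= '</' . array_pop($stack) . '>';
    }
    return $result;
}

echo closeOpenTagsSketch('<body><div>Hello'), "\n"; // prints <body><div>Hello</div></body>
```

Because the stack is popped from the top, the innermost open tag is always closed first, which is what keeps the repaired output properly nested. As in fixHtmlTag, an end tag that does not match the top of the stack is dropped.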
