PHP implementation of HTML tag closure detection and processing

Source: Internet
Author: User

How can HTML tag closure be detected and repaired in PHP? This article presents a PHP implementation that detects HTML tags with a missing partner and inserts the missing tag. It is shared here for reference. The details are as follows:

"HTML tag closure detection and repair" sounds more ambitious than the code really is: the approach is not exhaustive, and it uses no regular expressions. It handles two cases in an HTML file: a start tag that has no end tag, and an end tag that has no start tag. The position at which the missing tag is inserted may need to be adjusted to suit your requirements.
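The detection idea described above can be sketched with a stack of open tag names. This is a simplified illustration rather than the article's code: the function name `find_unbalanced_tags` is hypothetical, and the sketch assumes that no attribute value contains a `>` character (the full listing skips quoted attribute values to cope with that case).

```php
<?php
// Hypothetical helper, not part of the original article: report tags that
// have no partner, using a stack of start-tag names.
function find_unbalanced_tags($html)
{
    $open = array();      // start tags still waiting for an end tag
    $unopened = array();  // end tags that did not match the top of the stack
    $i = 0;
    while (($lt = strpos($html, '<', $i)) !== false) {
        $gt = strpos($html, '>', $lt);
        if ($gt === false) {
            break; // stray "<" with no closing ">"
        }
        $inner = substr($html, $lt + 1, $gt - $lt - 1);
        $i = $gt + 1;
        if ($inner === '' || $inner[0] === '!') {
            continue; // empty tag, comment or doctype: ignored in this sketch
        }
        if (substr($inner, -1) === '/') {
            continue; // self-closing tag such as <br/>
        }
        if ($inner[0] === '/') {
            $name = trim(substr($inner, 1));
            if (count($open) > 0 && end($open) === $name) {
                array_pop($open);     // paired with the most recent start tag
            } else {
                $unopened[] = $name;  // no matching start tag on top of the stack
            }
        } else {
            // The tag name is everything up to the first whitespace character
            $open[] = strtok($inner, " \t\r\n");
        }
    }
    return array('unclosed' => $open, 'unopened' => $unopened);
}

print_r(find_unbalanced_tags('<div><p>text</div></span><br/>'));
```

For this input, `div` appears in both lists, because `</div>` arrives while `p` is on top of the stack. This is why the article's code runs a second pass (`sort_data`) that cancels such pairs before inserting anything.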


<?php
// Sample input: it contains start tags without end tags and end tags without start tags
$str = <<<'HTML'
<p data='<li></li>'>
<img src='http://www.baidu.com/123123.png'/>
<p2>
<a>content</a>
</p2>
<ul>
<li>
</li>
</ul>
<p>
content full
</p>
this is content</test1>
this is content</test2>
<test4 data="liujinjing">
this is cont
<li></li>
<test3 data="liujinjing">
this is content<p3></p3></p4></p></p><p6 style="width:90px;">
this is content
HTML;

$str_len = strlen($str);
// Names of the start tags that have not been closed yet
$pre_data = array();
// Positions of those start tags
$pre_pos = array();
$last_data = array();
// End tags that have no matching start tag, and their positions
$error_data = array();
$error_pos = array();
$i = 0;
// Set once a "<" that opens a start tag has been seen
$start_flag = false;

while ($i < $str_len) {
  $ch = $str[$i];
  // Next character, or '' at the end of the string (avoids an out-of-range read)
  $next = ($i + 1 < $str_len) ? $str[$i + 1] : '';
  if ($ch == '<' && $next != '/' && $next != '!') {
    // Start tag: collect the tag name
    $i++;
    $_tmp_str = '';
    $start_flag = true;
    // Set once a space is seen, i.e. the tag name is complete
    $space_flag = false;
    while ($i < $str_len && $str[$i] != '>' && $str[$i] != "'" && $str[$i] != '"' && $str[$i] != '/') {
      if ($str[$i] == ' ') {
        $space_flag = true;
      }
      if (!$space_flag) {
        $_tmp_str .= $str[$i];
      }
      $i++;
    }
    $pre_data[] = $_tmp_str;
    $pre_pos[] = $i;
  } else if ($ch == '<' && $next == '/') {
    // End tag: collect the tag name
    $i += 2;
    $_tmp_str = '';
    while ($i < $str_len && $str[$i] != '>') {
      $_tmp_str .= $str[$i];
      $i++;
    }
    $last_data[] = $_tmp_str;
    // Compare with the most recent start tag
    if (count($pre_data) > 0) {
      $last_pre_node = getLastNode($pre_data, 1);
      if ($last_pre_node == $_tmp_str) {
        // Paired: remove the corresponding entries
        array_pop($pre_data);
        array_pop($pre_pos);
        array_pop($last_data);
      } else {
        // Not paired. There are two cases:
        // case one: only an end tag, no start tag
        // case two: only a start tag, no end tag
        array_pop($last_data);
        $error_data[] = $_tmp_str;
        $error_pos[] = $i;
      }
    } else {
      array_pop($last_data);
      $error_data[] = $_tmp_str;
      $error_pos[] = $i;
    }
  } else if ($ch == '<' && $next == '!') {
    // Comment: skip ahead to the closing "-->"
    $i++;
    while ($i < $str_len) {
      if (substr($str, $i, 3) == '-->') {
        $i++;
        break;
      } else {
        $i++;
      }
    }
    $i++;
  } else if ($ch == '/' && $next == '>') {
    // Self-closing tag: drop the start tag that was just recorded
    if ($start_flag) {
      array_pop($pre_data);
      array_pop($pre_pos);
    }
    // Always advance; otherwise the loop would never terminate when $start_flag is false
    $i += 2;
  } else if ($ch == '/' && $next == '*') {
    // Block comment: skip ahead to the closing "*/"
    $i++;
    while ($i < $str_len) {
      if (substr($str, $i, 2) == '*/') {
        $i++;
        break;
      } else {
        $i++;
      }
    }
    $i++;
  } else if ($ch == "'") {
    // Skip a single-quoted string
    $i++;
    while ($i < $str_len && $str[$i] != "'") {
      $i++;
    }
    $i++;
  } else if ($ch == '"') {
    // Skip a double-quoted string
    $i++;
    while ($i < $str_len && $str[$i] != '"') {
      $i++;
    }
    $i++;
  } else {
    $i++;
  }
}

// Find where to insert the missing end tag for a start tag recorded at $pre_pos
function confirm_pre_pos($str, $pre_pos) {
  $str_len = strlen($str);
  $j = $pre_pos;
  while ($j < $str_len) {
    if ($str[$j] == '"') {
      // Skip a double-quoted attribute value
      $j++;
      while ($j < $str_len) {
        if ($str[$j] == '"') {
          $j++;
          break;
        }
        $j++;
      }
    } else if ($str[$j] == "'") {
      // Skip a single-quoted attribute value
      $j++;
      while ($j < $str_len) {
        if ($str[$j] == "'") {
          $j++;
          break;
        }
        $j++;
      }
    } else if ($str[$j] == '>') {
      // End of the start tag: move on to the next "<"
      $j++;
      while ($j < $str_len) {
        if ($str[$j] == '<') {
          // Step back to the end of the original content
          $j--;
          break;
        }
        $j++;
      }
      break;
    } else {
      $j++;
    }
  }
  return $j;
}

// Find where to insert the missing start tag for an end tag recorded at $err_pos
function confirm_err_pos($str, $err_pos) {
  $j = $err_pos;
  $j--;
  while ($j > 0) {
    if ($str[$j] == '"') {
      // Skip backwards over a double-quoted string
      // (the original tested an undefined $str_len here)
      $j--;
      while ($j > 0) {
        if ($str[$j] == '"') {
          $j--;
          break;
        }
        $j--;
      }
    } else if ($str[$j] == "'") {
      // Skip backwards over a single-quoted string
      $j--;
      while ($j > 0) {
        if ($str[$j] == "'") {
          $j--;
          break;
        }
        $j--;
      }
    } else if ($str[$j] == '>') {
      // Insert right after the previous tag
      $j++;
      break;
    } else {
      $j--;
    }
  }
  return $j;
}

// Get the $num-th value counted from the end of the array
function getLastNode(array $arr, $num) {
  $len = count($arr);
  if ($len > $num) {
    return $arr[$len - $num];
  } else {
    return $arr[0];
  }
}

// Tidy the data: look backwards and cancel end tags against earlier unclosed start tags
function sort_data(&$pre_data, &$pre_pos, &$error_data, &$error_pos) {
  $rem_key_array = array();
  $rem_i_array = array();
  // Collect the entries that need to be deleted
  foreach ($error_data as $key => $value) {
    $count = count($pre_data);
    for ($i = ($count - 1); $i >= 0; $i--) {
      if ($pre_data[$i] == $value && !in_array($i, $rem_i_array)) {
        $rem_key_array[] = $key;
        $rem_i_array[] = $i;
        break;
      }
    }
  }
  // Delete the matched end tag entries
  foreach ($rem_key_array as $_item) {
    unset($error_pos[$_item]);
    unset($error_data[$_item]);
  }
  // Delete the matched start tag entries
  foreach ($rem_i_array as $_item) {
    unset($pre_data[$_item]);
    unset($pre_pos[$_item]);
  }
}

// Repair the data: insert the missing tags
function modify_data($str, $pre_data, $pre_pos, $error_data, $error_pos) {
  // Maps an insert position to the number of characters inserted there
  $move_log = array();
  // End tags without a start tag
  foreach ($error_data as $key => $value) {
    $_tmp_move_count = 0;
    foreach ($move_log as $pos_key => $move_value) {
      if ($error_pos[$key] >= $pos_key) {
        $_tmp_move_count += $move_value;
      }
    }
    $data = insert_data($str, $value, $error_pos[$key] + $_tmp_move_count, false);
    $str = $data['str'];
    $move_log[$data['pos']] = $data['move_count'];
  }
  // Start tags without an end tag
  foreach ($pre_data as $key => $value) {
    $_tmp_move_count = 0;
    foreach ($move_log as $pos_key => $move_value) {
      if ($pre_pos[$key] >= $pos_key) {
        $_tmp_move_count += $move_value;
      }
    }
    $data = insert_data($str, $value, $pre_pos[$key] + $_tmp_move_count, true);
    $str = $data['str'];
    $move_log[$data['pos']] = $data['move_count'];
  }
  return $str;
}

// Insert a tag; $type selects what is inserted:
// true inserts an end tag for a start tag, false inserts a start tag for an end tag
function insert_data($str, $insert_data, $pos, $type) {
  if ($type == true) {
    // Unclosed start tag: insert "</name>"
    $move_count = strlen($insert_data) + 3;
    $pos = confirm_pre_pos($str, $pos);
    $pre_str = substr($str, 0, $pos);
    $end_str = substr($str, $pos);
    $mid_str = "</" . $insert_data . ">";
  } else {
    // End tag without a start tag: insert "<name>"
    $pos = confirm_err_pos($str, $pos);
    $move_count = strlen($insert_data) + 2;
    $pre_str = substr($str, 0, $pos);
    $end_str = substr($str, $pos);
    $mid_str = "<" . $insert_data . ">";
  }
  $str = $pre_str . $mid_str . $end_str;
  return array('str' => $str, 'pos' => $pos, 'move_count' => $move_count);
}

sort_data($pre_data, $pre_pos, $error_data, $error_pos);
$new_str = modify_data($str, $pre_data, $pre_pos, $error_data, $error_pos);
echo $new_str;

// Debug output:
// print_r($pre_data);
// print_r($pre_pos);
// print_r($error_data);
// print_r($error_pos);
?>
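In the listing, `modify_data` keeps a `$move_log` so that positions recorded before any insertion can be shifted by the length of the tags already inserted. A possible alternative, shown here only as a sketch and not as the article's method, is to apply the insertions from the highest position to the lowest, so that earlier positions never move. The function name `insert_many` and the sample positions are assumptions made for this illustration.

```php
<?php
// Hypothetical helper, not part of the original article.
// $inserts maps a position in the ORIGINAL string to the text to insert there.
function insert_many($str, array $inserts)
{
    // Highest position first: text inserted at a later offset
    // cannot shift any offset that is still waiting to be processed.
    krsort($inserts);
    foreach ($inserts as $pos => $text) {
        // A length of 0 makes substr_replace() insert instead of replace
        $str = substr_replace($str, $text, $pos, 0);
    }
    return $str;
}

// "<b>bold" ends at offset 7 and the whole string is 14 bytes long
echo insert_many('<b>bold<i>both', array(7 => '</b>', 14 => '</i>'));
```

One limitation of this sketch is that the array allows only one insertion per position, so two tags that must be inserted at the same offset would need to be concatenated by the caller first.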
