Introduction: These are study notes on the PHP tokenizer, with some background and PHP source code examples.
Brief Introduction
In a project, I needed to analyze PHP code and extract its function calls, along with their locations in the source code. Although this can be done with regular expressions, that is not the best approach in terms of either efficiency or code complexity.
After consulting the PHP manual, I found that PHP exposes a built-in interface to its own lexical scanner: the PHP tokenizer. This is exactly the tool I wanted. With the PHP tokenizer, you can analyze the composition of PHP source code simply, efficiently, and accurately.
Example
Official documentation on the tokenizer is sparse, but that does not prevent us from understanding it. The tokenizer component contains only two functions: token_get_all, which splits PHP code into tokens, and token_name, which returns the symbolic name of a token identifier.
The following simple example shows how to use these two functions:
$code = '<?php echo "string1"."string2"; ?>';
$tokens = token_get_all($code);
foreach ($tokens as $token) {
    if (is_array($token)) {
        // line number, token name, corresponding content
        printf("%d - %s\t%s\n", $token[2], token_name($token[0]), $token[1]);
    }
}

The corresponding output is:

1 - T_OPEN_TAG    <?php 
1 - T_ECHO    echo
1 - T_WHITESPACE     
1 - T_CONSTANT_ENCAPSED_STRING    "string1"
1 - T_CONSTANT_ENCAPSED_STRING    "string2"
1 - T_WHITESPACE     
1 - T_CLOSE_TAG    ?>
Note that when $token is an array, its three elements are, in order, the token identifier (whose symbolic name can be obtained with token_name), the corresponding source text, and the line number.
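Otherwise, $token is a plain string: a single-character token such as ; or . (which is why the concatenation dot and the semicolon are missing from the output above). Such tokens carry no identifier and no line number, so pay attention to this when analyzing code. If this matters to you, consider using the code here.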
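To make the distinction between array tokens and string tokens easier to handle, one option is to normalize every token into the same three-element shape. The following is a minimal sketch, not code from the original article: the helper name normalizeTokens is hypothetical, string tokens get a null identifier, and their line number is estimated by counting newlines in the preceding token text. It assumes PHP 7.1 or later for the short destructuring syntax.

```php
<?php
// Hypothetical helper (an assumption, not part of the tokenizer extension):
// returns every token as a uniform [id, text, line] array.
function normalizeTokens(string $source): array
{
    $normalized = [];
    $line = 1; // running line number, used for single-character tokens

    foreach (token_get_all($source) as $token) {
        if (is_array($token)) {
            // Array token: take id, text, and starting line as reported
            [$id, $text, $line] = $token;
        } else {
            // String token such as ";" or ".": no id and no line number
            $id = null;
            $text = $token;
        }
        $normalized[] = [$id, $text, $line];

        // Advance past any newlines contained in this token's text
        $line += substr_count($text, "\n");
    }

    return $normalized;
}

// Usage: print every token, including the single-character ones
foreach (normalizeTokens('<?php echo "string1"."string2"; ?>') as [$id, $text, $line]) {
    $name = $id === null ? 'CHAR' : token_name($id);
    printf("%d - %s\t%s\n", $line, $name, $text);
}
```

The label CHAR is only a placeholder chosen for this sketch; the tokenizer itself assigns no name to string tokens.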
As you can see, calling it is very simple. Of course, our ambitions go beyond writing a simple loop; this component can be used to build real tools. For example, the following code "compresses" PHP code by removing unnecessary line breaks, whitespace, and comments.
/**
 * "Compress" PHP source code
 *
 * @see http://c7y.phparch.com/c/entry/1/art,practical_uses_tokenizer
 */
class CompactCode
{
    static protected $out;
    static protected $tokens;

    static public function compact($source)
    {
        // parse the PHP source code
        self::$tokens = token_get_all($source);
        self::$out = '';
        reset(self::$tokens);

        // walk the token list and check the type of each token
        while ($t = current(self::$tokens)) {
            if (is_array($t)) {
                // filter out whitespace and comments
                if ($t[0] == T_WHITESPACE || $t[0] == T_DOC_COMMENT || $t[0] == T_COMMENT) {
                    self::skipWhiteAndComments();
                    continue;
                }
                self::$out .= $t[1];
            } else {
                self::$out .= $t;
            }
            next(self::$tokens);
        }
        return self::$out;
    }

    static private function skipWhiteAndComments()
    {
        // add a single space to keep keywords separated
        self::$out .= ' ';
        while ($t = current(self::$tokens)) {
            // greedily skip any further whitespace and comments
            if (is_array($t) && ($t[0] == T_WHITESPACE || $t[0] == T_DOC_COMMENT || $t[0] == T_COMMENT)) {
                next(self::$tokens);
            } else {
                return;
            }
        }
    }
}
Calling it is simple. You only need:

CompactCode::compact($source_code);
The returned string is the compressed content. Here are more examples of using the tokenizer; they are recommended reading.
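Returning to the task mentioned in the brief introduction, extracting function calls and their locations, the same token stream can be scanned for an identifier followed by an opening parenthesis. The sketch below is an illustration under stated assumptions, not the author's implementation: the helper name findFunctionCalls is hypothetical, and it only skips function declarations, new expressions, and calls made through -> or ::. It does not handle namespaced names, variable functions, the nullsafe operator, or attributes.

```php
<?php
// Hypothetical helper (an assumption for illustration): lists plain
// function calls as [name, line] pairs.
function findFunctionCalls(string $source): array
{
    // Drop whitespace and comments so neighbouring tokens are easy to inspect
    $tokens = [];
    foreach (token_get_all($source) as $token) {
        if (is_array($token)
            && in_array($token[0], [T_WHITESPACE, T_COMMENT, T_DOC_COMMENT], true)) {
            continue;
        }
        $tokens[] = $token;
    }

    // Tokens that, when they precede an identifier, mean it is not a plain call
    $notCalls = [T_FUNCTION, T_NEW, T_OBJECT_OPERATOR, T_DOUBLE_COLON];
    $calls = [];

    foreach ($tokens as $i => $token) {
        // A candidate is a T_STRING identifier immediately followed by "("
        if (!is_array($token) || $token[0] !== T_STRING) {
            continue;
        }
        if (($tokens[$i + 1] ?? null) !== '(') {
            continue;
        }
        $prev = $tokens[$i - 1] ?? null;
        if (is_array($prev) && in_array($prev[0], $notCalls, true)) {
            continue; // declaration, instantiation, or method call
        }
        $calls[] = [$token[1], $token[2]]; // function name and line number
    }

    return $calls;
}

// Usage: report each call with its line number
$code = "<?php\n\$n = strlen('abc');\n\$m = max(\$n, 1);\n";
foreach (findFunctionCalls($code) as [$name, $line]) {
    printf("%s called on line %d\n", $name, $line);
}
```

Filtering whitespace and comments first keeps the look-ahead and look-behind to a single index step, at the cost of building a second array.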