Recently, I took the time to switch my code highlighting from coolcode to SyntaxHighlighter. The coolcode plug-in opens a code block with the tag
<coolcode>
or [coolcode], while SyntaxHighlighter uses
[code lang="php"]
(or another language), so the old format has to be converted into the new one. Naturally, regular expressions are the tool for the job.
In the old posts, a highlighted block begins with a tag such as
<coolcode lang="php" download="123.php" linenum="on">
<coolcode lang="php" linenum="off">
<coolcode lang="php">
whereas SyntaxHighlighter marks the start of a block with
[code lang="php"]
The required regular expression is
<coolcode lang="[a-z]+".*?>
Explanation: [a-z]+ matches php, javascript, cpp, sql, css, and so on. In .*?, the '.' matches any character except a line break, and '*' means zero or more times. A '?' placed after a quantifier such as '*' or '+' switches it to non-greedy mode.
As shown in the figure, this regular expression can meet the above requirements.
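The same check can be reproduced with a few lines of code instead of a regex tester. The sketch below uses Python's re module purely for illustration (the conversion in this post is done in PHP; for a pattern this simple the syntax is the same in PCRE). The sample strings are the three tag forms listed above.

```python
import re

# First attempt: lang must be the first attribute after "<coolcode ".
pattern = re.compile(r'<coolcode lang="[a-z]+".*?>')

samples = [
    '<coolcode lang="php" download="123.php" linenum="on">',
    '<coolcode lang="php" linenum="off">',
    '<coolcode lang="php">',
]

# fullmatch() succeeds only if the whole tag is covered by the pattern.
matched = [pattern.fullmatch(s) is not None for s in samples]
print(matched)
```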
However, the problem is not fully solved. We still have to consider that what follows
<coolcode
is not necessarily the attribute lang="php". It may be download or linenum="on/off" instead, so the regular expression has to be modified.
CFC4N changed the regular expression to
<coolcode.*?lang="[a-z]+".*?>
The result is as follows:
Careful readers may notice that the red box in the figure marks an unwanted match that starts at an earlier
<coolcode
that is, at a preceding
<coolcode>
tag. This must be excluded. How? If you are quick, you have already thought of it: replace the match-anything '.' with a rule that matches any character other than the two symbols < and >, that is, [^<>]. CFC4N made that change right away.
The modified regular expression is
<coolcode[^<>]*?lang="[a-z]+"[^<>]*?>
Sure enough, the matching is now correct.
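To see why excluding < and > matters, the two variants can be compared on a short sample. This is a minimal sketch in Python's re module (chosen only for illustration; the text variable is a made-up example, not taken from the real posts). The pattern with .*? starts at the bare <coolcode> tag and runs across everything up to the next lang attribute, while the pattern with [^<>]*? cannot cross a tag boundary.

```python
import re

# Hypothetical sample: a bare <coolcode> block followed by a tag with lang.
text = '<coolcode>plain block</coolcode> <coolcode linenum="on" lang="php">'

loose = re.compile(r'<coolcode.*?lang="[a-z]+".*?>')            # "." matches anything
strict = re.compile(r'<coolcode[^<>]*?lang="[a-z]+"[^<>]*?>')   # never crosses < or >

loose_match = loose.search(text).group(0)
strict_match = strict.search(text).group(0)

print(loose_match)   # swallows the bare tag and everything after it
print(strict_match)  # only the tag that really carries lang
```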
At this point the problem seems to be solved. However, suppose I was careless in the early days and used both of coolcode's opening forms, that is
<coolcode
and [coolcode. How do you think the regular expression should be rewritten?
That's right: the only difference is that the tag may open with < or [ and close with > or ], so the regular expression changes accordingly. (Don't forget to add the new symbols to the excluded set as well.)
[<\[]coolcode[^<>\[\]]*?lang="[a-z]+"[^<>\[\]]*?[>\]]
Well, let's look at the effect:
Perfect.
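A short script can confirm the behaviour on mixed input. The following is a sketch in Python's re module, used here only as a convenient test bench; the text sample is invented for the demonstration.

```python
import re

# Either "<" or "[" opens the tag, either ">" or "]" closes it,
# and none of the four symbols may appear inside the tag.
pattern = re.compile(r'[<\[]coolcode[^<>\[\]]*?lang="[a-z]+"[^<>\[\]]*?[>\]]')

# Hypothetical sample mixing both tag styles and a bare [coolcode] block.
text = (
    '[coolcode]x[/coolcode] '
    '<coolcode lang="php" linenum="off"> '
    '[coolcode linenum="on" lang="cpp"]'
)

found = pattern.findall(text)
print(found)
```

Note that the pattern does not require the opening and closing symbols to be of the same kind, so a malformed tag such as <coolcode lang="php"] would also match. For a one-off conversion of known content that is usually acceptable.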
Next, you can execute it.
However, I ran into something quite unexpected: some of the old posts contain tags in this format.
[coolcode linenum=\"off\" lang=\"cpp\"] <coolcode download=\"\" lang=\"cpp\" linenum="off">
Well, that is a problem, but it is only an extra escape character \, so it is easy to handle. The backslash may appear zero times or once, and the quantifier for "zero or one" is ?. So can we simply put a ? after the backslash, that is, write \? and be done?
Obviously not. In a regular expression, \ is itself the escape character, so \? would match a literal question mark. To match a literal \ the backslash must be escaped, and \\? is the correct form.
After this modification, the regular expression is
[<\[]coolcode[^<>\[\]]*?lang=\\?"[a-z]+\\?"[^<>\[\]]*?[>\]]
For the matching results, see:
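The difference between \? and \\? is easy to verify. The sketch below uses Python's re module for illustration only; raw strings (r'...') keep the backslashes exactly as typed, and the first two samples are the escaped tags quoted above.

```python
import re

# \\? = an optional literal backslash before each quote around the lang value.
pattern = re.compile(
    r'[<\[]coolcode[^<>\[\]]*?lang=\\?"[a-z]+\\?"[^<>\[\]]*?[>\]]'
)

samples = [
    r'[coolcode linenum=\"off\" lang=\"cpp\"]',
    r'<coolcode download=\"\" lang=\"cpp\" linenum="off">',
    '<coolcode lang="php">',
]
matched = [pattern.fullmatch(s) is not None for s in samples]

# The naive version: \? is an escaped question mark, not an optional backslash.
wrong = re.compile(r'lang=\?"[a-z]+')
wrong_on_escaped = wrong.search(r'lang=\"cpp\"')   # backslash is not "?", no match
wrong_on_qmark = wrong.search('lang=?"cpp"')       # matches a literal "?"

print(matched, wrong_on_escaped, wrong_on_qmark is not None)
```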
Now the pattern is finished and the conversion can be performed. There are two ways to do it.
• Use MySQL's REPLACE function to replace each tag variant individually, turning
<coolcode lang="php/cpp/javascript/sql/css, etc." download="name" linenum="on/off">
into
[code lang="php/cpp/javascript/sql/css"]
This saves writing a program that fetches, replaces, and writes back. The disadvantage is that it is a large amount of tedious manual work. MySQL, at least in the version used here, supports regular expressions only in queries and does not support regular expression replacement. We could also build a nested composite SQL statement to replace the string matched by the regular expression, but the language name (php/cpp/javascript) could not be extracted and carried over into the new tag. In other words, MySQL does not support backreferences in regular expressions.
• Use PHP to read from the database, replace, and write back. PHP's preg_replace function supports backreferences (although it does not support references to named groups in the replacement). We have to write a query that selects the posts containing the coolcode tag and then run the replacement on them. A plain search for posts that contain the word coolcode may return too many posts, so we can instead write a POSIX regular expression, which MySQL supports, to select only the posts that actually use the coolcode tag, then replace and write back. That reduces the number of posts to process, although regular expression queries are themselves expensive.
To use a backreference with PHP's preg_replace, the regular expression above needs one small change: the language name in the middle of lang= must be wrapped in a capturing group. The modified regular expression is
[<\[]coolcode[^<>\[\]]*?lang=\\?"([a-z]+)\\?"[^<>\[\]]*?[>\]]
PHP replacement code is
// \\\\ inside the single-quoted PHP string reaches the regex engine as \\, i.e. one literal backslash
$contents = preg_replace('/[<\[]coolcode[^<>\[\]]*?lang=\\\\?"([^"]+?)\\\\?"[^<>\[\]]*?[>\]]/i', '[code lang="\1"]', $contents);
The i modifier makes the match case-insensitive.
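For readers who want to try the substitution outside PHP, here is a rough equivalent using Python's re.sub. It is a sketch, not the code used for the actual conversion: the function name convert_open_tags is made up, and re.IGNORECASE plays the role of the i modifier. The \1 in the replacement refers to the captured language name, just as in preg_replace.

```python
import re

# Group 1 captures the language name; backslashes before the quotes are optional.
OPEN_TAG = re.compile(
    r'[<\[]coolcode[^<>\[\]]*?lang=\\?"([^"]+?)\\?"[^<>\[\]]*?[>\]]',
    re.IGNORECASE,
)

def convert_open_tags(text):
    """Rewrite every coolcode opening tag as [code lang="..."]."""
    return OPEN_TAG.sub(r'[code lang="\1"]', text)

a = convert_open_tags('<Coolcode lang="php" download="123.php" linenum="on">')
b = convert_open_tags(r'[coolcode linenum=\"off\" lang=\"cpp\"]')
print(a)
print(b)
```

All other attributes (download, linenum) are dropped, since only the language name is carried over by the backreference.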
Also, do not forget coolcode's closing tags: replace </coolcode> and [/coolcode] with [/code].
Execute two SQL statements in MySQL:
UPDATE wp_posts SET post_content = REPLACE(post_content, '</coolcode>', '[\/code]'); -- note: a backslash has been added before the slash; remember to remove it
UPDATE wp_posts SET post_content = REPLACE(post_content, '[/coolcode]', '[\/code]'); -- note: a backslash has been added in the new tag; remember to remove it
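Putting the pieces together, the second method (fetch, replace, write back) might look like the sketch below. Several things here are assumptions made so the example is self-contained: it is written in Python rather than PHP, an in-memory SQLite database stands in for the real MySQL one, the ID column and the sample rows are invented, and a simple LIKE filter replaces the MySQL regular expression query mentioned earlier.

```python
import re
import sqlite3

OPEN_TAG = re.compile(
    r'[<\[]coolcode[^<>\[\]]*?lang=\\?"([^"]+?)\\?"[^<>\[\]]*?[>\]]',
    re.IGNORECASE,
)
CLOSE_TAG = re.compile(r'</coolcode>|\[/coolcode\]', re.IGNORECASE)

def convert(text):
    """Convert coolcode opening and closing tags to SyntaxHighlighter form."""
    text = OPEN_TAG.sub(r'[code lang="\1"]', text)
    return CLOSE_TAG.sub('[/code]', text)

# Stand-in database (assumption): SQLite in memory instead of MySQL.
db = sqlite3.connect(':memory:')
db.execute('CREATE TABLE wp_posts (ID INTEGER PRIMARY KEY, post_content TEXT)')
db.executemany('INSERT INTO wp_posts (post_content) VALUES (?)', [
    ('<coolcode lang="php" linenum="off">echo 1;</coolcode>',),
    ('[coolcode lang="cpp"]int x;[/coolcode]',),
    ('a post without any code',),
])

# Fetch only the posts that mention coolcode, convert, and write back.
rows = db.execute(
    "SELECT ID, post_content FROM wp_posts WHERE post_content LIKE '%coolcode%'"
).fetchall()
for post_id, content in rows:
    db.execute('UPDATE wp_posts SET post_content = ? WHERE ID = ?',
               (convert(content), post_id))
db.commit()

result = [r[0] for r in db.execute('SELECT post_content FROM wp_posts ORDER BY ID')]
print(result)
```

Opening tags without a lang attribute, such as a bare <coolcode>, are left untouched by this sketch and would need a separate rule.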
Summary:
The regular expressions in this article involve nothing advanced; they are all simple, everyday usage. For the PCRE engine's recursion, named groups, backreferences, and zero-width assertions, CFC4N will look for suitable examples in the future. CFC4N has in fact already used these advanced features in a regular expression written for a friend, which you can take a look at. Criticism and pointers are welcome.
PS: If you need the complete PHP program for converting coolcode to SyntaxHighlighter, leave a message and I will write it up.