In JavaScript, a function can easily be serialized (stringified), that is, turned back into its source code. However, the engine's internal implementation of this operation is not as simple as you might think. SpiderMonkey has used two techniques for function serialization: one decompiles the function's compiled bytecode back into a source string using a decompiler; the other compresses and stores the function's source code before compiling it to bytecode, then decompresses and restores it when needed.
How to serialize functions
In SpiderMonkey, three methods or functions can serialize a function: Function.prototype.toString, Function.prototype.toSource, and uneval. Only toString is standard, that is, common to all engines. However, the ES standard's description of Function.prototype.toString (ES5 15.3.4.2) is only a few sentences long. In other words, there is essentially no standard, and each engine decides for itself how to implement it.
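As a rough sketch of how these three entry points can be probed, the snippet below feature-detects the non-standard ones before calling them. The function add1 is a made-up example, and the guards assume nothing about which engine is running; toSource and uneval are SpiderMonkey-specific and may simply be absent.

```javascript
// add1 is a made-up function used only for illustration.
function add1(x) { return x + 1; }

// Standard: available in every engine.
var viaToString = add1.toString();

// String(fn) and fn + "" both end up calling toString.
var viaString = String(add1);
var viaConcat = add1 + "";

// Non-standard, SpiderMonkey-only: guard before use.
var viaToSource = typeof add1.toSource === "function" ? add1.toSource() : null;
var viaUneval = typeof uneval === "function" ? uneval(add1) : null;

console.log(viaToString);
console.log(viaToString === viaString && viaString === viaConcat);
console.log(viaToSource, viaUneval);
```

In an engine without toSource or uneval, the last line just prints two null values instead of throwing.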
Function serialization
The main use of function serialization is to take the source code it produces and use it to redefine the function.
function a() {
    // ...
    alert("a")
    // ...
}
a() // pops up "a" when run
a = eval("(" + a.toString().replace('alert("a")', 'alert("b")') + ")")
a() // pops up "b" when run
You may think: "I have written JavaScript for so many years, why have I never needed this?" Indeed, if it is your own website and the JS files are fully under your control, there is no need to patch a function this way; you can simply edit it directly. But if the source file is not under your control, you may have to. A common case is a Greasemonkey script that needs to disable or modify a function on some website. Another is a Firefox extension that needs to modify a function of Firefox itself (Firefox can be said to be written in JS). Here is an example from a Firefox script I wrote myself:
location == "chrome://browser/content/browser.xul" && eval("gURLBar.handleCommand = " + gURLBar.handleCommand.toString().replace(/^\s*(load.+);/gm, "/^javascript:/.test(url) || (content.location == 'about:blank' || content.location == 'about:newtab') ? $1 : gBrowser.loadOneTab(url, {postData: postData, inBackground: false, allowThirdPartyFixup: true});"))
This code makes Firefox open the page in a new tab when you press Enter in the address bar, instead of taking over the current tab. It works by reading the source code of the gURLBar.handleCommand function with toString, applying a regular-expression replacement, and passing the result to eval, which redefines the function.
Why not simply redefine the function outright, like this?
gURLBar.handleCommand = function () { ... /* the original function body with one small change */ }
The reason is compatibility: we should change as little of the function's source code as possible. If we rewrote the whole function, the script would stop working as soon as the source of Firefox's gURLBar.handleCommand changed. For example, both Firefox 3 and Firefox 4 have this function, but its contents differ greatly. If instead we only replace a few keywords with a regular expression, then as long as those keywords stay the same, there is no incompatibility.
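The compatibility argument can be sketched with a small helper. Everything here is hypothetical: patchFunction, greetV1 and greetV2 are invented names standing in for a patching utility and two releases of the same third-party function. The sketch assumes an engine whose toString output contains the text being matched.

```javascript
// Hypothetical helper: rebuild a function from its patched source text.
function patchFunction(fn, pattern, replacement) {
    var source = fn.toString();
    var patched = source.replace(pattern, replacement);
    if (patched === source) {
        // Fail loudly instead of silently leaving the function unpatched.
        throw new Error("pattern not found; function left unpatched");
    }
    // Indirect eval: the new function is created as global code.
    return (0, eval)("(" + patched + ")");
}

// Two imagined "versions" of the same third-party function.
function greetV1(name) { return "Hello, " + name; }
function greetV2(name) {
    var trimmed = String(name).trim();
    return "Hello, " + trimmed;
}

// The same small patch applies to both, because the matched text is unchanged.
var patchedV1 = patchFunction(greetV1, /"Hello, "/, '"Hi, "');
var patchedV2 = patchFunction(greetV2, /"Hello, "/, '"Hi, "');

console.log(patchedV1("Ann"));
console.log(patchedV2("  Bob "));
```

One caveat of this approach: a function rebuilt with eval is created in the scope where eval runs, so it no longer sees variables the original function closed over.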
Decompiling bytecode
In SpiderMonkey, a function is compiled to bytecode after it is parsed, so what is kept in memory is not the original source code. SpiderMonkey has a decompiler whose main job is to turn a function's bytecode back into source form.
Firefox 16 and earlier versions use this approach. If you are running one of those versions, you can try the following code:
alert(function () {
    "String";
    // Comment
    return 1 + 2 + 3
}.toString())
The returned string is:
function () {
    return 6;
}
The output is completely different from that of other browsers:
1. Meaningless primitive-value literals are removed during compilation. In this example, that is the "String" literal.
You may think: "That seems harmless; those values make no difference to how the function runs." But wait, have you forgotten the string that marks strict mode, "use strict"?
In versions that do not support strict mode, such as Firefox 3.6, "use strict" is no different from any other string and is removed during compilation. After SpiderMonkey implemented strict mode, the "use strict" string is still dropped during compilation, but the decompiler checks whether the function is in strict mode and, if so, adds "use strict" as the first line of the function body. The corresponding engine source code is as follows:
static JSBool
DecompileBody(JSPrinter *jp, JSScript *script, jsbytecode *pc)
{
    /* Print a strict mode code directive, if needed. */
    if (script->strictModeCode && !jp->strict) {
        if (jp->fun && (jp->fun->flags & JSFUN_EXPR_CLOSURE)) {
            /*
             * We have no syntax for strict function expressions;
             * at least give a hint.
             */
            js_printf(jp, "\t/* use strict */\n");
        } else {
            js_printf(jp, "\t\"use strict\";\n");
        }
        jp->strict = true;
    }
    jsbytecode *end = script->code + script->length;
    return DecompileCode(jp, script, pc, end - pc, 0);
}
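To see why the directive has to survive serialization, here is a small sketch that re-evaluates a strict function's source with and without "use strict". The function isStrict is invented for illustration, and the sketch assumes the engine's toString output keeps the directive. Indirect eval is used so that the rebuilt functions are created as ordinary global code.

```javascript
// isStrict is a made-up probe: in strict mode a plain call leaves `this` undefined.
function isStrict() {
    "use strict";
    return this === undefined;
}

var source = isStrict.toString();

// Rebuild once with the directive intact...
var withDirective = (0, eval)("(" + source + ")");

// ...and once with the directive stripped, as an old compiler would have done.
var stripped = source.replace(/["']use strict["'];?/, "");
var withoutDirective = (0, eval)("(" + stripped + ")");

console.log(withDirective());    // true: still strict
console.log(withoutDirective()); // false: silently became non-strict
```

The two rebuilt functions differ only in the directive, yet they behave differently, which is why the decompiler has to put "use strict" back.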
2. Comments are also removed during compilation.
This may not seem to matter much, but some people like to use comments inside a function to implement multi-line strings. That trick does not work in versions earlier than Firefox 17.
function hereDoc(f) {
    return f.toString().replace(/^.+\s/, "").replace(/.+$/, "");
}
var string = hereDoc(function () {/*
Me
You
He
*/});
console.log(string)
The output is:
Me
You
He
3. Operations on primitive-value literals are evaluated during compilation.
This is an optimization, as mentioned in the book High Performance JavaScript.
Disadvantages of decompilation
As new features appeared (such as strict mode), and as other related bugs were fixed, the decompiler's implementation had to change often, and each change could introduce new bugs. I ran into one myself, probably around Firefox 10. I no longer remember the exact problem, but it concerned whether parentheses should be kept during decompilation, roughly like this:
> (function (a, b, c) { return (a + b) + c }).toString()
"function (a, b, c) {
    return a + b + c;
}"
During decompilation, the parentheses in (a + b) are omitted. Since addition associates from left to right, that does no harm. But the bug I ran into looked like this:
> (function (a, b, c) { return a + (b + c) }).toString()
"function (a, b, c) {
    return a + b + c;
}"
This is wrong: a + b + c is not equal to a + (b + c). For example, with a = 1, b = 2, c = "3", a + b + c is "33", whereas a + (b + c) is "123".
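The difference is easy to check directly. The following lines are only a sketch of the arithmetic described above, using the same values for a, b and c.

```javascript
var a = 1, b = 2, c = "3";

// Left to right: 1 + 2 is the number 3, then 3 + "3" concatenates.
var leftFirst = (a + b) + c;

// Right first: 2 + "3" concatenates to "23", then 1 + "23" concatenates again.
var rightFirst = a + (b + c);

console.log(leftFirst);                // "33"
console.log(rightFirst);               // "123"
console.log(a + b + c === leftFirst);  // true: no parentheses means left to right
```

So dropping the parentheses is only safe when they group the left-hand operands.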
Regarding the decompiler, Mozilla engineer Luke Wagner pointed out that it was a major obstacle to implementing new features and that it frequently caused bugs:
Not to pile on, but I too have felt an immense drag from the decompiler in the last year. Testing coverage is also poor and any non-trivial change inevitably produces fuzz bugs. The sooner we remove this drag the sooner we start reaping the benefits. In particular, I think now is a much better time to remove it than after doing significant frontend/bytecode hacking for new language features.
Brendan Eich also said that the decompiler leaves much to be desired:
I have no love for the decompiler, it has been hacked over for 17 years.
Storing the function source code
Starting with Firefox 17, SpiderMonkey switched to the second approach, which is presumably what other browsers do as well. The serialized string is exactly the same as the source code, including whitespace and comments. This should eliminate most of the problems. However, one more question occurred to me, again concerning strict mode.
For example:
(function () {
    "use strict";
    alert("");
}) + ""
Of course, the returned source code should also contain "use strict", and all browsers do this:
function () {
    "use strict";
    alert("");
}
But what about this:
(function () {
    "use strict";
    return function b() {
        alert("b")
    }
})() + ""
The inner function b is also in strict mode. Should "use strict" be added when the source of b is output? Let's try it:
As mentioned above, versions from Firefox 4 up to (but not including) Firefox 17 decide whether to output "use strict" by checking whether the current function is in strict mode. Function b inherits strict mode from the outer function, so "use strict" is output.
At the same time, the function source is neatly indented, because SpiderMonkey formats the decompiled source during decompilation; it does not matter even if the original source had no indentation at all:
function b() {
    "use strict";
    alert("b");
}
Will Firefox 17 and later include "use strict"? After all, the function's source is now saved as-is, and the source of function b does not itself contain "use strict". The test result: "use strict" is still added, but the indentation is slightly off, because no formatting is done at this step.
function b() {
"use strict";
        alert("b")
    }
The latest version of SpiderMonkey's jsfun.cpp has a corresponding comment in the source:
// If a function inherits strict mode because an enclosing function
// has "use strict", we insert "use strict" into the body of the inner function.
// This ensures that if the return value of this function's toString method is re-evaluated,
// the regenerated function has the same semantics as the original function.
By contrast, other browsers do not include "use strict":
function b() {
        alert("b")
    }
Although this does not make much practical difference, I think Firefox's implementation is more reasonable.
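To check what a given engine does here, something like the following sketch can be used. The inner function b is a probe invented for illustration that reports whether it is running in strict mode; the result of the rebuilt function depends on the engine, so no particular output is assumed for it.

```javascript
// The inner function inherits strict mode from the enclosing function.
var b = (function () {
    "use strict";
    return function b() { return this === undefined; };
})();

var source = b + "";

// Did the engine insert the inherited directive into the serialized source?
var hasDirective = source.indexOf("use strict") !== -1;

// Re-evaluate the serialized source as ordinary global code.
var rebuilt = (0, eval)("(" + source + ")");

console.log("directive inserted:", hasDirective);
console.log("original is strict:", b());
console.log("rebuilt is strict:", rebuilt());
```

When the directive is not inserted, the rebuilt function is no longer strict, which is exactly the change in semantics that the jsfun.cpp comment is meant to prevent.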