Before running any code, Lua translates (precompiles) the source into an internal format. This format is a sequence of instructions for a virtual machine, similar to the machine code executed by a real CPU. The internal format is then interpreted by C code that is essentially a while loop wrapped around a large switch statement, with one case for each instruction.
As you may have read elsewhere, Lua has used a register-based virtual machine since version 5.0. The "registers" of this virtual machine do not correspond to real CPU registers, because such a mapping would not be portable and would be severely limited in the number of registers available. Instead, Lua keeps its registers on a stack (implemented as an array plus a few indexes). Each active function has an activation record, that is, a slice of the stack in which the function stores its registers. Therefore, each function has its own registers [1]. A function can use up to 250 registers, because each instruction has only 8 bits to refer to a register.
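The loop-and-switch interpreter and the per-function registers described above can be pictured with a toy sketch. This is not Lua's real instruction set or encoding: the opcode names, the table-based instruction layout, and the run function are all hypothetical, chosen only to show the shape of the dispatch loop.

```lua
-- Toy register machine (illustrative only, NOT Lua's actual VM).
-- Each instruction is a table {op, a, b, c}.
local function run(code, nregs)
  local R = {}                        -- this call's registers (its activation record)
  for i = 0, nregs - 1 do R[i] = 0 end
  local pc = 1
  while true do                       -- the interpreter loop
    local ins = code[pc]
    pc = pc + 1
    local op, a, b, c = ins[1], ins[2], ins[3], ins[4]
    -- Lua has no switch statement, so an if/elseif chain stands in for it.
    if op == "LOADK" then             -- R[a] = constant b
      R[a] = b
    elseif op == "ADD" then           -- R[a] = R[b] + R[c]
      R[a] = R[b] + R[c]
    elseif op == "RETURN" then        -- return R[a]
      return R[a]
    else
      error("unknown opcode: " .. tostring(op))
    end
  end
end

local program = {
  {"LOADK", 0, 10},    -- register 0 = 10
  {"LOADK", 1, 32},    -- register 1 = 32
  {"ADD", 0, 0, 1},    -- register 0 = register 0 + register 1
  {"RETURN", 0},
}
print(run(program, 2))  --> 42
```

Every instruction names its operands by register number, which is what makes such a machine "register-based".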
Given this large number of registers, the Lua precompiler can keep all local variables in registers. The benefit is that access to local variables is very fast. For example, if a and b are local variables, the statement
a = a + b
generates only one instruction:
ADD 0 0 1
(assuming that a and b are in registers 0 and 1, respectively). In contrast, if both a and b are global variables, the code for this addition becomes:
GETGLOBAL 0 0 ; a
GETGLOBAL 1 1 ; b
ADD       0 0 1
SETGLOBAL 0 0 ; a
It is therefore easy to justify one of the most important rules for improving the performance of Lua programs: use local variables!
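To check the difference on your own machine, a minimal timing sketch along these lines may help. The names ga, gb, la, lb and the iteration count are arbitrary choices for illustration, and the measured gap will vary with the Lua version and hardware.

```lua
local N = 1000000   -- iteration count, an arbitrary choice

-- Globals: each read and write of ga and gb is a lookup by name.
ga, gb = 0, 1
local t0 = os.clock()
for i = 1, N do ga = ga + gb end
local global_time = os.clock() - t0

-- Locals: la and lb live in registers.
local la, lb = 0, 1
t0 = os.clock()
for i = 1, N do la = la + lb end
local local_time = os.clock() - t0

print(string.format("globals: %.3f s, locals: %.3f s", global_time, local_time))
```

os.clock reports CPU time, so the numbers are only a rough guide; repeat the run a few times before drawing conclusions.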
If you need to squeeze more performance out of your program, there are several places where you can apply this rule beyond the obvious ones. For example, if you call a function inside a long loop, you can assign the function to a local variable beforehand. The following code:
for i = 1, 1000000 do
  local x = math.sin(i)
end
runs 30% slower than this version:
local sin = math.sin
for i = 1, 1000000 do
  local x = sin(i)
end
Access to external local variables (that is, variables local to an enclosing function, also known as upvalues) is not as fast as access to local variables, but it is still faster than access to global variables. Consider the following code fragment:
function foo(x)
  for i = 1, 1000000 do
    x = x + math.sin(i)
  end
  return x
end
print(foo(10))
It can be optimized by declaring sin once, outside the function foo:
local sin = math.sin
function foo(x)
  for i = 1, 1000000 do
    x = x + sin(i)
  end
  return x
end
print(foo(10))
The second version runs 30% faster than the first.
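One possible variation, sketched here as an assumption rather than something the text prescribes, is to wrap the cached function in a do block so that the helper name stays private to the function that uses it:

```lua
local foo
do
  local sin = math.sin         -- looked up once, when this chunk runs
  function foo(x)              -- assigns to the local foo declared above
    for i = 1, 1000000 do
      x = x + sin(i)           -- sin is an external local (upvalue) of foo
    end
    return x
  end
end

print(foo(10))
```

Keep in mind that the cached local holds whatever function math.sin referred to at that moment; if some other code later replaces math.sin, foo keeps calling the original.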
Although the Lua compiler is quite efficient compared with compilers for other languages, compilation is still a heavy task. You should therefore avoid compiling code at run time (for example, with the loadstring function) whenever possible. Unless you must run code that is truly dynamic, such as code entered by the user, you seldom need to compile code dynamically.
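When dynamic code really is unavoidable, one way to limit the cost is to compile each distinct piece of source text only once and reuse the resulting function. The sketch below assumes a hypothetical helper named get_compiled and a simple table as a cache; it falls back to load because loadstring exists in Lua 5.1 but was removed in later versions.

```lua
-- loadstring in Lua 5.1; load accepts strings from Lua 5.2 onward.
local compile = loadstring or load

local cache = {}   -- hypothetical cache: source text -> compiled function

local function get_compiled(src)
  local f = cache[src]
  if f == nil then
    local err
    f, err = compile(src)
    if f == nil then
      error("cannot compile chunk: " .. tostring(err), 2)
    end
    cache[src] = f               -- later requests skip the compiler
  end
  return f
end

local f1 = get_compiled("return 1 + 2")
local f2 = get_compiled("return 1 + 2")
print(f1 == f2, f1())  --> true    3
```

The cache grows with every distinct source string, so a real program would likely need some limit on its size.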
As an example of the cost of run-time compilation, the following code creates a table of functions that return the constant values 1 through 100000:
local lim = 100000
local a = {}
for i = 1, lim do
  a[i] = loadstring(string.format("return %d", i))
end
print(a[10]())  --> 10
It takes 1.4 seconds to execute this code.
By using closures, we can avoid using dynamic compilation. The following code only takes one-tenth of the time to do the same work:
function fk(k)
  return function () return k end
end
local lim = 100000
local a = {}
for i = 1, lim do a[i] = fk(i) end
print(a[10]())  --> 10
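The timings quoted above depend on the machine and the Lua version, so it can be worth measuring both approaches yourself. The following is a rough sketch; the timed helper and the names by_compile and by_closure are made up for illustration, and compile falls back to load on versions without loadstring.

```lua
local compile = loadstring or load   -- loadstring in 5.1, load in 5.2+
local lim = 100000

-- Hypothetical helper: runs build() and returns its result plus CPU seconds used.
local function timed(build)
  local t0 = os.clock()
  local result = build()
  return result, os.clock() - t0
end

local function fk(k)
  return function () return k end
end

-- Approach 1: compile a new chunk for every constant.
local by_compile, compile_secs = timed(function ()
  local a = {}
  for i = 1, lim do a[i] = compile(string.format("return %d", i)) end
  return a
end)

-- Approach 2: create closures over an already compiled function body.
local by_closure, closure_secs = timed(function ()
  local a = {}
  for i = 1, lim do a[i] = fk(i) end
  return a
end)

print(by_compile[10](), by_closure[10]())  --> 10    10
print(string.format("compiled: %.2f s, closures: %.2f s", compile_secs, closure_secs))
```

Both tables hold functions with the same behavior; only the way they were created differs.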