Lua Performance Optimization Techniques (V): Reduce, Reuse, and Recycle

Source: Internet
Author: User

When dealing with Lua resources, we should apply the same 3R principle that is promoted for the Earth's resources: reduce, reuse, and recycle.

Reducing is the simplest alternative. There are several ways to avoid creating new objects. For example, if your program uses too many tables, you can consider changing its data representation. As a simple example, suppose your program manipulates polylines. The most natural representation is a list of points:

    polyline = {
      {x = 10.3, y = 98.5},
      {x = 10.3, y = 18.3},
      {x = 15.0, y = 98.5},
      -- ...
    }

Although natural, this representation is not economical for a large polyline, because it needs one table per point. A first alternative is to turn each record into an array, which uses less memory:

    polyline = {
      {10.3, 98.5},
      {10.3, 18.3},
      {15.0, 98.5},
      -- ...
    }

For a polyline with one million points, this change reduces memory use from 95 MB to 65 MB. Of course, you pay a price in readability: p[i].x is easier to understand than p[i][1].

An even more economical alternative is to use one array for all the x coordinates and another for all the y coordinates:

    polyline = {
      x = {10.3, 10.3, 15.0, ...},
      y = {98.5, 18.3, 98.5, ...}
    }

An access that was originally written p[i].x now becomes p.x[i].

With this representation, the one-million-point polyline uses only 24 MB of memory.
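The three layouts can be compared with a rough measurement based on collectgarbage("count"). This is a minimal sketch rather than a benchmark: the helper names (measure, build_records, build_pairs, build_split) and the point count N are assumptions made for illustration, and the absolute numbers depend on the Lua version and platform.

```lua
-- Rough memory comparison of the three polyline layouts discussed above.
local N = 100000  -- assumed point count, kept small so the sketch runs quickly

-- Returns the approximate memory (in KB) retained by the value build(n) returns.
local function measure(build, n)
  collectgarbage(); collectgarbage()       -- settle the heap first
  local before = collectgarbage("count")
  local data = build(n)
  collectgarbage(); collectgarbage()       -- drop temporaries, 'data' stays alive
  local after = collectgarbage("count")
  return after - before, data
end

-- One record table per point: p[i].x
local function build_records(n)
  local p = {}
  for i = 1, n do p[i] = {x = i, y = i} end
  return p
end

-- One small array per point: p[i][1]
local function build_pairs(n)
  local p = {}
  for i = 1, n do p[i] = {i, i} end
  return p
end

-- Two large arrays for the whole polyline: p.x[i]
local function build_split(n)
  local p = {x = {}, y = {}}
  for i = 1, n do
    p.x[i] = i
    p.y[i] = i
  end
  return p
end

local kb_records = measure(build_records, N)
local kb_pairs = measure(build_pairs, N)
local kb_split = measure(build_split, N)

print(string.format("records: %.0f KB", kb_records))
print(string.format("pairs:   %.0f KB", kb_pairs))
print(string.format("split:   %.0f KB", kb_split))
```

On the reference implementation the ordering should match the article (records largest, split arrays smallest), even though the exact figures will differ.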

Loops are a good place to look for opportunities to reduce garbage creation. For example, if a loop creates a table whose contents never change, you can move the table out of the loop, or even out of the enclosing function. Compare:

    function foo (...)
      for i = 1, n do
        local t = {1, 2, 3, "hi"}
        -- do something without changing 't'
        -- ...
      end
    end

and

    local t = {1, 2, 3, "hi"}   -- create 't' once and for all
    function foo (...)
      for i = 1, n do
        -- do something without changing 't'
        -- ...
      end
    end
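One way to see the difference is to stop the collector and measure how much memory each variant allocates. The sketch below is only an illustration: garbage_of, sum_inner, and sum_hoisted are hypothetical names, and the loop bodies are stand-ins for real work.

```lua
-- Measures how much memory (in KB) fn(n) allocates while the collector is stopped.
local function garbage_of(fn, n)
  collectgarbage()
  collectgarbage("stop")
  local before = collectgarbage("count")
  fn(n)
  local after = collectgarbage("count")
  collectgarbage("restart")
  return after - before
end

-- Variant 1: a new constant table on every iteration.
local function sum_inner(n)
  local total = 0
  for i = 1, n do
    local t = {1, 2, 3, "hi"}
    total = total + t[1] + t[2] + t[3]
  end
  return total
end

-- Variant 2: the table is created once, outside the function.
local T = {1, 2, 3, "hi"}
local function sum_hoisted(n)
  local total = 0
  for i = 1, n do
    total = total + T[1] + T[2] + T[3]
  end
  return total
end

local inner_kb = garbage_of(sum_inner, 100000)
local hoisted_kb = garbage_of(sum_hoisted, 100000)
print(string.format("inner: %.0f KB, hoisted: %.0f KB", inner_kb, hoisted_kb))
```

Both variants compute the same result; only the amount of garbage they leave behind differs.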

The same trick works for closures, as long as you do not move them out of the scope of the variables they need. For example, consider the following function:

    function changenumbers (limit, delta)
      for line in io.lines() do
        line = string.gsub(line, "%d+", function (num)
          num = tonumber(num)
          if num >= limit then return tostring(num + delta) end
          -- otherwise return nothing, which keeps the original number
        end)
        io.write(line, "\n")
      end
    end

We can avoid creating new closures for each iteration by moving the internal functions outside the loop:

    function changenumbers (limit, delta)
      local function aux (num)
        num = tonumber(num)
        if num >= limit then return tostring(num + delta) end
      end
      for line in io.lines() do
        line = string.gsub(line, "%d+", aux)
        io.write(line, "\n")
      end
    end

However, we cannot move aux outside the changenumbers function, because aux needs access to limit and delta.
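To try the hoisted version without standard input, the same idea can be applied to a table of lines. This is a sketch under that assumption: changenumbers_in is a hypothetical variant, not part of the original code, and it returns the rewritten lines instead of writing them.

```lua
-- Hypothetical variant of changenumbers that works on a table of lines.
local function changenumbers_in(lines, limit, delta)
  -- 'aux' is created once per call, not once per line; it still sees
  -- 'limit' and 'delta' because it stays inside their scope.
  local function aux(num)
    num = tonumber(num)
    if num >= limit then return tostring(num + delta) end
    -- returning nothing keeps the original number
  end
  local out = {}
  for i, line in ipairs(lines) do
    out[i] = string.gsub(line, "%d+", aux)  -- only the first result is stored
  end
  return out
end

local result = changenumbers_in({"a 5 b 20", "100"}, 10, 1)
for _, line in ipairs(result) do print(line) end
```

Numbers below the limit are left untouched, so the first line keeps its 5 while the 20 becomes 21.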

For string processing, we can often reduce the need for new strings by working with indices into existing strings. For example, the string.find function returns the position where it found the pattern, rather than the matched text. By returning indices, it avoids creating a new string for each successful match. When necessary, the programmer can get the matched substring by calling string.sub [1].
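As an illustration of working with indices only, the following sketch counts occurrences of a word without creating any substring. The function name count_occurrences is an assumption for this example; it relies on the plain-search flag of string.find (the fourth argument).

```lua
-- Counts non-overlapping occurrences of 'word' in 's' using indices only.
local function count_occurrences(s, word)
  assert(word ~= "", "word must not be empty")
  local count, pos = 0, 1
  while true do
    -- The fourth argument 'true' requests a plain search (no pattern matching).
    local first, last = string.find(s, word, pos, true)
    if not first then break end
    count = count + 1
    pos = last + 1            -- continue after this match
  end
  return count
end

print(count_occurrences("abcabcab", "abc"))

-- A substring is created only when it is really needed:
local s = "key=value"
local first, last = string.find(s, "=", 1, true)
local key = string.sub(s, 1, first - 1)
print(key)
```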

When we cannot avoid using new objects, we can still avoid creating them by reusing old ones. For strings, reuse is not necessary, because Lua does the job for us: it always internalizes all the strings it uses and reuses them whenever possible. For tables, however, reuse can be quite effective. As a common case, let us return to the situation where a table is created inside a loop. This time the table contents are not constant. Even so, we can often reuse the same table in every iteration and simply change its contents. Consider the following code:

    local t = {}
    for i = 1970, 2000 do   -- end year assumed; the source gives only the start
      t[i] = os.time({year = i, month = 6, day = 14})
    end

The following code is equivalent, but it reuses the table:

    local t = {}
    local aux = {year = nil, month = 6, day = 14}
    for i = 1970, 2000 do   -- end year assumed; the source gives only the start
      aux.year = i
      t[i] = os.time(aux)
    end

A particularly effective way to achieve reuse is caching [2]. The basic idea is simple: store the result computed for a given input so that, when the same input arrives again, the program simply reuses the earlier result.

LPeg, a new pattern-matching library for Lua, makes interesting use of caching. LPeg compiles each pattern string into an internal representation, a small program that performs the matching. This compilation is expensive compared with the matching itself, so LPeg caches the compiled results for reuse. The cache is a simple table that maps each pattern string to its compiled program.

A common problem with caching is that the memory cost of storing the results may outweigh the performance gain from reusing them. To solve this problem in Lua, we can store the results in a weak table, so that results no longer in use are eventually collected.
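The effect of a weak table can be observed directly. In this minimal sketch (the names cache, fill, and kept are illustrative), values that are referenced only by the cache should disappear after a full collection, while a value that is still referenced elsewhere stays.

```lua
-- A cache with weak values: entries do not keep their values alive.
local cache = setmetatable({}, {__mode = "v"})

-- Fill the cache from a helper so that no local variable of the main
-- chunk keeps a reference to the temporary tables.
local function fill(c)
  for i = 1, 10 do c[i] = {i} end
end
fill(cache)

local kept = {"still in use"}
cache.kept = kept                    -- this value also has a strong reference

collectgarbage(); collectgarbage()   -- force full collection cycles

local remaining = 0
for _ in pairs(cache) do remaining = remaining + 1 end
print("entries left in the cache:", remaining)
```

Note that only collectable values such as tables and functions are removed this way; numbers and booleans stored in a weak table are never collected.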

In Lua, with higher-order functions, we can define a generic caching function:

    function memoize (f)
      local mem = {}                       -- caching table
      setmetatable(mem, {__mode = "kv"})   -- make it weak
      return function (x)                  -- new version of 'f', with caching
        local r = mem[x]
        if r == nil then                   -- no previous result?
          r = f(x)                         -- call the original function
          mem[x] = r                       -- store the result for reuse
        end
        return r
      end
    end

For any function f, memoize(f) returns a function that gives the same results as f but caches them. For example, we can redefine loadstring as a cached version:

    loadstring = memoize(loadstring)

The new function is used exactly like the old one, but if many of the strings being loaded are duplicates, the performance gain can be substantial.
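The following sketch shows the caching function at work. It repeats the definition so that it runs on its own; slow_square and compile are hypothetical names, and the fallback to load is an assumption for Lua versions where loadstring no longer exists.

```lua
local function memoize(f)
  local mem = setmetatable({}, {__mode = "kv"})  -- weak cache
  return function (x)
    local r = mem[x]
    if r == nil then
      r = f(x)
      mem[x] = r
    end
    return r
  end
end

-- Hypothetical stand-in for an expensive function; it counts its calls.
local calls = 0
local function slow_square(n)
  calls = calls + 1
  return n * n
end

local fast_square = memoize(slow_square)
local a = fast_square(12)   -- computed
local b = fast_square(12)   -- served from the cache
print(a, b, calls)

-- Cached compilation: 'loadstring' in Lua 5.1, 'load' in later versions.
local compile = memoize(loadstring or load)
local f1 = compile("return 6 * 7")
local f2 = compile("return 6 * 7")
print(f1 == f2, f1())
```

This version handles only functions of a single non-nil argument, and a function that returns nil is simply called again each time, since nil cannot be stored in the table.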

If your program creates and frees too many coroutines, recycling may improve its performance. The current coroutine API does not directly support reusing a coroutine, but we can work around this limitation. Consider the following coroutine:

    co = coroutine.create(function (f)
      while f do
        f = coroutine.yield(f())
      end
    end)

This coroutine accepts a job (a function to run), runs it, and, when the job finishes, waits for the next one.
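A reusable coroutine like this can be driven as follows. The helper run_job is a hypothetical wrapper added for illustration; it passes one job to the coroutine and returns the job's first result.

```lua
local co = coroutine.create(function (f)
  while f do
    f = coroutine.yield(f())   -- run the job, then wait for the next one
  end
end)

-- Hypothetical helper: hands one job to the worker coroutine.
local function run_job(job)
  local ok, result = coroutine.resume(co, job)
  if not ok then error(result, 2) end
  return result                -- only the first result of the job is returned
end

local r1 = run_job(function () return "first" end)
local r2 = run_job(function () return "second" end)
local status_between = coroutine.status(co)

coroutine.resume(co, nil)      -- a nil job ends the loop and the coroutine
local status_end = coroutine.status(co)

print(r1, r2, status_between, status_end)
```

One limitation of this sketch is that a job that raises an error kills the coroutine, so a real pool would need to create a fresh one in that case.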

Most recycling in Lua is done automatically by the garbage collector. Lua uses an incremental garbage collector, which means that it performs its work in small steps interleaved with the program's execution. The pace of these steps is proportional to memory allocation: for each amount of memory that Lua allocates, the collector does a proportional amount of work. The faster the program consumes memory, the faster the collector tries to reclaim it.

If we apply the principles of reduce and reuse when writing programs, the collector usually does not have much to do. Sometimes, however, we cannot avoid creating large amounts of garbage, and the collector's work becomes heavy. Lua's garbage collector is tuned for average programs, so it performs well in most applications. In particular cases, though, we can tune the collector to get better performance. We control the garbage collector through the function collectgarbage in Lua, or lua_gc in C. Both offer the same functionality through different interfaces. This discussion uses the Lua interface, although this kind of manipulation is often better done in C.

The collectgarbage function offers several operations: it can stop and restart the collector, force a full collection cycle, get the total memory in use by Lua, and change two parameters that affect the pace of the collector. All of them are useful when tuning a memory-hungry program.

Stopping the collector "forever" may be an option for some batch programs. These programs create several data structures, produce some output based on them, and then exit (compilers are an example). For such programs, trying to collect garbage may be a waste of time, because there is little garbage to collect and all memory is released when the program finishes.

For non-batch programs, stopping the collector forever is not an option. Nevertheless, these programs may benefit from stopping the collector during time-critical periods. If necessary, a program can take full control of the garbage collector by keeping it stopped at all times and letting it run only through explicit calls that force a step or a full collection. For example, many event-driven platforms let you register an idle function, which is called when there are no other events to handle. This is a perfect time to do garbage collection. (In Lua 5.1, each time you force a collection while the collector is stopped, it automatically restarts. So, to keep it stopped, you must call collectgarbage("stop") immediately after forcing the collection.)
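The idle-time strategy can be sketched as follows. The functions on_idle and busy_phase are hypothetical stand-ins for a platform's idle callback and for time-critical work; how an idle function is registered depends on the platform and is not shown.

```lua
collectgarbage("stop")        -- the program now decides when collection happens

-- Hypothetical idle callback, to be called when no events are pending.
local function on_idle()
  collectgarbage("collect")   -- full cycle; "step" would spread the work out
  collectgarbage("stop")      -- required in Lua 5.1, harmless in later versions
end

-- Stand-in for a time-critical phase that produces garbage.
local function busy_phase()
  for i = 1, 50000 do
    local t = {i}             -- garbage: nothing keeps a reference to 't'
  end
end

busy_phase()
local before_idle = collectgarbage("count")
on_idle()
local after_idle = collectgarbage("count")
print(string.format("before idle: %.0f KB, after idle: %.0f KB",
                    before_idle, after_idle))

collectgarbage("restart")     -- give control back to the automatic collector
```

While the collector is stopped, memory use only grows, so this approach is suitable only when idle periods are frequent enough to keep the heap bounded.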

Finally, you may want to try tuning the collector's parameters. The collector has two parameters that control its pace. The first, called the pause, controls how long the collector waits between finishing one collection cycle and starting the next. The second, called the step multiplier, controls how much work the collector does in each step. Roughly, the smaller the pause and the larger the step multiplier, the faster the collector runs. The effect of these parameters on a program's overall performance is hard to predict. A faster collector clearly spends more CPU cycles, but it reduces the total memory in use by the program, which may in turn reduce paging. Only careful experimentation can give you the best values for these parameters.
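A small helper makes such experiments easier to write. This sketch assumes Lua 5.1 through 5.4, where collectgarbage accepts the "setpause" and "setstepmul" options and returns the previous value; the helper name with_gc_params and the values 100 and 400 are arbitrary choices for illustration, not recommendations.

```lua
-- Runs fn() with temporary collector parameters, then restores the old ones.
-- Both parameters are percentages: a pause of 200 waits for memory use to
-- double before a new cycle starts.
local function with_gc_params(pause, stepmul, fn)
  local old_pause = collectgarbage("setpause", pause)
  local old_stepmul = collectgarbage("setstepmul", stepmul)
  local result = fn()
  collectgarbage("setpause", old_pause)
  collectgarbage("setstepmul", old_stepmul)
  return result
end

-- Example workload: allocate many short-lived tables with a more
-- aggressive collector (shorter pause, larger step multiplier).
local n = with_gc_params(100, 400, function ()
  local count = 0
  for i = 1, 100000 do
    local t = {i}
    count = count + 1
  end
  return count
end)
print(n)
```

If fn raises an error, this sketch does not restore the parameters; wrapping the call in pcall would be one way to handle that.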

[1] It might be a good idea for the standard library to provide a function that compares two substrings, so that we could check for specific values inside a string without extracting the substring (and thereby creating a new string).

[2] "Caching" here renders the original term "memoize".
