How to use the C API in the Lua scripting language

Source: Internet
Author: User
Tags: exception handling, lua, memory usage, try catch

As an embedded language, Lua provides a complete C API that lets Lua code interact with the host program. Naturally, the host language is expected to be C or C++. Other hosts, such as the Mono environment, where embedding Lua has become popular over the last two years, are another matter.

Embedding Lua correctly is not easy, and people who are new to Lua are prone to mistakes. Fortunately, this bridging work is usually done once by the designers at the start of a project, so not everyone needs to learn it. As long as someone familiar with Lua is involved, the risk of mistakes is small, and problems that do appear are relatively easy to fix later. This post is mainly about the places where mistakes are most likely, and about some correct (if seemingly cumbersome) ways to implement them.

The most easily overlooked issue is error handling in Lua.

Lua calls them errors; other languages call them exceptions.

If you have read the Lua manual carefully, you will find that every C API entry indicates whether the function can throw an exception. For example, lua_tostring is marked [-0, +0, e] in the Lua 5.2 manual, which means it may throw (does that contradict your intuition?), whereas lua_pushlightuserdata is not.

Lua exceptions should be caught by lua_pcall or lua_resume. Therefore, whenever you call a C API function, make sure that somewhere up the C call chain you are inside a lua_pcall or lua_resume. As a result, even the few simple lines commonly used to create a Lua virtual machine can be written incorrectly. For example:

lua_State *L = luaL_newstate();
if (L) {
    luaL_openlibs(L);
}

This code is not carefully thought out: luaL_openlibs(L) may throw an exception, and nothing catches it.

When an uncaught exception occurs in Lua, the panic function is called, and then abort() is called to terminate the process. One remedy is to set a recovery point at the outermost layer of the framework: setjmp in C, or try/catch in C++. The panic function installed with lua_atpanic then jumps straight to the recovery point (longjmp in C, throw in C++), so the panic function never returns. But this is not the recommended approach. In the words of Lua's author, "The panic mode is only for ill-structured Lua programs."

When you only write a Lua library in C and use a ready-made host program (such as the official Lua interpreter), you usually do not need to worry about this, because the C code that calls the Lua C API is itself called, directly or indirectly, by Lua. The issue is easy to overlook, however, when you embed Lua in your own host program. The sound solution is to write your business logic in a lua_CFunction and call it with lua_pcall. Parameters for this function should be passed as a void * through lua_pushlightuserdata.

That is why earlier versions of Lua provided a C API function called lua_cpcall. Since Lua 5.2 supports light C functions, lua_pushcfunction can pass a C function without upvalues into the Lua VM at no extra cost, so a separate lua_cpcall is no longer needed.

The best example is the implementation of the official Lua interpreter. You should now understand why its main logic is written in a function called pmain rather than directly in main.

As mentioned above, lua_pushlightuserdata does not throw, and neither do the push functions for other simple value types, such as lua_pushboolean and lua_pushinteger. This is because these functions do not check the Lua stack capacity and do not grow the stack on demand. By contrast, a function such as lua_pushstring, which has to construct a new object, may raise an OOM (out of memory) exception. Lua only guarantees LUA_MINSTACK extra slots when crossing from Lua into C. The default value is 20, which is usually enough. Precisely because it is usually enough, people writing C extensions tend to forget about it. In particular, when C extension code recurses at the C level, the stack can easily overflow under boundary conditions. Such bugs are hard to detect, because the Lua stack often has more free space than LUA_MINSTACK. Remember: if you do complex work in a C extension, use luaL_checkstack to reserve the space you need before using the Lua stack.

When writing a Lua extension library in C, keep in mind that after any call to the Lua API, the code that follows may never run. So if you allocate temporary heap memory, consider that the code later in the same function that frees it may not execute. The correct approach is to allocate temporary memory with lua_newuserdata: if an exception interrupts the function, a later GC cycle cleans it up. The luaL_Buffer facilities are built on this idea. Alternatively, you can recycle the memory through a pool. Either way, you must plan for this.

For the same reason, if you construct a C object, all of its fields should be cleared (set to valid initial values) before you call any other Lua C API function, rather than left to be filled in by Lua C API calls. For example:

struct foobar {
    const char *a;
    const char *b;
};

...

struct foobar *f = lua_newuserdata(L, sizeof(*f));

... // Some other work

f->a = lua_tostring(L, 1);
f->b = lua_tostring(L, 2);

Writing it this way is risky: the first lua_tostring call may throw, so the next line never runs, and f->b is left uninitialized. The correct approach is:

struct foobar *f = lua_newuserdata(L, sizeof(*f));
f->a = NULL;
f->b = NULL;

... // Some other work

f->a = lua_tostring(L, 1);
f->b = lua_tostring(L, 2);

If you read the Lua source code carefully, you will find that Lua's own internals are often written this way. Using newuserdata avoids most problems with failed initialization, but you must make sure that f, or the corresponding Lua object, is handed to other code only after the C object has been correctly initialized, and that a metatable is attached to the userdata only then as well.

When the host language has its own exceptions, coordinating the host's exception mechanism with Lua's is hard. It is almost impossible for a library to reconcile the two without touching the implementation of Lua itself. To address this, Lua lets you define a set of macros when building the library, so that Lua's exception propagation is implemented with the host language's exception mechanism.

See macros such as LUAI_THROW and LUAI_TRY near the top of ldo.c. So if C++ is your host language, you should compile the Lua library with a C++ compiler. If you link a library compiled as C directly into a C++ program (or share a precompiled Lua dynamic library), everything appears to work, but as soon as exception handling is involved you will run into many obscure problems.

The cause of this problem is as follows:

When an exception occurs inside Lua, the VM jumps directly to the previously set recovery point on the C stack, and then unwinds the Lua stack at the Lua VM level. After the exception is caught, the Lua stack (the CallInfo structures) is in a correct state, but the C stack frames may not have been handled the way your host program expects. In other words, RAII cleanup is probably not triggered.
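The effect can be reproduced without Lua at all. The following sketch uses plain setjmp/longjmp, with hypothetical stand-in functions, to show that code placed after the raising call in an intermediate frame never runs; a C++ destructor in that frame would be skipped in the same way.

```c
#include <setjmp.h>

static jmp_buf recovery_point;   /* stands in for the one lua_pcall sets up */
static int cleanups_run = 0;

/* Stands in for a Lua API call that raises an error. */
static void raise_error(void) {
    longjmp(recovery_point, 1);
}

/* Stands in for host code that expects to clean up on scope exit. */
static void host_function(void) {
    raise_error();
    cleanups_run++;              /* never reached: the frame is abandoned */
}

/* Returns 1 if an error was caught, 0 otherwise. */
static int protected_call(void) {
    if (setjmp(recovery_point) == 0) {
        host_function();
        return 0;
    }
    return 1;
}
```

After protected_call() returns 1, cleanups_run is still 0: the error was caught, but the intermediate frame got no chance to release anything.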

By the way, Lua stack frames do not correspond one-to-one with C stack frames. That is, a function call at the Lua level does not correspond to a function call at the C level. When you pcall a Lua function at the Lua level and then pcall another Lua function inside it, the result is not the intuitive two nested layers of try/catch. This design is tied to Lua's language features: tail calls and coroutines. If you want a coroutine to yield back to the C level from inside a pcall, Lua function calls absolutely cannot correspond to C function calls; otherwise the coroutine could not be resumed (jumping back to the recovery point at the C level destroys the C stack frames, which cannot be rebuilt). This is also why Lua's internal exception mechanism cannot simply be made compatible with the host language's.

In other words, even if you rebuild the Lua library with try/catch, and you wrap a Lua API call that may throw, such as lua_pushstring, in your own try/catch, you can indeed catch the exception (because the Lua VM implementation uses the same mechanism), but doing so breaks the Lua VM's own operation.

Note: you cannot use throw in place of lua_error to raise an exception, or try/catch in place of lua_pcall. Switching the Lua VM's implementation to the C++ exception mechanism does not mean that Lua and C++ share an equivalent exception propagation system. Once you understand that some Lua APIs throw, and that the exception is thrown with a C++ throw, you should also understand that wrapping calls to those APIs in your own C++ try/catch in order to catch the exception is a mistake.

Once Lua is embedded in C++, making extension libraries written in C++ work correctly is easily solved (just build a separate C++ version of the library). But when several languages interact with C/C++ as the intermediary, the problem becomes much more complicated. For example, in recent years Unity3D has been widely used for game development, with Lua embedded in the Mono virtual machine to write game logic, which involves interaction among Lua, Mono, and C. Mono has its own virtual machine, and you would be hard pressed to replace LUAI_THROW and LUAI_TRY in Lua's implementation with Mono's exception mechanism. Therefore, when you write a C# function that Lua can call, you must keep C# exceptions from leaking into the Lua VM. Conversely, Lua exceptions must be intercepted at the Lua or C level. For this reason, mapping the Lua C API directly onto C# functions and manipulating the Lua state from C# is not recommended; it is very hard to get completely right.

Since Mono itself is implemented in C, the exception propagation of the Lua API works in the Mono VM in most cases (if you regard Mono as a module written in C). However, when an exception does occur (Lua programs, unlike C programs, often rely on exception propagation), then even if it is caught at the Lua level, crossing C# code on the way can cause side effects that are hard to detect. This is because the Lua VM unwinds C stack frames directly with longjmp, and the Mono VM has no way to notice. The danger is that it works in 99% of cases, and the occasional failure is hard to track down.

Reimplementing Lua in C# solves this problem completely. UniLua is one such project. The drawback is that performance is a concern; after all, C# is much slower than native code.

If you care about performance, you can still compile Lua into a native library and export its interface to C#. There are many such projects, so I will not list them. Note, however, that you should avoid manipulating lua_State at a low level, and instead wrap a few simple high-level interfaces. Letting C# code directly read and write data in the Lua state is not advisable. Almost every C API function that operates on a Lua state has exception handling issues. Wrapping those functions one by one as C# functions ends up either incomplete or inefficient (the cost of working at too low a level).

Think of the Lua VM and the Mono VM as two black boxes. Their interaction is then essentially no different from the interaction between separate processes, machines, or services. Does the problem start to look familiar? It is really just two parties sending messages to each other. We only need to settle message encoding, message delivery, and how the other side handles each message. Do not worry too much about the performance overhead of message passing. Accepting some overhead buys more flexibility and a simpler interface design. What really deserves attention is how to minimize the frequency of interaction.

In practice, all we need to do is register C# functions with the Lua VM under a uniform convention so that Lua can call them (even a single interface that lets Lua send messages is enough), and provide a way for C# to call functions in Lua (or to send messages to Lua, with the Lua side turning them into function calls). All of this actually happens within the same process (even the same thread), so the message encoding need not be a contiguous string; any memory address that both sides can encode and decode will do.

Because this post describes exactly what we ran into in our own project, I wrote a set of sample code for my colleagues while writing the text. The code is on GitHub. It implements only the basic functionality and is just a C library, but with some simple wrapping it can be packaged as a C# module for use in the Unity3D Mono environment.

P.S. The problems discussed in this article do not only affect Lua beginners. Some widely used libraries that bind the Lua API to languages other than C suffer from them to a greater or lesser degree.

Take luabind, which has many C++ users, as an example: its "calling Lua functions from C++" feature is incomplete. Because this C++ library's implementation is extremely complicated, its problems (design limitations) are hard to see and understand, and the hidden risks rarely surface, which makes it a serious threat to users. (Of course, once you understand the problem clearly, you can avoid the usages that trigger it.)

Specifically, suppose you want to call a Lua function from C++. luabind provides a function called call_function for this, which is easy to use. For more information, see Section 7.3.

Usually we make such calls directly from the host program, which means the call to the Lua function is not made in Lua's protected mode. luabind's implementation takes this into account: call_function uses lua_pcall rather than lua_call, which could throw.

The problems lie in fetching the function object, handling the arguments, and converting the return value into a C++ object.

Its implementation is very hard to follow, so we will only look at the obvious problems:

The body of call_function is implemented in luabind/detail/call_function.

If you supply a string to locate a global function, you can see the following at line 445:

lua_pushstring(L, name);
lua_gettable(L, LUA_GLOBALSINDEX);

return proxy_type(L, 1, &detail::pcall, args);

Here both lua_pushstring and lua_gettable may throw, but neither is protected by pcall (the pcall happens later).

Of course, if you ignore OOM errors and the possibility that the global table has an index metamethod that raises an error, this looks like a small problem.

P.S. Pushing the arguments onto the Lua stack before the pcall has a similar OOM problem, which we set aside for now.

Let's look at a more serious one:

The return value of call_function is converted into a C++ object, by the Ret class given as a template parameter, after the pcall returns.

We can see at line 3 that this happens after m_fun, that is, pcall, has been called. In other words, the conversion from the Lua value to the C++ object is not protected by pcall.

Why is this so risky? Because converting a Lua string to a C++ string actually calls lua_tostring (for details, see luabind/detail/policy. capp).

Besides OOM exceptions, such conversions have many other ways to fail. Any object in Lua can carry a __tostring metamethod, so a conversion that honors it (tostring in Lua, or luaL_tolstring in C; plain lua_tolstring itself converts only numbers) runs a piece of Lua code. This is very common in Lua programs.

The correct way to wrap this is: when calling a Lua function from C++, the passing of arguments and the receiving and converting of return values into host-language values should all happen inside a single lua_pcall, within which lua_call invokes the actual Lua function. That way, errors in the Lua code are caught correctly throughout the whole process.
