After finishing my virtual machine and compiler, I compared them against Lua 5. The results were quite frustrating: my virtual machine reaches only a little more than half of Lua's speed. Unconvinced, I went back and read part of the Lua 5 source code. I had read the Lua source before, both Lua 4 and Lua 5 at different times, so I already knew how much the two differ.
In fact, for simple programs my virtual machine has a speed advantage and is much faster than Lua. I credit that to coding technique. On design, however, I lose: Lua 5 is already a register-based virtual machine, while mine is still stack-based. I have improved it and moved it in the register-based direction, but it is still not as good as a pure register-based virtual machine.
Over the last few days I have thought carefully about how hard a register-based virtual machine would be to implement. I could keep working on it myself, but the current project schedule is very tight, so I have decided to set it aside for a while and come back to it later.
On my day off today, I read some of the Lua mailing list messages in my mailbox to see what has changed in Lua 5.1. Because I have done similar work recently, Lua's progress is easier for me to follow. It turns out that the improvements are concentrated on the GC, which I find quite reassuring. The part of my own work I am proudest of is the GC, which I believe is better than Lua's. Lua is currently working on a generational GC, and I am considering a parallel approach that should solve some of the usual GC headaches. Lua's GC, however, does not compact memory, so it does not perform particularly well in a memory-constrained environment. In addition, in pursuit of speed Lua has some trouble with deep recursion, which can fill up the stack. I once wrote a program that computes a number sequence recursively, with very deep recursion; Lua cannot handle it, while my virtual machine can. Of course, this is not a problem for ordinary applications.
After all, Lua has been under development for more than 10 years, and there is a great deal to learn from it. My own three-week-old toy is best regarded as hands-on practice. Today I carefully read the paper The Implementation of Lua 5.0. If you are interested in how scripting languages are implemented, I recommend reading it.
There are a few things to learn from this paper. The same things can be learned from the source code (I read the source first), but reading the paper is more relaxing.
Lua 5 is already a fully register-based virtual machine. It can be called the world's first widely used register-based scripting virtual machine. By comparison, both the Java JVM and .NET are stack-based. Perl 6 is said to be register-based as well. Of course, my language has the same plan. Register-based virtual machines are relatively hard to write, which makes this look like a real challenge.
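To make the difference concrete, here is a minimal sketch in C of how i = i + 1 might run on each kind of machine. The opcodes (S_LOAD, S_PUSHK, S_ADD, S_STORE, R_ADDK) and both interpreter loops are hypothetical, made up for illustration; they are neither Lua's instruction set nor mine. Each function returns how many instructions it dispatched.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical opcodes, for illustration only. */
enum { S_LOAD, S_PUSHK, S_ADD, S_STORE, S_HALT };  /* stack machine */
enum { R_ADDK, R_HALT };                           /* register machine */

/* i = i + 1 on the stack machine: four instructions. */
static const int stack_prog[] = { S_LOAD, 0, S_PUSHK, 1, S_ADD, S_STORE, 0, S_HALT };
/* i = i + 1 on the register machine: one instruction (dst, src, constant). */
static const int reg_prog[] = { R_ADDK, 0, 0, 1, R_HALT };

/* Stack VM: operands travel through an operand stack.
   Returns the number of instructions dispatched before S_HALT. */
static int run_stack(const int *code, int *locals) {
    int stack[16], sp = 0, dispatched = 0;
    for (size_t pc = 0;;) {
        switch (code[pc++]) {
        case S_LOAD:  stack[sp++] = locals[code[pc++]]; break;
        case S_PUSHK: stack[sp++] = code[pc++]; break;
        case S_ADD:   sp--; stack[sp - 1] += stack[sp]; break;
        case S_STORE: locals[code[pc++]] = stack[--sp]; break;
        case S_HALT:  return dispatched;
        default:      assert(!"unknown opcode"); return -1;
        }
        dispatched++;
    }
}

/* Register VM: the instruction names its operands directly. */
static int run_reg(const int *code, int *regs) {
    int dispatched = 0;
    for (size_t pc = 0;;) {
        switch (code[pc]) {
        case R_ADDK:
            regs[code[pc + 1]] = regs[code[pc + 2]] + code[pc + 3];
            pc += 4;
            break;
        case R_HALT:  return dispatched;
        default:      assert(!"unknown opcode"); return -1;
        }
        dispatched++;
    }
}
```

With a local that starts at 41, run_stack(stack_prog, locals) dispatches four instructions and run_reg(reg_prog, regs) dispatches one, and both leave 42 behind. Fewer trips through the dispatch loop is the main reason a register design tends to win on code like this.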
Lua 5 optimizes the table implementation, which I mentioned in a previous blog post. The paper describes this in detail as well.
Lua 5 adds an implementation of coroutines. I have thought about this too: since I changed my virtual machine to be independent of the system stack, it would not be hard to create more than one stack and run several threads. In Lua 5, each coroutine thread also has its own independent stack. (Actually there are two stacks.)
Lua also uses a one-pass compiler based on recursive descent. As the paper mentions, a one-pass compiler is hard to write, something I understand deeply. After a few days of writing my compiler my head was spinning, and for two days I did not dare commit a single line of code to the repository. The advantages of writing it this way are obvious, though. As the paper says: smaller, more efficient, more portable, and fully reentrant.
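For readers who have not written one, here is a minimal sketch in C of the one-pass idea: a recursive descent parser that emits code while it parses, without ever building a syntax tree. The grammar (numbers, +, *, parentheses), the Compiler struct and the textual PUSH/ADD/MUL output are all assumptions made for this sketch. Lua's real compiler emits register instructions and handles a full language; this one emits stack code only to stay short, and it reports syntax errors with assert for the same reason.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* All compiler state lives in this struct (no globals), so the
   compiler is reentrant. Hypothetical layout, for illustration. */
typedef struct {
    const char *src;   /* remaining input */
    char out[256];     /* emitted code, kept as text for readability */
    size_t len;        /* number of bytes already emitted */
} Compiler;

static void emit(Compiler *c, const char *text) {
    size_t n = strlen(text);
    assert(c->len + n < sizeof c->out);       /* output buffer must not overflow */
    memcpy(c->out + c->len, text, n + 1);
    c->len += n;
}

static void skip_spaces(Compiler *c) {
    while (*c->src == ' ') c->src++;
}

static void expr(Compiler *c);

/* primary := NUMBER | '(' expr ')' */
static void primary(Compiler *c) {
    skip_spaces(c);
    if (*c->src == '(') {
        c->src++;
        expr(c);
        skip_spaces(c);
        assert(*c->src == ')');               /* crude syntax error check */
        c->src++;
        return;
    }
    assert(isdigit((unsigned char)*c->src));  /* crude syntax error check */
    long value = 0;
    while (isdigit((unsigned char)*c->src))
        value = value * 10 + (*c->src++ - '0');
    char buf[32];
    snprintf(buf, sizeof buf, "PUSH %ld;", value);
    emit(c, buf);
}

/* term := primary ('*' primary)* */
static void term(Compiler *c) {
    primary(c);
    for (skip_spaces(c); *c->src == '*'; skip_spaces(c)) {
        c->src++;
        primary(c);
        emit(c, "MUL;");   /* code is emitted as soon as both operands are parsed */
    }
}

/* expr := term ('+' term)* */
static void expr(Compiler *c) {
    term(c);
    for (skip_spaces(c); *c->src == '+'; skip_spaces(c)) {
        c->src++;
        term(c);
        emit(c, "ADD;");
    }
}

/* Compiles one expression and returns the emitted code. */
static const char *compile(Compiler *c, const char *src) {
    c->src = src;
    c->len = 0;
    c->out[0] = '\0';
    expr(c);
    return c->out;
}
```

Calling compile(&c, "1 + 2 * (3 + 4)") yields PUSH 1;PUSH 2;PUSH 3;PUSH 4;ADD;MUL;ADD;. The hard part in a real one-pass compiler is everything this sketch avoids: jumps whose targets are not known yet, local variable scopes, and register allocation all have to be settled on the spot, because there is no second pass to fix them up.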
A few days ago, I was wondering whether the value representation in a weakly typed language could be made smaller than Lua's. Lua uses a separate word to describe the type and a union to store the different kinds of data. That is my method as well. A few days ago I tried another approach: because of alignment, the lowest two bits of a pointer are always 0 (with 8-byte alignment, the lowest three bits are all 0), so those bits can carry the type information. An integer can then be represented in 30 or 31 bits.
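A minimal sketch of that pointer-tagging trick in C, assuming two tag bits and only two kinds of value (pointer and small integer). The names Value, TAG_PTR, TAG_INT and the helper functions are invented for this sketch and are not taken from Lua or from my virtual machine.

```c
#include <assert.h>
#include <stdint.h>

/* A value is a single machine word (hypothetical representation). */
typedef uintptr_t Value;

#define TAG_MASK ((uintptr_t)3)     /* the two low bits hold the type */
enum { TAG_PTR = 0, TAG_INT = 1 };  /* assumed tag assignment */

/* Store a pointer: its low two bits are already 0 if it is 4-byte aligned. */
static Value from_ptr(void *p) {
    assert(((uintptr_t)p & TAG_MASK) == TAG_PTR);  /* relies on alignment */
    return (uintptr_t)p;
}

/* Store an integer in the upper bits (word size minus 2 bits of payload). */
static Value from_int(intptr_t i) {
    return ((uintptr_t)i << 2) | TAG_INT;
}

static int is_int(Value v) {
    return (v & TAG_MASK) == TAG_INT;
}

static void *to_ptr(Value v) {
    assert(!is_int(v));
    return (void *)v;
}

/* Clear the tag, then divide by 4. The division is exact, so this avoids
   right-shifting a negative number. Assumes two's complement conversion. */
static intptr_t to_int(Value v) {
    assert(is_int(v));
    return (intptr_t)(v - TAG_INT) / 4;
}
```

The whole value fits in one word instead of a type word plus a union. The price is that integers lose two bits of range (30 bits on a 32-bit machine) and every access has to test or strip the tag first.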
Reading the paper today, I found that our predecessors had already used this technique, for example in Smalltalk-80. The paper also explains why Lua does not do it. Of course, I can still try it in my own language.
As for the instruction set used by Lua 5, each instruction takes only four bytes, and the way this is done is quite clever. I had already learned this from reading the source code a few days ago, when I was trying to find out where my language was slow, using i = i + 1 in a loop as the test. A register-based virtual machine naturally runs that faster. In addition, the instructions of my virtual machine are 8 bytes each, while Lua's are only 4. Shorter code means less data to fetch, which also helps efficiency.
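As a rough illustration of how three operands fit into four bytes, here is a sketch in C. The field widths follow the paper (a 6-bit opcode, an 8-bit A field, and 9-bit B and C fields), but the bit positions, the function names and the opcode number used in the example are my own assumptions and may not match the real Lua source.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t Instruction;   /* one instruction = four bytes */

/* Field widths as described in the paper. */
enum { SIZE_OP = 6, SIZE_A = 8, SIZE_B = 9, SIZE_C = 9 };
/* Bit positions: an assumption for this sketch. */
enum { POS_OP = 0, POS_A = 6, POS_C = 14, POS_B = 23 };

/* Returns a mask with the low `size` bits set. */
static uint32_t mask(int size) {
    return (UINT32_C(1) << size) - 1;
}

/* Packs an opcode and three operands into one 32-bit word. */
static Instruction encode_abc(unsigned op, unsigned a, unsigned b, unsigned c) {
    assert(op <= mask(SIZE_OP) && a <= mask(SIZE_A));
    assert(b <= mask(SIZE_B) && c <= mask(SIZE_C));
    return ((Instruction)op << POS_OP) | ((Instruction)a << POS_A) |
           ((Instruction)b << POS_B) | ((Instruction)c << POS_C);
}

/* Extracts the field of `size` bits that starts at bit `pos`. */
static unsigned get_field(Instruction i, int pos, int size) {
    return (unsigned)((i >> pos) & mask(size));
}
```

For example, encode_abc(12, 3, 3, 257) packs a made-up opcode 12 with A = 3, B = 3 and C = 257 into a single word, and get_field(i, POS_C, SIZE_C) reads the 257 back. Decoding costs one shift and one mask per field, which is cheap compared with fetching an instruction twice as large.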
After reading this paper, I get the feeling that Lua looks down on Python's inefficiency in many places. That matches my impression from the Lua Round Table I attended at GDC 2004; a "faction struggle" between Lua and Python now seems to be developing. I think the Lua faction at GDC 2006 will be more confident. After all, even two years ago Lua 4 users were far outnumbered by Lua 5 users (when I spoke, I asked the Lua 5 users to raise their hands). This time, with the release of Lua 5.1, Lua's position in the game industry will be unshakable.