This is a research note, written mainly to get your feedback. Apart from this opening there is no filler and no attention paid to reader experience, so please do not complain about poor readability.
1. Put a colon before a name or a string to get a Symbol object. You can also use String#to_sym, Fixnum#to_sym (Ruby 1.8 only), and String#intern.
2. Symbols are generally used as hash keys in order to save memory and improve execution efficiency.
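As a minimal sketch of item 1, the snippet below obtains the same symbol in four ways and checks that they are one object. The variable names are illustrative only, and Fixnum#to_sym is left out because it is not available in current Ruby versions.

```ruby
# Four ways to obtain the same Symbol object.
literal  = :ruby
quoted   = :"ruby"        # colon in front of a string literal
from_sym = "ruby".to_sym
interned = "ruby".intern

# equal? compares object identity, not content.
all_same = [quoted, from_sym, interned].all? { |s| s.equal?(literal) }
puts all_same

# The quoted form also allows names that are not valid identifiers.
spaced = :"hello world"
puts spaced.inspect
```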
3. Why does this save memory? A String in Ruby is a mutable object, unlike strings in Java, C#, and Python. Note that it is also different from the copy-on-write (COW) basic_string<T> found in some C++ standard library implementations: every string in Ruby can be modified in place. This is probably why two strings with the same content in Ruby are actually two different objects.
a = "hello"
b = "hello"
Although the two strings have the same content, comparing a.object_id with b.object_id shows that a.object_id != b.object_id, so they do not refer to the same object. The result is like C compiled without string pooling. Whether immutable strings, mutable strings, or something like a clever COW scheme is best is a separate debate. However, Ruby is designed so that strings can be used as hash keys. For example, if you write:
h["Ruby"].name = "Ruby"
h["Ruby"].author = "Matz"
h["Ruby"].birth_year = 1995
The string "Ruby" is created three times at run time and consumes three times the memory, which is a serious waste. If the symbol :Ruby is used as the key instead, the Ruby runtime guarantees that only one Symbol object named Ruby exists for the whole run of the program, so there is no need to create three objects, and memory is saved.
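The difference between string identity and symbol identity described in item 3 can be checked with a short sketch. String.new is used instead of bare literals so that the result does not depend on the frozen_string_literal setting; that choice, and the variable names, are assumptions of this example.

```ruby
# Two strings with the same content are separate, mutable objects.
a = String.new("hello")
b = String.new("hello")

same_content = (a == b)        # compared by value
same_string  = a.equal?(b)     # compared by identity

# A symbol with a given name exists only once in the whole program.
same_symbol = :hello.equal?("hello".to_sym)

# Strings can be modified in place, and the other copy is unaffected.
a << " world"
b_unchanged = (b == "hello")

puts same_content, same_string, same_symbol, b_unchanged
```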
4. Why does execution efficiency improve? The obvious reason is that the string "Ruby" is not created multiple times at run time. Beyond that, a hash key should be constant, so Ruby's Hash has to protect String objects that serve as keys. Protection here means that the key string is frozen, because otherwise you could change its value. This protection has a cost, and a symbol needs no protection, so using symbols is naturally more efficient. Note that other mutable objects can also be used as hash keys without any such protection, which is an odd corner of Ruby's design. Run the following code in IRB and you will see a value in a Ruby hash seemingly get lost.
h = Hash.new
l = [1, 2]
h[l] = "a big object!"
l << 3         # the key can actually be changed!
h[l]           # ==> nil. That seems normal.
# However
h[[1, 2]]      # ==> nil, still not found
# Check the keys
h.keys         # ==> [[1, 2, 3]], it seems to still be in there
h[[1, 2, 3]]   # ==> nil
# However
h              # ==> {[1, 2, 3]=>"a big object!"}, so it is there, yet it cannot be found
h.rehash       # ==> after this, everything returns to normal
In this respect Python's design is easier to understand: a list is simply unhashable and cannot be used as a hash key at all.
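The protection of string keys mentioned in item 4 can be observed directly. The sketch below assumes an ordinary Hash (not one switched to compare_by_identity) and an unfrozen string key; the variable names are illustrative.

```ruby
key = String.new("Ruby")   # an unfrozen, mutable string
h = {}
h[key] = "Matz"

stored        = h.keys.first
copied        = !stored.equal?(key)  # Hash kept its own copy of the key
stored_frozen = stored.frozen?       # and that copy is frozen
key_frozen    = key.frozen?          # the original string is left alone

key << "!"                 # mutating the original does not disturb the hash
still_found = h["Ruby"]

# A symbol key needs no copy: the key in the hash is the symbol itself.
s = { ruby: "Matz" }
same_key = s.keys.first.equal?(:ruby)

puts copied, stored_frozen, key_frozen, still_found, same_key
```

The extra copy and freeze for string keys is the cost that item 4 refers to; symbol keys skip it.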
Back to efficiency. The third reason symbols are faster is that a symbol is essentially little more than an integer. In Ruby 1.8 you can use Symbol#to_i to obtain an integer that is unique within the whole program. A Hash can use this integer to generate the hash value, which is surely much faster than computing a hash value from the contents of a string. A small aside: since this integer is unique, generating a unique hash value from it is a piece of cake. And if hash values are guaranteed to be unique, the hash table effectively becomes an array: hash tables can have collisions, while arrays have none at all, so lookup is guaranteed O(1) and of course fast. I have not read the Ruby source code, so I do not know whether this is actually how it works.
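A rough way to probe the efficiency claim is to time lookups with a symbol key against lookups with a freshly created string key. This is only a sketch: the elapsed helper and the iteration count are assumptions of this example, the timings depend on the machine and Ruby version, and the result says nothing about how the interpreter implements symbol hashing internally.

```ruby
n = 200_000
sym_hash = { ruby: 1 }
str_hash = { "ruby" => 1 }

# Hypothetical helper: wall-clock seconds spent in the block.
def elapsed
  t0 = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  yield
  Process.clock_gettime(Process::CLOCK_MONOTONIC) - t0
end

sym_sum = 0
str_sum = 0

# The symbol is the same object every time; the string is created per lookup
# and its hash value is computed from its contents.
sym_time = elapsed { n.times { sym_sum += sym_hash[:ruby] } }
str_time = elapsed { n.times { str_sum += str_hash[String.new("ruby")] } }

puts format("symbol keys: %.4fs, string keys: %.4fs", sym_time, str_time)
```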
5. Why can the Ruby runtime guarantee that every symbol is unique? Because Ruby stores symbols in a symbol table that it maintains at run time. This symbol table is essentially an atom table: it stores all the names currently known at program level and ensures that no two entries with the same content exist. Almost every language and system has such a symbol table. In languages like C/C++, however, the symbol table exists only at compile time and not at run time, whereas Python and Ruby keep the table around at run time. With such a ready-made data structure available, why not use it?
6. However, this table does not only hold the symbols we create ourselves. It also holds every name in the current program that the Ruby interpreter collected during lexical and syntactic analysis; this is what the Ruby engine itself uses. Just by adding a colon, we make our objects neighbors of the objects used inside the Ruby engine. That is why the String#intern method is called intern (internalize).
In the .NET Framework, the String class also has an Intern method, which means the same thing. It is rendered as "resident" in Li Jianzhong's classic translation.
7. You can use Symbol.all_symbols to view all the symbols currently defined, and experience what it feels like to insert an object into the symbol table. It is pleasing to think that a program you write can do the same thing the Ruby engine does.
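A minimal sketch of inserting into the symbol table and finding the new entry next to the interpreter's own names. The name is assembled at run time so that it is not already in the table when the program is parsed; the name itself is made up for this example.

```ruby
# Build the name at run time so the parser has not seen it as a symbol.
name = "sym_table_demo_" + "xyz"

before = Symbol.all_symbols.any? { |s| s.to_s == name }  # expected false on a fresh run
sym    = name.to_sym                                     # inserts the symbol into the table
after  = Symbol.all_symbols.include?(sym)

# Names used by the interpreter itself live in the same table.
engine_name = Symbol.all_symbols.include?(:puts)

puts "before: #{before}, after: #{after}, engine name present: #{engine_name}"
puts "symbols in the table: #{Symbol.all_symbols.size}"
```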
8. Python does not need this, because its strings are immutable, so there seems to be little point in it. Is there a way to intern a string in Python? I have not found one yet. Does anyone who knows Python know?
Update: I have found that the function Python uses for this is intern() (sys.intern() in Python 3).
9. I think this Ruby design is a simplification of Perl's globs. In Perl you can write *a to get the glob corresponding to the symbol a, which is an octopus-like monster. Ruby also makes it easy to get at objects in the symbol table, but it did not design the symbol as an octopus.
10. There are still some small questions I have not figured out, such as the relationship between name and @name. attr_reader :name actually passes a symbol to the attr_reader method as its argument, and attr_reader then has to find the @name variable from this symbol. Is it as simple as '@' + :name.id2name? The source code would probably tell.
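The guess in item 10 can at least be reproduced in plain Ruby. The sketch below is not how the interpreter implements attr_reader; MyAttr, my_attr_reader, and the Language class are hypothetical names used only to show that building "@name" from the symbol is enough to get the same behavior.

```ruby
# Hypothetical stand-in for attr_reader, written in Ruby.
module MyAttr
  def my_attr_reader(*names)
    names.each do |name|
      ivar = "@" + name.id2name    # :name becomes "@name"
      # Define a method called `name` that reads the instance variable.
      define_method(name) { instance_variable_get(ivar) }
    end
  end
end

class Language
  extend MyAttr
  my_attr_reader :name, :author

  def initialize(name, author)
    @name = name
    @author = author
  end
end

ruby = Language.new("Ruby", "Matz")
puts ruby.name
puts ruby.author
```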