Strings in Ruby are mutable, unlike strings in Java and C#, which are immutable.
In Java, the JVM maintains a table of string literals, so if str1 and str2 are assigned the same literal, they refer to the same string object. In Ruby, str1 and str2 would be two completely different objects. Similarly, an operation on a string object in Java produces a new object, while Ruby can modify the same object in place, for example:
str = "ABC"
str.concat("CDF")
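Both claims about Ruby can be checked from plain Ruby. The sketch below is not part of the original article. It assumes string literals are not frozen (no frozen_string_literal magic comment at the top of the file), and the names same_content, same_object, id_before and id_after are introduced only for illustration. It finishes with the same str as the snippet above.

```ruby
# Two literals with the same content produce two distinct objects.
str1 = "abc"
str2 = "abc"
same_content = (str1 == str2)     # == compares the characters
same_object  = str1.equal?(str2)  # equal? compares object identity

# concat modifies the receiver in place, so the object stays the same.
str = "ABC"
id_before = str.object_id
str.concat("CDF")
id_after = str.object_id

puts "same content: #{same_content}, same object: #{same_object}"
puts "identity kept after concat: #{id_before == id_after}"
```

Here == compares contents while equal? compares identity, which is the distinction the comparison with Java is about.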
At this point str is "ABCCDF". How does Ruby implement String? We only discuss the C implementation of Ruby here; interested readers may first look at the article "a glimpse of ruby--object base". In ruby.h we can see the structure of a string object. Every object in Ruby (classes included, since classes are objects too) is represented by a struct, and String is no exception:
struct RString {
    struct RBasic basic;
    long len;
    char *ptr;
    union {
        long capa;
        VALUE shared;
    } aux;
};
// ruby.h
Obviously, len is the length of the string, and ptr is a char pointer to the actual character data. Then comes a union, which we will discuss later. If you look at ruby.h, you can see that almost all of the object structures defined there contain a struct RBasic. Clearly, struct RBasic holds important information shared by all object structures. Here is RBasic:
struct RBasic {
    unsigned long flags;
    VALUE klass;
};
flags is a multi-purpose field, mostly used to record the type of the struct. ruby.h defines a series of macros for this, such as T_STRING (for struct RString), T_ARRAY (for struct RArray), and so on. klass is of type VALUE, and VALUE is itself an unsigned long that can be used as a pointer (a pointer is 4 bytes on a 32-bit platform, so an unsigned long is more than enough). It points to a Ruby object, namely the object's class; we will not go deeper into that here.
So what are capa and shared in the union aux? Because a Ruby string is mutable, len can change. Growing or shrinking the memory on every change of len (using the C realloc() function) would obviously be a big overhead. The solution is to reserve some space: ptr points to a memory block slightly larger than len requires, so realloc() does not have to be called often, and aux.capa records the size of that block, including the extra memory. And what is aux.shared for? It is of type VALUE, which means it points to an object. aux.shared is used to speed up the creation of strings, for example in a loop:
Ruby Code
while true do        # repeat forever
  a = "str"          # create a string with the content "str" and assign it to a
  a.concat("ing")    # append "ing" to the object that a refers to
  p(a)               # displays "string"
end
Each pass through the loop creates a new "str" object. Creating a fresh char[] for it every time would be quite wasteful, so aux.shared is used to share the char[]: strings created from the same literal share one char[], and when one of them is about to be modified, the string data is copied into unshared memory and the change is applied to this new copy. This is the "copy-on-write" technique.
That explains the internal structure of a string, but it does not yet show how String is mutable. To see this, we write a small Ruby extension as a test. We want the equivalent of this Ruby class:
Ruby Code
class Test
  def test
    str = "str"
    str.concat("ing")
  end
end
The corresponding C language code is:
CPP Code
#include <stdio.h>
#include "ruby.h"

static VALUE t_test(VALUE self)
{
    VALUE str;
    /* create a new string object with the content "str" */
    str = rb_str_new2("str");
    printf("before concat:str:%p, str.aux.shared:%p, str.ptr:%s\n",
           str, (RSTRING(str)->aux).shared, RSTRING(str)->ptr);
    /* append "ing" to the same object */
    rb_str_cat2(str, "ing");
    printf("after concat:str:%p, str.aux.shared:%p, str.ptr:%s\n",
           str, (RSTRING(str)->aux).shared, RSTRING(str)->ptr);
    return self;
}

VALUE cTest;

void Init_string_hack()
{
    cTest = rb_define_class("Test", rb_cObject);
    rb_define_method(cTest, "test", t_test, 0);
}
// string_hack.c
The rb_define_class function defines a class Test, and rb_define_method adds the t_test function to the Test class as a method named test. In t_test, a new RString structure is created by rb_str_new2 on each call, then "ing" is appended to str with rb_str_cat2, and some printing is added for tracing. We use mkmf to generate the Makefile, so we write an extconf.rb:
Ruby Code
require 'mkmf'
create_makefile("string_hack")
Running ruby extconf.rb produces a Makefile, and running make then builds the string_hack.so shared library. The extension is finished, and we call it from Ruby:
Ruby Code
require 'string_hack'
t = Test.new
(1..3).each { |i| t.test }
Output:
before concat:str:0x40098a40, str.aux.shared:0x3, str.ptr:str
after concat:str:0x40098a40, str.aux.shared:0x8, str.ptr:string
before concat:str:0x40098a2c, str.aux.shared:0x3, str.ptr:str
after concat:str:0x40098a2c, str.aux.shared:0x8, str.ptr:string
before concat:str:0x40098a18, str.aux.shared:0x3, str.ptr:str
after concat:str:0x40098a18, str.aux.shared:0x8, str.ptr:string
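As the results show, after the concat the address of str has not changed; only the string that ptr points to in str has changed. The value printed as aux.shared also changes from 0x3 to 0x8: aux is a union, and since this string is not shared, the field being read here is aux.capa, the reserved capacity. A look at the implementation of rb_str_cat2 makes this clear: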
CPP Code
VALUE
rb_str_cat(str, ptr, len)
    VALUE str;
    const char *ptr;
    long len;
{
    if (len < 0) {
        rb_raise(rb_eArgError, "negative string size (or size too big)");
    }
    if (FL_TEST(str, STR_ASSOC)) {
        rb_str_modify(str);
        REALLOC_N(RSTRING(str)->ptr, char, RSTRING(str)->len + len);
        memcpy(RSTRING(str)->ptr + RSTRING(str)->len, ptr, len);
        RSTRING(str)->len += len;
        RSTRING(str)->ptr[RSTRING(str)->len] = '\0'; /* sentinel */
        return str;
    }
    return rb_str_buf_cat(str, ptr, len);
}

VALUE
rb_str_cat2(str, ptr)
    VALUE str;
    const char *ptr;
{
    return rb_str_cat(str, ptr, strlen(ptr));
}
// string.c
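The common case above is delegated to rb_str_buf_cat, whose source is not shown here. The two ideas described earlier (a shared buffer that is copied on the first write, and reserved capacity so that not every append reallocates) can be modeled in a few lines of Ruby. This is only a toy sketch under stated assumptions: ToyString and all of its fields are hypothetical names, an ordinary Ruby String stands in for the char[], and the growth rule (capa + 1) * 2 is an assumption chosen for illustration, not a statement about what string.c does.

```ruby
# Toy model of the ideas behind struct RString (hypothetical, for illustration).
class ToyString
  attr_reader :len, :capa, :reallocs

  # buffer: a frozen String standing in for a shared char[]
  def initialize(buffer)
    @buf = buffer          # plays the role of ptr
    @len = buffer.length   # plays the role of len
    @capa = @len           # plays the role of aux.capa
    @shared = true         # plays the role of aux.shared being in use
    @reallocs = 0          # counts simulated realloc() calls
  end

  def shared?
    @shared
  end

  def concat(extra)
    unshare if @shared     # copy-on-write: copy before the first change
    total = @len + extra.length
    if total > @capa
      # Assumed growth rule: reserve more than is needed right now.
      @capa = (@capa + 1) * 2 while total > @capa
      @reallocs += 1
    end
    @buf << extra
    @len = total
    self
  end

  def to_s
    @buf.dup
  end

  private

  def unshare
    @buf = @buf.dup        # a dup of a frozen String is not frozen
    @shared = false
  end
end

LITERAL = "str".freeze
a = ToyString.new(LITERAL)
b = ToyString.new(LITERAL)
a.concat("ing")
a.concat("s")              # fits in the reserved capacity, no second realloc

puts "a=#{a} b=#{b} literal=#{LITERAL}"
puts "a.capa=#{a.capa} a.reallocs=#{a.reallocs} b.shared?=#{b.shared?}"
```

Only a is copied and grown; b and the literal keep sharing the original buffer, and the second append to a fits in the capacity reserved by the first.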