High-level languages such as C# and Haskell have a core library that is inseparable from the syntax. For example, C#'s int is System.Int32 in mscorlib.dll, and Haskell's (x:xs) comes from the Prelude. The ManagedX language of Vczh Library++ 3.0 also has something similar to mscorlib.dll. NativeX already had a core function library called System.CoreNative (syscrnat.assembly), so the one for ManagedX is named System.CoreManaged (syscrman.assembly). The predefined objects in System.CoreManaged are basic, indispensable types such as System.SInt32, System.IEnumerable<T>, and System.Reflection.Type. As of last night, my unfinished semantic analyzer is able to fully analyze the managed code in System.CoreManaged, so the type system in the symbol table is essentially complete. The experience gained along the way is the source of this article.
Nowadays, the type systems of high-level object-oriented languages are built from three kinds of types: object types, function types, and interface types. The modifiers include generics and late binding. In C#, for example, the object type is object; the function types are Func and Action, which are well supported by the .NET Framework but are not core types; and an interface type looks like IEnumerable. Everyone is familiar with generics, and late binding is something like the dynamic keyword. The var keyword is resolved at compile time, so it does not count. Java's int is a magic type, a design mistake that has seriously damaged the elegance of its class library, and its generics based on "type erasure" also sow trouble for future development, so this article will not discuss Java in detail. This article covers the three kinds of types and the two modifiers, and explains how they are converted into one another.
In C#, function types are also object types, but because C# can infer a complete function type from an incomplete one at compile time, I treat them separately from object types here. Haskell takes inference even further, and inference is an essential feature of advanced typed languages. Since conversion between types is the subject of this article, some definitions follow. They are not mathematically rigorous, and I am not aiming for that. Namespaces are not important here: whether a name is namespace-qualified only affects how the name is resolved to an object, not the derivation process.
Let T be a type without generic parameters. Because a type has member functions, it has a few basic attributes, called the this type and the base type (the corresponding C# keywords are this and base). The this type is the type that the member functions of T see as this. The base type is the type of the parent class. One clarification is needed: only object types have a base type, and the base type is the only one among all the parents that is not an interface type. Function types and interface types have a this type but no base type.
Therefore, for any type T declared as follows:
class T : U, I1, I2, I3 {}
we have:
this(T) = T
base(T) = U
Now let's look at the relationship between a generic type declaration T[U, V] and its instantiated type T<A, B>. A generic type declaration T[U, V] is really an incomplete type, because it still has two parameters, U and V, waiting to be filled in, as in the following code:
class T<U, V> {}
After instantiating it with U = A and V = B, the declaration becomes the instance type T<A, B>. This is like turning Dictionary[K, V] into Dictionary<int, string>. An instantiated type can be used as a type argument of another generic type, to declare symbols, or to create instances. Both the incomplete generic type T[U, V] and its instance type T<A, B> have the same two attributes: a this type and a base type. By the definition above, the this type is the type seen by the member functions of the type.
Therefore, for any type T[U, V] declared as follows:
class T<U, V> : W<U, V> {}
we have:
this(T[U, V]) = T<U, V>
base(T[U, V]) = W<U, V>
Likewise, T<A, B> has the this type T<A, B> and the base type W<A, B>. A non-generic declaration T can be treated as T[], and if we let T[] equal T<>, every rule for generic types also applies to a generic type with zero type parameters, that is, a non-generic type. The discussion below therefore does not distinguish between the two.
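The two attributes can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the names Type, Decl, this_type, and base_type are hypothetical and are not taken from the ManagedX symbol table. It shows that a non-generic declaration is simply the zero-parameter case.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Type:
    """A type expression: a name applied to zero or more type arguments."""
    name: str
    args: tuple = ()

    def __str__(self):
        if not self.args:
            return self.name
        return f"{self.name}<{', '.join(map(str, self.args))}>"

@dataclass(frozen=True)
class Decl:
    """A declaration T[U, V]: name, type parameter names, optional base class."""
    name: str
    params: tuple = ()
    base: Optional[Type] = None

def this_type(d):
    # this(T[U, V]) = T<U, V>: the declaration applied to its own parameters.
    return Type(d.name, tuple(Type(p) for p in d.params))

def base_type(d):
    # base(T[U, V]) is the single non-interface parent, written with U and V.
    return d.base

# class T<U, V> : W<U, V> {}
generic = Decl("T", ("U", "V"), Type("W", (Type("U"), Type("V"))))
# class T : U {}  (treated as T[] with zero type parameters)
plain = Decl("T", (), Type("U"))

print(this_type(generic), base_type(generic))  # T<U, V> W<U, V>
print(this_type(plain), base_type(plain))      # T U
```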
Now let's consider how to obtain the types of all members of a generic type. We consider the following set of types:
interface IEnumerable<T>
{
    IEnumerator<T> GetEnumerator();
}
class Base<T> : IEnumerable<T>
{
    public T Value { get; set; }
}
class Derived<T, U> : Base<U>
{
}
Consider one question: how do we know the return type of the GetEnumerator function of Derived<int, string>? At first glance it looks simple, and for a human it really is a question that intuition answers instantly. I have always admired how nature made humans so capable. Yet this problem troubled me for a long time, mainly because it gets harder once you are writing a semantic analyzer and must also arrange the operations on types, the structure of the symbol table, and other related issues.
I won't ramble on here, though. We only need to add a few attributes and operations to types, and then this problem can easily be written as a single expression.
First, we need a replace operation. It is hard to define rigorously in one go, but an intuitive definition can be given:
replace(Derived<T, U>, {T => int, U => string}) = Derived<int, string>
I believe this is easy to understand. Likewise, for a type mapping tm = {T => string}, the result of replace(Derived<IEnumerable<T>>, tm) is Derived<IEnumerable<string>>.
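The intuitive definition can be made concrete as a recursive substitution. The following is a minimal sketch in Python, assuming a hypothetical Type record in which a type parameter is just a name with no arguments; it is not the C++ code used by ManagedX.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Type:
    """A type expression: a name applied to zero or more type arguments."""
    name: str
    args: tuple = ()

def replace(t, mapping):
    """Substitute type parameters in t using mapping (parameter name -> Type)."""
    if not t.args and t.name in mapping:
        # A bare name found in the mapping is a type parameter: swap it out.
        return mapping[t.name]
    # Otherwise keep the name and substitute inside every type argument.
    return Type(t.name, tuple(replace(a, mapping) for a in t.args))

T, U = Type("T"), Type("U")
INT, STRING = Type("int"), Type("string")

# replace(Derived<T, U>, {T => int, U => string}) = Derived<int, string>
derived = replace(Type("Derived", (T, U)), {"T": INT, "U": STRING})

# Substitution reaches nested arguments as well.
nested = replace(Type("Base", (Type("IEnumerable", (T,)),)), {"T": STRING})
```

All parameters are substituted in one pass, so a replacement that happens to share a name with another parameter is not substituted again.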
Next, we need a decl operation, which returns the generic declaration of a generic instance type:
decl(T<A, B>) = T[U, V]
Then we need a params operation. It compares a generic instance type with its generic declaration and extracts the type mapping that replace needs in order to turn the declaration into the instance type:
params(T<A, B>) = {U => A, V => B}
The following rule therefore holds in general. Whenever T is a generic instance type, we always have:
replace(this(decl(T)), params(T)) = T
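The rule can be checked mechanically. The sketch below is written in Python under my own assumptions: DECLS is a hypothetical declaration table keyed by type name, and Type, Decl, and the helper functions are illustrative names rather than the actual symbol table API.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Type:
    name: str
    args: tuple = ()

@dataclass(frozen=True)
class Decl:
    name: str
    params: tuple  # names of the type parameters, e.g. ("K", "V")

# Hypothetical declaration table: Dictionary[K, V] and a non-generic Point[].
DECLS = {d.name: d for d in (Decl("Dictionary", ("K", "V")), Decl("Point", ()))}

def decl(t):
    # decl(T<A, B>) = T[U, V]
    return DECLS[t.name]

def this_type(d):
    # this(T[U, V]) = T<U, V>
    return Type(d.name, tuple(Type(p) for p in d.params))

def params(t):
    # params(T<A, B>) = {U => A, V => B}: pair parameters with arguments.
    return dict(zip(decl(t).params, t.args))

def replace(t, mapping):
    if not t.args and t.name in mapping:
        return mapping[t.name]
    return Type(t.name, tuple(replace(a, mapping) for a in t.args))

inst = Type("Dictionary", (Type("int"), Type("string")))
point = Type("Point")

# replace(this(decl(T)), params(T)) = T, including the zero-parameter case.
assert replace(this_type(decl(inst)), params(inst)) == inst
assert replace(this_type(decl(point)), params(point)) == point
```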
Now we can begin to answer the question mentioned above.
First, for the type Derived<int, string>, we need to find its parent class. We can proceed in the following steps:
tm = params(Derived<int, string>) = {T => int, U => string}
tb = base(decl(Derived<int, string>)) = base(Derived[T, U]) = Base<U>
result = replace(tb, tm) = replace(Base<U>, {T => int, U => string}) = Base<string>
In this way, for T = Derived<int, string>, we obtain the parent class B = replace(base(decl(T)), params(T)) = Base<string>.
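The three steps translate directly into code. This is a small Python sketch with hypothetical names (Type, Decl, DECLS); it models only the Base and Derived declarations from the example and leaves interfaces out.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Type:
    name: str
    args: tuple = ()

@dataclass(frozen=True)
class Decl:
    params: tuple                # names of the type parameters
    base: Optional[Type] = None  # base class, written with those parameters

T, U = Type("T"), Type("U")
INT, STRING = Type("int"), Type("string")

# class Base<T> {...}   class Derived<T, U> : Base<U> {}
DECLS = {
    "Base": Decl(("T",)),
    "Derived": Decl(("T", "U"), Type("Base", (U,))),
}

def replace(t, mapping):
    if not t.args and t.name in mapping:
        return mapping[t.name]
    return Type(t.name, tuple(replace(a, mapping) for a in t.args))

def params(t):
    return dict(zip(DECLS[t.name].params, t.args))

def base(t):
    # base of an instance type = replace(base(decl(T)), params(T))
    b = DECLS[t.name].base
    return None if b is None else replace(b, params(t))

derived = Type("Derived", (INT, STRING))
tm = params(derived)          # {T => int, U => string}
tb = DECLS["Derived"].base    # Base<U>
result = replace(tb, tm)      # Base<string>
assert result == base(derived) == Type("Base", (STRING,))
```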
Second, knowing that Base[T] inherits the interface IEnumerable<T>, we compute the interface inherited by the type Base<string>. We can use:
tm = params(Base<string>) = {T => string}
result = replace(IEnumerable<T>, tm) = IEnumerable<string>
Therefore, for an interface Id inherited by a generic declaration decl(T), the corresponding interface It of the instance type T is equal to replace(Id, params(T)).
Finally, for the return type IEnumerator<T> of the GetEnumerator function of IEnumerable[T], the astute reader will already have guessed that the corresponding type for IEnumerable<string> is replace(IEnumerator<T>, params(IEnumerable<string>)) = IEnumerator<string>. This is the same method that was used to compute the interface inherited by an instance type.
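Putting the pieces together, the original question can be answered by walking up the parents and applying replace(A, params(B)) at every step. The Python sketch below rests on several assumptions of mine: DECLS is a hypothetical table, a member function is modelled only by its return type, and member_type is an illustrative helper, not a function of the ManagedX analyzer.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Type:
    name: str
    args: tuple = ()

T, U = Type("T"), Type("U")
INT, STRING = Type("int"), Type("string")

# Each declaration lists its type parameters, its parents (base class and
# interfaces), and its members, all written in terms of its own parameters.
DECLS = {
    "IEnumerable": {"params": ("T",), "parents": (),
                    "members": {"GetEnumerator": Type("IEnumerator", (T,))}},
    "Base": {"params": ("T",), "parents": (Type("IEnumerable", (T,)),),
             "members": {"Value": T}},
    "Derived": {"params": ("T", "U"), "parents": (Type("Base", (U,)),),
                "members": {}},
}

def replace(t, mapping):
    if not t.args and t.name in mapping:
        return mapping[t.name]
    return Type(t.name, tuple(replace(a, mapping) for a in t.args))

def replace_by_type(a, b):
    # replace(a, params(b)): rewrite a using the type arguments of instance b.
    return replace(a, dict(zip(DECLS[b.name]["params"], b.args)))

def member_type(t, name):
    """Type of member `name` as seen from instance type t, or None."""
    d = DECLS[t.name]
    if name in d["members"]:
        return replace_by_type(d["members"][name], t)
    for parent in d["parents"]:
        found = member_type(replace_by_type(parent, t), name)
        if found is not None:
            return found
    return None

derived = Type("Derived", (INT, STRING))
# Derived<int, string> -> Base<string> -> IEnumerable<string> -> IEnumerator<string>
assert member_type(derived, "GetEnumerator") == Type("IEnumerator", (STRING,))
assert member_type(derived, "Value") == STRING
```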
We can see that, when computing the member types of a generic instance type, we are constantly computing replace(A, params(B)). Therefore, the symbol table code in the semantic analyzer for ManagedX, the generic object-oriented managed language of Vczh Library++ 3.0, actually implements six functions in C++: this, base, decl, params, replace, and replace_by_type = replace(A, params(B)). In C++, a type instance can only be represented as a pointer to an object with a complex structure. So if the symbol table stores and indexes every type produced during computation, and thereby guarantees that whenever types A and B are the same type their pointers P(A) and P(B) are equal, the type system gets an immediate speed boost.
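The pointer guarantee can be illustrated with a small interning table. This is a sketch in Python rather than C++, and the class names SymbolTable and Type are my own assumptions; object identity (the is operator) stands in for pointer equality.

```python
class Type:
    """A type node. Equality and hashing are by identity, like a raw pointer."""
    __slots__ = ("name", "args")

    def __init__(self, name, args):
        self.name = name
        self.args = args

class SymbolTable:
    """Hands out exactly one Type object per distinct type."""

    def __init__(self):
        self._types = {}  # index: (name, canonical argument objects) -> Type

    def get(self, name, *args):
        # The arguments are already canonical, so the key compares them by
        # identity and the lookup never has to walk their structure.
        key = (name, args)
        t = self._types.get(key)
        if t is None:
            t = Type(name, args)
            self._types[key] = t
        return t

    def replace(self, t, mapping):
        if not t.args and t.name in mapping:
            return mapping[t.name]
        new_args = [self.replace(a, mapping) for a in t.args]
        return self.get(t.name, *new_args)

table = SymbolTable()
string = table.get("string")
direct = table.get("Base", string)
computed = table.replace(table.get("Base", table.get("U")), {"U": string})

# The same type reached by two routes is the same object, so comparing two
# types is a single identity check instead of a structural comparison.
assert direct is computed
```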
As for the deduction rules for function types (mainly used by the shorthand syntax of lambda expressions), I will write follow-up articles once I have developed them. Luckily, System.CoreManaged does not use lambda expressions, which let me reach my first milestone ahead of schedule.
Author "λ-calculus"