Anders Hejlsberg talks about generics in C#, Java, and C++


Original authors: Bill Venners, Bruce Eckel, 2004.2.26
Original: http://www.artima.com/intv/generics.html
Translation: Lover_p
Source: http://www.cstc.net.cn/docs/docs.php?id=258

[Profile]

Anders Hejlsberg, a well-known Microsoft engineer, led the team that designed the C# (pronounced "C sharp") programming language. Hejlsberg first stepped onto the software stage in the early 1980s, when he designed a Pascal compiler for MS-DOS and CP/M. Borland, then a small company, quickly hired him and bought his compiler, which became Turbo Pascal. At Borland, Hejlsberg continued to develop Turbo Pascal and eventually led the team that designed its successor, Delphi. In 1996, after 13 years at Borland, Hejlsberg joined Microsoft. Initially he worked as an architect for Visual J++ and the Windows Foundation Classes (WFC). Subsequently, Hejlsberg became the chief designer of C# and a key participant in the .NET Framework. Currently, Anders Hejlsberg still leads the continued development of the C# programming language.

Bruce Eckel, author of Thinking in C++ and Thinking in Java.

Bill Venners, the editor of Artima.com.

[Contents]

    • Generics overview
    • Generics in C#
    • C# generics compared with Java generics
    • C# generics compared with C++ templates
    • Constraints in C# generics

Generics overview

Bruce Eckel: Can you give a quick introduction to generics?

Anders Hejlsberg: Generics are essentially the ability to have type parameters on your types. They are also known as parameterized types or parametric polymorphism. The classic example is a List collection class. A List is a conveniently growable array. It has a sort method, you can index into it, and so on. Without parameterized types, neither arrays nor Lists are quite satisfactory. If you use an array, you get strong typing, because you can declare an array of Customer, but you lose the growability and the convenience methods. If you use a List, you get all the conveniences, but you lose the strong typing. You cannot say what a List is a list of; it is just a list of Object [Translator's note: "what it is a list of" refers to the type of the elements stored in the list]. That causes trouble, because the types can be checked only at run time, not at compile time. Even if you put a Customer in a List and then try to get a String out of it, the compiler is perfectly happy. You do not find out that it does not work until you run it. Also, when you put primitive types [Translator's note: value types] in a List, you have to box them. Because of these problems you are forever torn between Lists and arrays, and you often face a painful decision about which one to use.

The great thing about generics is that you can now have your cake and eat it too, because you can define a List<T> (read "List of T") [Translator's note: in Chinese this can be rendered as "list of type T"]. When you use a List, you can say what it is a list of, and you get strong typing: the compiler checks the types for you. That is only the immediate benefit; there are many others. And of course you want this not only for List but also for Hashtable, or Dictionary (a data structure that maps keys to values), or whatever you want to call it. You may want to map a String to a Customer, or an int to an Order, and in those cases you get strong typing as well.
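The trade-off described above can be sketched in a few lines. The sketch uses Java, one of the languages this interview covers, as a stand-in; the `Customer` class and the method names are hypothetical and chosen only for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical domain type used only for illustration.
class Customer {
    final String name;
    Customer(String name) { this.name = name; }
}

class TypedListDemo {
    // Untyped list: any Object goes in, so a wrong read fails only at run time.
    @SuppressWarnings({"rawtypes", "unchecked"})
    static boolean untypedReadFails() {
        List objects = new ArrayList();          // raw type: "just a list of Object"
        objects.add(new Customer("Ada"));
        try {
            String s = (String) objects.get(0);  // compiles, but throws when run
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    // Typed list and map: the compiler knows the element types, no casts needed.
    static String typedRead() {
        List<Customer> customers = new ArrayList<>();
        customers.add(new Customer("Ada"));
        // customers.add("oops");                // would be rejected at compile time
        Map<String, Customer> byName = new HashMap<>();
        byName.put("Ada", customers.get(0));
        return byName.get("Ada").name;
    }

    public static void main(String[] args) {
        System.out.println(untypedReadFails());  // prints "true"
        System.out.println(typedRead());         // prints "Ada"
    }
}
```

With the untyped list, the mistake surfaces only as a `ClassCastException` when the program runs; with the typed list, the equivalent mistake does not compile.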

Generics in C#

Bill Venners: How do generics work in C#?

Anders Hejlsberg: In C# without generics, you can only write class List {...}. In C# with generics, you can write class List<T> {...}, where T is a type parameter. Within List<T>, you can use T as if it were a type. When it comes time to actually create a List object, you write List<int> or List<Customer>. You construct a new type from List<T>, and it is as if your type arguments were substituted for every type parameter: each T becomes an int or a Customer. You do not have to downcast. Everything is strongly typed and checked throughout.

When you compile List<T>, or any other generic type, for the CLR (Common Language Runtime), it is compiled to IL (Intermediate Language) and metadata just like any normal type. The IL and metadata contain additional information indicating that there is a type parameter, but in principle a generic type compiles the same way as any other type. At run time, when your application makes its first reference to List<int>, the system checks whether List<int> has already been requested. If not, it hands the IL and metadata for List<T>, together with the type argument int, to the JIT compiler. As the JIT compiles the IL, it also substitutes the type parameter.

Bruce Eckel: So it is instantiated at runtime.

Anders Hejlsberg: It is indeed instantiated at run time. It produces specific native code only when needed. And literally, when you say List<int>, you get a List of int. If the generic type uses an array of T, that becomes an array of int.

Bruce Eckel: Will this class be collected by the garbage collector at some point?

Anders Hejlsberg: Yes and no. That is an orthogonal issue. It creates a class in the application domain, and that class stays there for as long as the application domain exists. If you unload the application domain, the class goes away, just like any other class.

Bruce Eckel: But if I declare a List<int> and a List<Cat> in my program, and never use the List<Cat>...

Anders Hejlsberg: ...then the system will not instantiate List<Cat>. There are exceptions, though. If you use NGEN to create an image, that is, if you generate a native-code image ahead of time, the instantiations are created up front. But if you run under normal circumstances, instantiation is purely demand driven and deferred as late as possible [Translator's note: nothing is instantiated until it is used].

What we actually do is this: for every instantiation over a value type, such as List<int>, List<long>, List<double>, and List<float>, we create a unique copy of the executable native code. So List<int> gets its own code, List<long> gets its own code, and List<float> gets its own code. For all reference types we share the code, because they are representationally identical: they are all just pointers.

Bruce Eckel: So you only need to cast.

Anders Hejlsberg: No, actually you do not. We can share the native image, but the instantiations really have separate vtables. My point is only that we share code wherever it makes sense to share it, while being well aware that a lot of code should not be shared, for the sake of efficiency. The typical case is value types. You really do care that a List<int> holds ints. You certainly do not want them boxed as Objects. Boxing value types is one way to share code, but it is a very expensive way.

Bill Venners: For reference types, then, only the classes differ. List<Elephant> is different from List<Orangutan>, but they actually share the code for all their methods.

Anders Hejlsberg: Yes. As an implementation detail, they actually share the same native code.

C# generics compared with Java generics

Bruce Eckel: How do generics in C# compare with generics in Java?

Anders Hejlsberg: Java's generics were originally based on a project called Pizza, developed by Martin Odersky and others. Pizza was renamed GJ, which then became a JSR and ended up being adopted into the Java language. A key design goal of this generics proposal was that it run on an unmodified virtual machine (VM). It is of course great that you do not have to modify your VM, but it also imposes many restrictions. The restrictions are not immediately apparent, but before long you say, "Hmm, that is a little strange."

For example, with Java generics you do not actually get any of the execution efficiency, because when you compile a generic class in Java, the compiler removes the type parameters and substitutes Object everywhere. So if you try to create a List<int>, every int value has to be boxed, which adds a lot of overhead. In addition, to keep the VM happy, the compiler has to insert casts everywhere. If a List holds Objects and you want to treat those Objects as Customers, at some point each Object has to be cast to Customer to satisfy the type checker. All the implementation does is insert those casts for you. So you get the syntactic sugar, but none of the execution efficiency. That, I think, is problem number one with the Java implementation.
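The compiler-inserted casts mentioned above can be made visible with a small experiment. This is a minimal sketch with hypothetical class and method names: a String is slipped into a List<Integer> through a raw-typed alias, which succeeds because the list holds plain Object references at run time, and the failure appears only at the hidden cast on the way out.

```java
import java.util.ArrayList;
import java.util.List;

class ErasureCastDemo {
    // Sneaks a String into a List<Integer> through a raw-typed alias.
    // The compiler only warns; at run time the list stores plain Object references.
    @SuppressWarnings({"rawtypes", "unchecked"})
    static void pollute(List<Integer> numbers) {
        List raw = numbers;
        raw.add("not a number");
    }

    static String readBack() {
        List<Integer> numbers = new ArrayList<>();
        numbers.add(42);                  // the int 42 is boxed into an Integer object
        pollute(numbers);                 // succeeds: no element type check at run time
        try {
            Integer bad = numbers.get(1); // the compiler-inserted cast to Integer fails here
            return "no error";
        } catch (ClassCastException e) {
            return "ClassCastException";
        }
    }

    public static void main(String[] args) {
        System.out.println(readBack());   // prints "ClassCastException"
    }
}
```

The source line `numbers.get(1)` contains no visible cast; the exception shows that the compiler added one.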

Problem number two, which I think is also quite serious, is that because Java's generics are implemented by erasing all type parameters, you do not get the same faithful representation at run time that you have at compile time. When you reflect on a generic List in Java, you cannot tell what it is a List of. It is just a List. Because the type information is lost, any dynamic code generation scheme or reflection-based scheme simply does not work. And the one trend that is quite clear to me is that there will be more and more of those schemes, and they will not work, because the type information is gone. In our implementation, by contrast, all of that information is available. You can use reflection to get the System.Type for a List<T> object. You cannot create an instance of it yet, because you do not know what T is. But you can then use reflection to get the System.Type for int, combine the two System.Types to create a List<int>, and get back another System.Type for List<int>. So anything you can do at compile time you can also do at run time.
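The Java half of this point can be checked directly. In the following minimal sketch (class and method names are illustrative), two lists with different type arguments report the same runtime class, so reflection has nothing to distinguish them by.

```java
import java.util.ArrayList;
import java.util.List;

class ErasureReflectionDemo {
    // Both lists report the same runtime class: the type argument is erased.
    static boolean sameRuntimeClass() {
        List<String> strings = new ArrayList<>();
        List<Integer> numbers = new ArrayList<>();
        return strings.getClass() == numbers.getClass();
    }

    // The runtime class name carries no element type.
    static String runtimeTypeName() {
        return new ArrayList<String>().getClass().getName();
    }

    public static void main(String[] args) {
        System.out.println(sameRuntimeClass()); // prints "true"
        System.out.println(runtimeTypeName());  // prints "java.util.ArrayList"
    }
}
```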

C# generics compared with C++ templates

Bruce Eckel: How do C# generics compare with C++ templates?

Anders Hejlsberg: I think the best way to describe the difference between C# generics and C++ templates is this: C# generics are really like classes, except that they have a type parameter; C++ templates are really like macros, except that they look like classes.

The biggest difference between C# generics and C++ templates lies in when the type checking occurs and how the instantiation occurs. First, C# instantiates at run time. C++ instantiates at compile time or at link time; either way, the instantiation happens before the program runs. That is the first difference. The second difference is that C# does strong type checking when you compile the generic type. For an unconstrained type parameter, as in List<T>, the only methods available on values of type T are those found on type Object, because those are the only methods that are guaranteed to exist. In C#, we guarantee that every operation performed on a type parameter will succeed.

C++ is the opposite. In C++, you can do anything you like with a variable whose type is a type parameter. But once you instantiate the template, it may not work, and you get some cryptic error messages. For example, if you have a type parameter T and variables x and y of type T, and you write x + y, you had better have an operator+ defined for two Ts; otherwise you get a cryptic error message. So in a sense, C++ templates are actually untyped, or weakly typed, whereas C# generics are strongly typed.

Constraints in C# generics

Bruce Eckel: How do constraints work in C# generics?

Anders Hejlsberg: In C# generics, we can apply constraints to type parameters. Taking our List<T> as an example, you can write class List<T> where T : IComparable. That means T must implement the IComparable interface.

Bruce Eckel: Interesting. In C++, the constraints are implicit.

Anders Hejlsberg: Yes. And you can do it implicitly in C# too. Say we have a Dictionary<K,V> with an Add method that takes a K key and a V value. The implementation of Add will want to compare the key passed in with the keys already in the dictionary, and it will want to do so through an interface called IComparable. One way to do that is to cast the key parameter to IComparable and then call CompareTo. Of course, when you do that, you have created an implicit constraint on the type K and the key parameter. If the key passed in does not implement IComparable, you get a run-time error. Yet nothing in any of your methods or in your contract says that keys must implement IComparable. And you also pay for the run-time type check, because you are in fact performing a dynamic cast.

With constraints, you can move that dynamic check out of your code and have it performed at compile time or load time. Several things happen when you require K to implement IComparable. On any value of type K, you can now access the interface methods directly, without a cast, because the program semantically guarantees that the type implements the interface. And whenever you try to create an instantiation of the generic type, the compiler checks that the type arguments you supply implement the interface. If they do not, you get a compile-time error. And if you are doing it through reflection, you get an exception.
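Java's bounded type parameters play a role similar to the constraints described here, so the contrast between an implicit and an explicit constraint can be sketched in Java as a stand-in. The method names are hypothetical, and this is not the Dictionary<K,V> implementation; it only isolates the comparison step.

```java
class ConstraintDemo {
    // Implicit constraint: nothing in the signature says K must be comparable.
    @SuppressWarnings({"rawtypes", "unchecked"})
    static <K> int compareByCast(K a, K b) {
        return ((Comparable) a).compareTo(b);  // dynamic cast; may throw ClassCastException
    }

    // Explicit constraint: the bound is part of the signature and checked at compile time.
    static <K extends Comparable<K>> int compareByBound(K a, K b) {
        return a.compareTo(b);                 // no cast needed
    }

    // A key type that is not comparable slips past the compiler and fails at run time.
    static boolean castFailsForPlainObject() {
        try {
            compareByCast(new Object(), new Object());
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(compareByCast("a", "b") < 0);   // prints "true"
        System.out.println(compareByBound(1, 2) < 0);      // prints "true"
        System.out.println(castFailsForPlainObject());     // prints "true"
    }
}
```

A call such as `compareByBound(new Object(), new Object())` is rejected by the compiler, which is the effect of stating the constraint explicitly.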

Bruce Eckel: Do you mean that both the compiler and the runtime will check?

Anders Hejlsberg: The compiler checks it, but you could also be doing this at run time through reflection, in which case the system checks it again. As I mentioned earlier, anything you can do at compile time you can also do at run time through reflection.

Bruce Eckel: Can I create a function template, in other words, a function whose parameters are of unknown types? You have added strong type checking through constraints, but can I get a weakly typed template, as in C++? For example, can I write a function with two parameters a and b and write a + b in the code? Can I say that I do not care whether an operator+ exists for a and b, because they are weakly typed?

Anders Hejlsberg: What you are really asking is what you can express with constraints. Constraints, like any other feature, can become arbitrarily complicated if you take them to their ultimate conclusion. When you think about it, constraints are a pattern-matching mechanism. You may want to say, "This type parameter must have a constructor that takes two arguments, implement operator+, have this static method, have these two instance methods," and so on. The question is how complicated you want that pattern-matching mechanism to be.

There is a whole continuum from nothing to full pattern matching. Saying nothing is too little to be useful, and full pattern matching is too complicated, so we need to find a balance somewhere in between. We allow you to specify a constraint consisting of one class, one or more interfaces, and something called a constructor constraint. For example, you can say, "This type must implement IFoo and IBar" or "This type must inherit from base class X." Once you do that, we check at compile time and at run time that the constraint holds. Any methods implied by the constraint are directly available on values of the type-parameter type.
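A combined class-and-interface constraint of this kind has a close Java analogue in multiple bounds, so it can be sketched in Java as a stand-in. The types `X`, `IFoo`, and `IBar` take their names from the examples in the answer above; their members and the `Impl` class are hypothetical. The constructor constraint has no direct Java counterpart and is not shown.

```java
// Hypothetical interfaces and base class, named after the examples in the text.
interface IFoo { String foo(); }
interface IBar { String bar(); }

class X {
    String base() { return "X"; }
}

// A type that satisfies all three requirements.
class Impl extends X implements IFoo, IBar {
    public String foo() { return "foo"; }
    public String bar() { return "bar"; }
}

class MultiConstraintDemo {
    // T must inherit from X and implement both IFoo and IBar.
    // Every method implied by the bounds is available on T without a cast.
    static <T extends X & IFoo & IBar> String useAll(T value) {
        return value.base() + "-" + value.foo() + "-" + value.bar();
    }

    public static void main(String[] args) {
        System.out.println(useAll(new Impl())); // prints "X-foo-bar"
    }
}
```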

In C#, operators are static members, so an operator can never be a member of an interface, and therefore an interface constraint can never give you operator+. The only way to get operator+ is through a class constraint: you say that the type parameter must inherit from, say, a Number class, and Number has an operator+ for two Numbers. But you cannot say in the abstract, "must have an operator+."

Bill Venners: So you constrain by type, not by signature.

Anders hejlsberg: Yes.

Bill Venners: So the type must either extend a class or implement interfaces.

Anders Hejlsberg: Yes. And we could have gone further. In fact, we did think about going further, but it gets quite complicated, and the added complexity is not worth the benefit gained. If you want to do something that the constraint system does not directly support, you can use a factory pattern. Say you have a Matrix<T>, and in that Matrix you want to define one method for the dot product and another for the cross product. That means you ultimately need to know how to multiply two Ts, but you cannot express that as a constraint, at least not when T is int, double, or float. What you can do is have your Matrix take a Calculator<T> as an argument, where Calculator<T> has a method called Multiply. You implement that and pass it to the Matrix.
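The Calculator<T> idea can be sketched as follows, again in Java as a stand-in. Only the names Matrix, Calculator, and the multiply method come from the answer above; the `add` method, the `dotRows` method, and `IntCalculator` are assumptions added so that the example is complete.

```java
// Hypothetical interface modelled on the Calculator<T> mentioned in the text.
interface Calculator<T> {
    T multiply(T a, T b);  // named in the interview
    T add(T a, T b);       // assumption: a dot product also needs addition
}

class Matrix<T> {
    private final T[][] rows;
    private final Calculator<T> calc;

    // The matrix carries the calculator instead of constraining T itself.
    Matrix(T[][] rows, Calculator<T> calc) {
        this.rows = rows;
        this.calc = calc;
    }

    // Dot product of two rows; assumes non-empty rows of equal length.
    T dotRows(int i, int j) {
        T sum = calc.multiply(rows[i][0], rows[j][0]);
        for (int k = 1; k < rows[i].length; k++) {
            sum = calc.add(sum, calc.multiply(rows[i][k], rows[j][k]));
        }
        return sum;
    }
}

// One possible calculator, supplied by the caller.
class IntCalculator implements Calculator<Integer> {
    public Integer multiply(Integer a, Integer b) { return a * b; }
    public Integer add(Integer a, Integer b) { return a + b; }
}

class MatrixDemo {
    static int example() {
        Integer[][] data = { {1, 2}, {3, 4} };
        Matrix<Integer> m = new Matrix<>(data, new IntCalculator());
        return m.dotRows(0, 1);  // 1*3 + 2*4
    }

    public static void main(String[] args) {
        System.out.println(example()); // prints "11"
    }
}
```

The cost of this design is an extra object and an indirect call per arithmetic operation, which is the kind of price the next answer refers to.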

Bruce Eckel: Calculator is also a parameterized type.

Anders Hejlsberg: Yes. It is something like a factory pattern, and there are other ways of doing it too. It may not be your favorite approach, but everything has a price.

Bruce Eckel: Yes. I had come to think of C++ templates as a weak typing mechanism. When you add constraints, you move from weak typing to strong typing. But that certainly brings more complexity. That is the price.

Anders Hejlsberg: You can think of typing as a dial. The higher you set it, the harder the programmer's life becomes, but the more safety you get. And you can turn that dial too far in either direction.
