How to design a language (2) -- What is a pitfall (B)

Last Update:2018-12-08 Source: Internet

Author: User


I have never seen the fans of any other language show the ugly side of human nature as readily as this. Not in the C++ versus C flame wars that began more than a decade ago, not in the GC versus non-GC flame wars, not in the static typing versus dynamic typing flame wars, not even when Cloud Wind (Yun Feng) came out to attack C++ did anything this brainless happen. It only happens among the brainless fans of the Go language. What does that tell us? Anyone who wants to learn Go had better be careful. Learning how to use Go is fine. Mastering Go and then jumping to another language because you cannot stand it is fine. Even sticking with Go because you are a masochist who enjoys being tormented is fine. But turning yourself into a brainless fan, so that your mind changes irreversibly, is not.

The last example in the previous article was probably not explained clearly enough, so some people got the impression that adding a virtual destructor would fix it. After Base* base = new Derived; it is fine to delete base, because you can declare the destructor virtual. But after Base* base = new Derived[10]; calling delete[] base is a problem (the standard makes it undefined behavior). Derived and Base have different sizes, so when you compute &base[1] you actually get a location in the middle of the first Derived object, not the second Derived. Whatever you then do at that address (calling the destructor, for example), you cannot even get a correct this pointer, and no amount of virtual will save you. With VC++, however, a plain delete[] causes no problem in this case. My guess is that it records not only the length of the array but also the size of each element. Of course, once you write base[1].DoSomething() directly, an accident is guaranteed.
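The address arithmetic behind this can be sketched without actually triggering the undefined behavior. The Base and Derived types below are hypothetical stand-ins, not the classes from the previous article, and the helper names are made up for illustration. The sketch only compares the stride the compiler would use through a Base* with the real stride of a Derived array.

```cpp
#include <cstddef>

// Hypothetical types for illustration only.
struct Base {
    virtual ~Base() {}
    int a = 0;
};

struct Derived : Base {
    double extra[4] = {};   // makes Derived strictly larger than Base
};

// Byte offset the compiler uses for base[i] when base is a Base*.
constexpr std::size_t OffsetThroughBase(std::size_t i) {
    return i * sizeof(Base);
}

// Byte offset at which element i of a Derived[] really starts.
constexpr std::size_t OffsetOfDerived(std::size_t i) {
    return i * sizeof(Derived);
}

// True when indexing element 1 through a Base* would land inside
// element 0 of the Derived array instead of at the start of element 1.
constexpr bool LandsInsideFirstElement() {
    return OffsetThroughBase(1) > 0 && OffsetThroughBase(1) < OffsetOfDerived(1);
}
```

Because the two offsets disagree, any virtual call made through the miscomputed address reads its vtable pointer from the middle of an object, which is why a virtual destructor alone cannot help.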

So when the fan group discussed yesterday's example today, one of our gurus said:

When you use C++, it is best to keep away from part of the C subset.

I agree. C++ has all sorts of built-in library types: the thing that typeid returns (I forget its name), initializer_list, range and so on. So why not create a type for the result of new T[x]? But it is what it is. Whenever you can, use vector and shared_ptr, and stop thinking about doing your own new and delete.
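As a rough illustration of the vector and shared_ptr advice, here is a minimal sketch. The Shape, Triangle and Square classes and the two helper functions are hypothetical names invented for this example, not anything from the article.

```cpp
#include <memory>
#include <vector>

struct Shape {                       // hypothetical base class
    virtual ~Shape() {}
    virtual int Sides() const = 0;
};

struct Triangle : Shape {
    int Sides() const override { return 3; }
};

struct Square : Shape {
    int Sides() const override { return 4; }
    int padding[8] = {};             // differently sized on purpose
};

// Each element is a pointer, so every slot has the same size no matter
// which derived type it points to, and shared_ptr releases the objects.
std::vector<std::shared_ptr<Shape>> MakeShapes() {
    std::vector<std::shared_ptr<Shape>> shapes;
    shapes.push_back(std::make_shared<Triangle>());
    shapes.push_back(std::make_shared<Square>());
    return shapes;
}

int TotalSides(const std::vector<std::shared_ptr<Shape>>& shapes) {
    int total = 0;
    for (const auto& s : shapes) total += s->Sides();
    return total;
}
```

No new[] or delete[] appears anywhere, so the element-size mismatch described above has no place to occur.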

Today we will talk about a slightly more "advanced" pitfall. It is a real example I ran into after I started working. Of course, a language's pitfalls are just sitting there, and people jump into them because they do not yet know enough. But there are three kinds of pitfall. The first kind is obvious and can be avoided by following some principles that look silly but work, such as writing if (1 == a) .... The second kind you fall into because you lack some advanced knowledge, such as how the lifetimes of lambdas and the variables they capture interact. The third kind comes purely from a lack of foresight, as in the following example.

One bright spring morning I received a new task: to write an image processing pipeline together with someone from outside our group. The nodes of such a pipeline are the usual things: histogram, convolution, grayscale, edge detection and so on. At the meeting on the first day I was handed a spec containing the C++ interfaces they had designed but not yet implemented (yes, the school of thought that uses an interface even when there is only one implementation). I was to go back, read it, and implement it with them over the next few days. Naturally, these interfaces included a matrix:

template<typename T>
class IMatrix
{
public:
    virtual ~IMatrix(){}
    virtual T* GetData()=0;
    virtual int GetRows()=0;
    virtual int GetColumns()=0;
    virtual int GetStride()=0;
    virtual T Get(int r, int c)=0;
    virtual void Set(int r, int c, T t)=0;
};

To be honest, there is nothing seriously wrong with writing IMatrix this way. So we worked happily for a few days, finished the purely mathematical algorithms, and then started on convolution. The set of numbers a convolution needs is not really a matrix, but creating a dedicated class for it would be pointless, so we used a matrix with as many rows as columns as the filter. The interface was initially defined as follows. Because IBitmap may store its data in different ways, only the IBitmap implementation itself knows how to perform the convolution:

template<typename TChannel>
class IBitmap
{
    ......
    virtual void Apply(IMatrix<float>& filter)=0;
    ......
};

So we happily spent a few more days, until one day someone jumped out and said: "Apply cannot modify the filter anyway, so why not make it const?" And he showed us his modified interface:

template<typename TChannel>
class IBitmap
{
    ......
    virtual void Apply(IMatrix<const float>& filter)=0;
    ......
};

I vaguely remember that my expression was like this.

The type system of a language is a particularly complicated thing, especially in a language like C++, where const T<a, b, c> and T<const a, const b, const c> are two different types. Every place where a language departs from the elegant theory is a pitfall; the only difference is how serious it is. The case above is not a big problem, though, because anyone who really codes against this interface will give up upon discovering that an implementation of IMatrix<const float> cannot be created.

The reason is simple. An implementation of IMatrix<T> generally contains an array represented by a T*. Once T is replaced with const float, you find that your Set function can no longer write a const float through a const float*, and that is the end of it. The correct way is of course:

virtual void Apply(const IMatrix<float>& filter)=0;
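To make the distinction concrete, here is a minimal sketch. It assumes a simplified variant of the interface in which the read accessors are const-qualified (the IMatrix in the spec above does not declare them that way), and Matrix and SumFilter are hypothetical names used only for illustration.

```cpp
#include <type_traits>
#include <vector>

// Simplified, hypothetical variant of the interface: read accessors are
// const-qualified so they can be called through a const reference.
template<typename T>
class IMatrix {
public:
    virtual ~IMatrix() {}
    virtual int GetRows() const = 0;
    virtual int GetColumns() const = 0;
    virtual T Get(int r, int c) const = 0;
    virtual void Set(int r, int c, T t) = 0;
};

template<typename T>
class Matrix : public IMatrix<T> {
public:
    Matrix(int rows, int columns)
        : rows_(rows), columns_(columns), data_(rows * columns) {}
    int GetRows() const override { return rows_; }
    int GetColumns() const override { return columns_; }
    T Get(int r, int c) const override { return data_[r * columns_ + c]; }
    void Set(int r, int c, T t) override { data_[r * columns_ + c] = t; }
private:
    int rows_;
    int columns_;
    std::vector<T> data_;   // storage must be writable, which rules out T = const float
};

// The two spellings name unrelated types.
static_assert(!std::is_same<IMatrix<const float>, const IMatrix<float>>::value,
              "const on the template argument is not const on the object");

// The filter is read-only here: filter.Set(...) would not compile.
float SumFilter(const IMatrix<float>& filter) {
    float sum = 0;
    for (int r = 0; r < filter.GetRows(); r++)
        for (int c = 0; c < filter.GetColumns(); c++)
            sum += filter.Get(r, c);
    return sum;
}
```

Note that the const reference is only usable because the accessors were marked const in this sketch, which is exactly the kind of detail discussed later in the article.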

But before going further into this issue, let's look at a more easily understood "pitfall", which concerns C#'s value types. Suppose that one day we need to write an ultra-high-performance particle motion simulation covering the four branches of mechanics (ahem); in any case, we start with a Point type. At first it was written like this (C# 5.0):

struct Point
{
    public int x;
    public int y;
}

var ps = new Point[] { new Point { x = 1, y = 2 } };
ps[0].x = 3;

At first it worked well. Nothing went wrong, and the Point in ps[0] was modified as expected. But one day things changed: particles began to create and destroy other particles, so I changed the array to a List:

var ps = new List<Point> { new Point { x = 1, y = 2 } };
ps[0].x = 3;

The compiler then told me that the last line had an error:

Cannot modify the return value of 'System.Collections.Generic.List<ArrayTest2.Program.Point>.this[int]' because it is not a variable

C# is awesome. After using it for so long, this is the only "inconspicuous problem" I have found, and it is still a compile error, so there is simply no way to use C# wrong. Come to think of it, so many people used VB in the past and found no pitfalls in it apart from On Error Resume Next. Clearly Microsoft is far better at designing languages than a certain dog company.

I was quite confused at the time, so I wrote another class on the spot to check the problem:

class PointBox
{
    public int Number { get; set; }
    public Point Point { get; set; }
}

var box = new PointBox() { Number = 1, Point = new Point { x = 1, y = 2 } };
box.Number += 3;
box.Point.x = 5;

The second-to-last line compiled, but the last line was still a compile error. Both are properties, so why can the int be changed with += 3 while the Point cannot have a field changed, forcing you to create a new Point and copy it in? In the end I could only conclude that arrays can do this and List cannot, and that a property supports += but not changing a field (if you define an operator+ for Point, then += on box.Point is fine too). I could only assume the language was deliberately designed this way.

This reminds me of something I once read on MSDN: if a structure is larger than 16 bytes, it is recommended that you do not make it a struct. A small sample written by Lao Zhao also showed that in most cases a struct is actually no faster than a class. I will not go into the reasons here; let's talk about the syntax.

In C#, the difference between struct and class is the difference between value and reference. C# deliberately distinguishes value types from reference types. A value type cannot be converted to a reference (unless it is boxed into an object, a nullable, a lazy and so on), and a reference type cannot be converted to a value type. Value types cannot be inherited from; reference types can. As we all know, when one class inherits from another, the purpose ultimately is to override a few virtual functions. If you want to inherit without overriding any virtual function, there is most likely something wrong with your design. Once you inherit, a reference to the subclass converts implicitly to a reference to the parent class, which satisfies the Liskov substitution principle.

But a C# struct is a value type, that is, it is not a reference (pointer), so there is no such thing as obtaining a reference to the parent class. Since the type you see is always the real type (unlike a class, where the IEnumerable<T> you get may actually be a List<T>), there is no need for virtual functions. And if you cannot write virtual functions in a struct, what would you want inheritance for? So a struct cannot be inherited from.

Now let's look at C# properties. The operator[] of C# is not actually an operator; unlike in C++, it is treated as a property (an indexer). A property is just syntactic sugar, and its getter and setter are two functions. So if the type of a property is a struct, the return value of the getter is also a struct. What does it mean for a function to return a struct? The result is copied and then returned. So when we write box.Point.x = 5, it is equivalent to box.get_Point().x = 5. The Point you get is a copy, and modifying x in a copied struct naturally does not affect the Point stored in the box. The statement has no effect, so C# simply makes it a compile error. You may ask: List and Array both use operator[], which is also a property, so why does it work for Array? The answer is simple: Array gets special treatment ......
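The copy semantics of such a getter can be sketched in C++ as an analogy. This is not how C# is implemented; the PointBox class below is a hypothetical C++ counterpart whose getter returns the Point by value, and ModifyCopy and ModifyAndStore are made-up helper names.

```cpp
struct Point {
    int x;
    int y;
};

// Hypothetical C++ counterpart of the C# PointBox: the getter returns the
// Point by value, which is what a C# property of struct type does.
class PointBox {
public:
    Point GetPoint() const { return point_; }   // returns a copy
    void SetPoint(Point p) { point_ = p; }
private:
    Point point_{1, 2};
};

// Changes only a local copy; the Point stored in the box is untouched.
void ModifyCopy(const PointBox& box) {
    Point copy = box.GetPoint();
    copy.x = 5;
}

// The pattern the compile error pushes you towards:
// copy out, change the copy, store it back.
void ModifyAndStore(PointBox& box) {
    Point copy = box.GetPoint();
    copy.x = 5;
    box.SetPoint(copy);
}
```

C++ compilers likewise reject box.GetPoint().x = 5; because the member of a returned temporary is not an assignable lvalue, so the same mistake is caught at compile time there too.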

Then why do so few people run into this problem? Presumably because anything that can be written as a struct is, taken as a whole, a single state. Take the Point above: x and y are separate, but neither represents a state on its own; the state is the Point as a whole. The same goes for Tuple (which is a class, but is very much like a struct) and for many other structs defined in the .NET Framework. So even though we often construct something like List<Point>, we seldom need to modify just part of one element.

Then why doesn't struct simply make every field unmodifiable? Because doing so would bring no benefit at all: if you make the mistake, you get a compile error anyway. Some may also ask why, inside a struct's methods, operations on this do take effect. That is an excellent question: it is because this is essentially a "pointer".

This is different from what the previous article discussed. The two "pitfalls" in this article are not really pitfalls, because they end in compile errors that force you to fix the code. So how nice it would be if C++'s new T[x] returned a genuine array. Arrays never convert into one another. Whether it is Delphi's array of T, C#'s T[], or C++'s array<T> or vector<T>, you can never convert an array of T into an array of U, so the problem does not arise. (Strictly speaking, C# does allow covariant conversions between arrays of reference types, but the elements there are references of equal size and writes are checked at run time, so the indexing problem still cannot occur.) So when you use C++, do not hand-roll anything the STL already provides. It is bad for your health ......
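The difference between a raw pointer and a real container type can be checked with standard type traits. The Base and Derived types and the two constant names below are hypothetical, chosen only for this sketch.

```cpp
#include <type_traits>
#include <vector>

// Hypothetical types for illustration only.
struct Base { virtual ~Base() {} int a = 0; };
struct Derived : Base { int b = 0; };

// A raw pointer to Derived converts implicitly to a pointer to Base,
// and new Derived[n] hands back exactly such a raw pointer.
constexpr bool kPointerConverts =
    std::is_convertible<Derived*, Base*>::value;

// A real container type does not convert, so the mistake cannot be written.
constexpr bool kVectorConverts =
    std::is_convertible<std::vector<Derived>, std::vector<Base>>::value;
```

With these definitions kPointerConverts is true and kVectorConverts is false, which is the whole point: the type of new T[x] forgets that it is an array, while vector<T> does not.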

Now back to the const question raised at the beginning. We generally use const in C++ for two purposes. The first is to use const references to stop C++ from copying too much. The second is to use pointers to const to indicate that certain values are not meant for you to touch. But we do not know what the functions of a class will do, so C++ lets member functions be marked const as well. For an object of type const T, you can then call only those functions of T that are marked const. And inside a member function marked const, the this pointer has the type const T* const rather than the usual T* const.
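A small sketch of the const member function rule follows. Counter and Read are hypothetical names. One detail worth noting: the standard describes this as a non-assignable prvalue of type T* or const T*, so decltype(this) shows no trailing const; the trailing const in the text expresses that this cannot be reassigned.

```cpp
#include <type_traits>

class Counter {                       // hypothetical class for illustration
public:
    // Non-const member: this points to a modifiable Counter.
    void Increment() {
        static_assert(std::is_same<decltype(this), Counter*>::value, "");
        value_++;
    }
    // Const member: this points to a const Counter, so value_ is read-only here.
    int Get() const {
        static_assert(std::is_same<decltype(this), const Counter*>::value, "");
        return value_;
    }
private:
    int value_ = 0;
};

// Through a const reference only the const-qualified members are callable.
int Read(const Counter& c) {
    // c.Increment();   // would not compile: Increment is not marked const
    return c.Get();
}
```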

So how are similar problems solved in C#? The first problem does not exist, because C# copies everything bit by bit, and that is the same no matter how you write your struct. As for the second, C# has no const types, so if you want to express that a class should not be modified by others, you have to extract the "const" part into a parent class or parent interface. That is why C# now has IReadOnlyList<T> in addition to IList<T>. Personally I think IReadOnlyList is a poor name, because the object may well be a List underneath. While you are using it, someone else modifies that List, what you read through IReadOnlyList changes, and confusion follows. I would rather call it IReadableList. It is readable; the write interface is merely hidden so that you cannot reach it.
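The "readable, not read-only" point can be sketched in C++ by mimicking the C# interface split. The IReadableList, IList and List classes below are hypothetical C++ renderings written for this example, not the .NET types themselves.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical C++ rendering of the C# split into a readable interface
// and a read/write interface that extends it.
template<typename T>
class IReadableList {
public:
    virtual ~IReadableList() {}
    virtual std::size_t Count() const = 0;
    virtual T Get(std::size_t index) const = 0;
};

template<typename T>
class IList : public IReadableList<T> {
public:
    virtual void Add(const T& value) = 0;
};

template<typename T>
class List : public IList<T> {
public:
    std::size_t Count() const override { return items_.size(); }
    T Get(std::size_t index) const override { return items_[index]; }
    void Add(const T& value) override { items_.push_back(value); }
private:
    std::vector<T> items_;
};
```

If a List<int> is handed out as an IReadableList<int>&, the holder of that reference cannot call Add, yet a later Add through the original List changes what Count and Get report through the view. The view hides writing; it does not promise that the contents stay fixed.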

So what exactly does const modify? If it modifies the type, then changing the parameter types of a function to const as below seems meaningless:

int Add(const int a, const int b);

Or, change the return value to const:

const int Add(const int a, const int b);

then how does it differ from

int Add(int a, int b);

Perhaps inside the function you can no longer use parameters a and b as variables. But from outside, calling any of these three functions is exactly the same. And judging by how we habitually use it, what const modifies should be a variable, not a type. We do not want the IBitmap::Apply function to modify the filter, so the function signature is changed to:

virtual void Apply(const IMatrix<float>& filter)=0;
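The claim that callers cannot tell the three Add variants apart can be checked at compile time. The three function names below are hypothetical; they exist only so that all variants can live in one translation unit.

```cpp
#include <type_traits>

int AddPlain(int a, int b) { return a + b; }

// Inside this body a and b cannot be reassigned; callers cannot tell.
int AddConstParams(const int a, const int b) { return a + b; }

// const on a by-value int return type has no effect on the call result.
const int AddConstReturn(const int a, const int b) { return a + b; }

// Top-level const on by-value parameters is dropped from the function type.
static_assert(std::is_same<decltype(AddPlain), decltype(AddConstParams)>::value,
              "callers see the same signature");

// The value produced by the call is a plain int in both cases.
static_assert(std::is_same<decltype(AddConstReturn(1, 2)),
                           decltype(AddPlain(1, 2))>::value,
              "the returned value is a plain int");
```

Some compilers warn that the qualifier on the return type of AddConstReturn is ignored, which says much the same thing as the text.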

We do not want to use macros to define constants, so we will write this in the header file:

const int ADD = 1;
const int SUB = 2;
const int MUL = 3;
const int DIV = 4;
const int PUSH = 5;
const int POP = 6;

Or simply use enum:

enum class Instructions
{
    ADD = 1,
    SUB,
    MUL,
    DIV,
    PUSH,
    POP
};

In C++, const also affects linkage. A static const member variable of integral type, or a const global variable, can be given in the header file alone as a symbol, without defining its entity in a cpp file. A non-static const member variable, on the other hand, takes up space in the class. (A const member in C# cannot be combined with static; it is only a symbol, which is completely different from C++.)
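A minimal sketch of the three cases follows, with illustrative names (Opcodes, MAX_STACK, g_buffer, Instruction) that are not from the article. It assumes the constants are only used as values; before C++17, taking the address of such a static const member or binding a reference to it would still require a definition in a cpp file.

```cpp
// What could sit in a header (names are illustrative).
struct Opcodes {
    static const int ADD = 1;   // declaration with an in-class initializer
    static const int SUB = 2;
};

const int MAX_STACK = 16;       // namespace-scope const has internal linkage

// Both kinds are usable as compile-time constants without a separate
// definition, as long as they are only used as values.
int g_buffer[MAX_STACK];
static_assert(Opcodes::SUB - Opcodes::ADD == 1, "usable in constant expressions");

struct Instruction {
    const int opcode;           // non-static const member: stored in every object
    int operand;
};
static_assert(sizeof(Instruction) >= 2 * sizeof(int),
              "the const member occupies space");
```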

Going by how most people use const, we use const& and const* to qualify a variable or a parameter. For example, a temporary string:

const wchar_t* name = L"@GeniusVczh";

Or an array used to calculate hexadecimal encoding:

const wchar_t code[] = L"0123456789ABCDEF";

At the end of the day, in our minds const exists to qualify variables or parameters. To put it bluntly, it controls whether a value in memory may be changed. (In this it is like volatile. The volatile of C# also carries fence semantics, which is much better than the C++ one, which only controls whether the data may be cached in a register.) So using const to qualify types is yet another counter-intuitive design in C++. Of course, if you read "The Design and Evolution of C++" you can find some reasons why const is used to qualify types. But in my experience, const has at least brought us some inconvenience.

The first is that it makes writing a correct C++ class harder. As noted in the C# discussion, a read-only list is a different concept from a read/write list. In C++, a read-only list is a read/write list that has entered a special state in which you can see the write functions but are not allowed to use them. A piece of software is typically built by thousands of people. Today I write a class. Tomorrow you write a template function with a const T& parameter. The day after, someone else discovers that the two fit together perfectly, only to find on compiling that none of the class's member functions is marked const, so it cannot be done. What now? Rewrite the class? Then we have to maintain an extra copy of the code ourselves and may repeat the original author's mistake. Modify the original code? Who knows whether adding const to a function will cause problems elsewhere in this huge piece of software. Perhaps, as with a string class, some functions that are semantically const actually need to modify member variables, and you have to add the mutable keyword to those. And once you have modified the code, who maintains it becomes a political question unrelated to technology. Even when you have worked out which functions need const, putting const in the wrong position when declaring a const variable causes baffling problems of its own.
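The mutable situation described above might look like the following sketch. The Text class and the Measure template are hypothetical; they stand in for "a string class" and "a template function with a const T& parameter" and are not taken from any real library.

```cpp
#include <cstddef>
#include <string>

// Hypothetical string-like class whose Length() is const in meaning but
// caches its result, so the cache fields must be declared mutable.
class Text {
public:
    explicit Text(const std::string& s) : data_(s) {}
    void Append(const std::string& s) {
        data_ += s;
        cached_ = false;              // invalidate the cached length
    }
    std::size_t Length() const {
        if (!cached_) {
            length_ = data_.size();   // allowed here only because of mutable
            cached_ = true;
        }
        return length_;
    }
private:
    std::string data_;
    mutable std::size_t length_ = 0;
    mutable bool cached_ = false;
};

// A template written by someone else that demands const access.
template<typename T>
std::size_t Measure(const T& value) {
    return value.Length();
}
```

Had Length been declared without const, Measure(text) would fail to compile, which is the collision between independently written code that the paragraph describes.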

If we had used the C# approach from the start and split the class into two interfaces, it would feel somewhat out of place in C++. Why? Why does the STL prefer generics plus value types over generics plus reference types? Why does C# prefer generics plus reference types over generics plus value types? Neither design is better than the other. As for why C++ and C# have different preferences, I think the reason is GC. In a language with GC, you need not worry about when to delete what you new; memory is recycled and never runs out. Not so in C++: once memory leaks, it is leaked for good, and what has been used up is gone. So we feel very differently when typing the new keyword in C++ and in C#. That is why people avoid pointers in C++ but new things happily in C#. Since C++ dislikes pointers, and something like IReadOnlyList<T> is meaningless for a value type with no pointer involved, C++ simply added const to "forbid you from accessing part of a class". So every time you write a class, you have to think about the problems described in the previous paragraph. But not every C++ programmer knows all these details, so sooner or later someone does something stupid. That is not the fault of C++, of course. The fault lies with interviews that are too easy to pass, which let some unqualified programmers sneak in. C++ is not for everyone.

The second problem: we like const T& on parameters to avoid needless copying, but should we do the same for a function's return value? On return values const is a double-edged sword. I once wrote a linq for C++ myself, with knock-off IEnumerable and IEnumerator classes, and had the Current function return a const T&. The containers' own IEnumerator implementations were fine, because what they return lives in the container and has an address. But things got awkward when I started writing Select and Where. To return a const T& correctly, I have to return something that has a memory address. In the end I chose to cache the result in a member variable of SelectEnumerator during MoveNext. This has an upside: it forced me to put all the computation in MoveNext instead of lazily writing it in Current. But overall, had I not kept calm while writing the code, I could have fallen into the pit at any moment.
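The caching approach can be sketched as follows. This is a simplified, hypothetical reconstruction and not the author's actual library: the class shapes, the VectorEnumerator helper and the Twice function are all assumptions made for the example.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical, simplified enumerator interface.
template<typename T>
class IEnumerator {
public:
    virtual ~IEnumerator() {}
    virtual bool MoveNext() = 0;
    virtual const T& Current() const = 0;
};

template<typename T>
class VectorEnumerator : public IEnumerator<T> {
public:
    explicit VectorEnumerator(const std::vector<T>& v)
        : v_(v), index_(0), started_(false) {}
    bool MoveNext() override {
        if (started_) index_++; else started_ = true;
        return index_ < v_.size();
    }
    // The element lives in the container, so returning a reference is safe.
    const T& Current() const override { return v_[index_]; }
private:
    const std::vector<T>& v_;
    std::size_t index_;
    bool started_;
};

template<typename T, typename U, typename F>
class SelectEnumerator : public IEnumerator<U> {
public:
    SelectEnumerator(IEnumerator<T>& source, F f)
        : source_(source), f_(f), current_() {}
    bool MoveNext() override {
        if (!source_.MoveNext()) return false;
        current_ = f_(source_.Current());   // compute once, give the result an address
        return true;
    }
    // Returns a reference to the cached member, never to a temporary.
    const U& Current() const override { return current_; }
private:
    IEnumerator<T>& source_;
    F f_;
    U current_;
};

// Sample projection used to exercise the sketch.
inline int Twice(const int& x) { return x * 2; }
```

Writing Current as return f_(source_.Current()); would hand out a reference to a temporary, which is the pit the paragraph refers to. A SelectEnumerator<int, int, int (*)(const int&)> built over a VectorEnumerator<int> with Twice yields each element doubled.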

In short, introducing const makes it harder to write a correct C++ program. const is not useless, though. If you really cannot figure out when to use const and when not to, you can simply not use it in your own programs. I am not saying that C, bare as it is, is therefore better than C++. A language cannot be made better by deleting things from it. The abstraction ability of C is so low that I cannot concentrate on the logic at all. I keep having to think about what contorted trick would express these concepts in C in a halfway presentable manner (I know you all chose macros in the end! Right? Right?). That makes me irritable, the bugs multiply, I lose the will to finish the program properly, and it ends up a pile of crap.

Well, if you say I am not as good as Linus, I have nothing to say to that. But then C is probably the kind of language that only Linus can use. With C++, at least, if you keep a good attitude and make liberal use of the STL, the probability of falling into a pit is much lower than with plain C.

The pitfalls of languages are too numerous to record. I thought two articles would be enough, and they turned out to be far from it. Since this article is already so long, I will stop here for today. In the next article, the big pitfalls of function pointers and lambda that everyone loves to hear about are waiting for you ......

To be continued

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
