Introduction to int in Unix and Linux

Source: Internet
Author: User

This post also grew out of a thread by the student "Peaceful". He was puzzled that so many API functions in Linux use int as their general-purpose type and asked me about it, so I will try to answer. Original post: http://student.csdn.net/space.php?uid=121080&do=thread&id=9168

The question was as follows:

According to C++ Primer, size_t is recommended for a variable such as a "capacity" that can never be negative. Yet Linux prefers int. Why use int rather than size_t (or unsigned int)? Doesn't that cut the usable range in half?

My answer:

This is a fairly open-ended question. I have done Windows development, Linux development, and development targeting both platforms, but I claim no authority; I just want to share my own impressions. They are not necessarily accurate, and anyone with a better view is welcome to add to them.

I think this is first of all a cultural issue. Culture here means the shared consensus among the people doing the work: everyone is simply used to doing things a certain way. I noticed the difference very early, when I switched to Linux development.

Windows is developed by Microsoft, a large company that emphasizes a rigorous development style. You can see this in its devotion to Hungarian notation, which strictly prescribes how variables and types are named so that names are as precise and unambiguous as possible. For example, many Struct * pointer types are given an explicit new name such as PStruct with typedef, so that you can tell what a declaration means at a glance instead of counting asterisks. Asterisks are easy to miscount; I have made that mistake myself.

The reasoning is simple. Microsoft sells an operating system. Beyond the features of Windows itself, it has to provide a large number of APIs for programmers, because if nobody develops applications for the operating system, the operating system cannot be sold. So Microsoft has to care not only about the experience of end users but also about the experience of programmers.
An explicit API is obviously a good user experience: programmers are less likely to make mistakes, which is exactly what an API provider wants. Fewer bugs mean a higher success rate and lower development cost, and a virtuous circle forms. It also spares Microsoft's customer service department a great many complaints. If Microsoft's APIs were vague, imagine the support burden.

This reveals an important idea. Microsoft treats the majority of application programmers as end users who may not understand anything, and tries to build as much developer friendliness as possible into the API. That is why it is so strict about naming and why API names are so explicit. It also takes the trouble to define a great many variable types, sometimes the same underlying type several times over, so that programmers from different industries can find names that fit the habits of their field.

This makes sense. For example, in the power-system software I currently work on, we like to define and use types such as Int16, Int32, Int64, Float16, Float32 and Float64, while basic C/C++ types such as int and double are not very popular. Why? On an industrial site, people can easily go wrong if they do not know how many bits a type has. The explicit names serve clarity and industry habit. So before I start writing a program, I define such a batch of types to make it easier to exchange code with my colleagues.

Unix is different. I have read The Art of Unix Programming, which describes much of the culture of Unix programmers; I will not repeat it here. Unix developers assume by default that their users are programmers at the same level as themselves. Everyone communicates in the specialized language of computer science, and much is left implicit and conveyed by convention. As a result, Unix does not have nearly as many finely categorized variable types, and everyone has simply grown used to that. Almost all the Unix functions I have seen belong to the basic C library used with gcc.
Those functions take int and return int, again and again. In fact, my impression is that in Unix everything is an int. Why? Unix manages resources of the same kind as arrays. An open file handle is an integer; a process ID is an integer; a thread ID is an integer; a user ID is an integer; even devices are identified by integers. A socket is the dedicated data type SOCKET in Windows, while in Unix it is, you guessed it, an int.

What is going on here? It is really array-style management. One can imagine that, for lookups inside the system, Unix developers use arrays heavily, and an int used as an array index or hash key gives the most efficient retrieval. Whatever the resource, in Unix it is represented by an int ID, which is essentially the same idea as a Windows handle.

The two approaches each have advantages and disadvantages. Windows has many variable types, so the learning cost for programmers is high, but once learned they are hard to misuse. Unix has a gentler learning curve because there are not so many type names to memorize, but nothing guarantees that you use them correctly. You are on your own! Unix developers trust everyone's ability. I just cannot always trust myself.

Still, it is hard to say who is right and who is wrong. Both approaches make sense. The key is the level of the target users that the OS designers had in mind.

Back to Peaceful's question. It is true that using int costs half of the range: int is signed, and half of its values are negative. When it represents a capacity, such as the size of an array allocated with malloc or the range of values used for sockets, the negative half is never needed, so it looks wasted. But that is not really the case, and the reason is simple. Even with only half of the range of an int, how much of it can you actually use?
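To put numbers on "half of the range", here is a minimal C sketch. The helper names `fits_in_int`, `to_int_count` and `print_int_ranges` are hypothetical and chosen only for illustration. On the common platforms where int is 32 bits wide, INT_MAX is 2147483647 (about 2 GiB when counting bytes) and UINT_MAX is 4294967295; other widths are possible, so the code reads the limits from `<limits.h>` rather than hard-coding them.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stdio.h>

/* True when `count` can be reported through a signed int return value. */
bool fits_in_int(unsigned long long count)
{
    return count <= (unsigned long long)INT_MAX;
}

/* Narrow a count to int for an int-returning API.
   The caller is expected to have checked the range first. */
int to_int_count(unsigned long long count)
{
    assert(fits_in_int(count));
    return (int)count;
}

/* Print the limits of int and unsigned int on the current platform. */
void print_int_ranges(void)
{
    printf("INT_MAX  = %d\n", INT_MAX);
    printf("UINT_MAX = %u\n", UINT_MAX);
}
```

A value such as 65535 or a 4096-byte buffer size passes `fits_in_int` easily; only counts above INT_MAX would need a wider type.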
Asking this question shows a student's way of thinking: always wanting to do better, to bring every last bit of the available range under management. After doing this work for a long time, you develop a sense of data boundaries and ranges. Good enough is good enough, and there is no need to go further.

Half of the range of an int is 2 GB, right? How many users' arrays will exceed even 1 GB? Most of the time, the int in such a place represents a very small range, and half is plenty. Take sockets: in theory only 65536 values are possible, so half of an int is far more than enough. There is no need to worry about it. If you really need a larger representation, you can use unsigned long yourself.

Moreover, there is a major advantage of the Unix design that I have not mentioned yet: the data describes itself. We all know that when an int expresses a range or a capacity, only its positive part is used. What is the negative part for? This is the key to your question: it indicates that the value is invalid.

Think about it. If the data type we use cannot represent an invalid value, we are in trouble. The API design then needs a separate parameter to report invalid results, which means passing by reference (&) or by pointer (*). The complexity of the design goes up. It is tiring to design and tiring to learn.

This may not be intuitive, so here is an example. Suppose I want to design a read function that reads a piece of data from a file descriptor. In Unix, a file descriptor can represent any stream-like device, including a socket: you can use Berkeley socket APIs such as recv, or you can call read from the basic C library directly. Because everything is represented by an int, Unix can handle all such devices in a unified way, with one family of functions. I designed two versions of the function prototype. Compare them:

  // Two alternative designs; only one of them would be declared in a real header.
  int ReadFrom(int fd, char *szBuffer, int nBufferSizeMax);
  unsigned int ReadFrom(int fd, char *szBuffer, int nBufferSizeMax);
Although the two ReadFrom prototypes use Hungarian notation, they are ordinary C functions. The first is the Unix style and returns int; the second is the Windows style and returns an unsigned integer.

Now consider one situation. If the read fails, an error has to be reported to the caller in the upper layer. What should we do?

With the first prototype it is very simple. Only the non-negative part of the int is used for the number of bytes successfully read, so -1 can indicate failure.

The second is troublesome. Because it can only return a non-negative integer, every possible return value looks like a valid byte count to the caller, and there is no way to return an error marker. To put it bluntly, this API design fails: it cannot express everything an application needs to get back. What can be done? There are two methods:
  // Method 1: report success or failure through an extra reference parameter (C++).
  unsigned int ReadFrom(int fd, char *szBuffer, int nBufferSizeMax, bool &bSuccessFlag);

  // Method 2: reserve one return value as the error marker.
  #define READ_FAIL 0xFFFFFFFF
  // If an error occurs, READ_FAIL is returned.
  unsigned int ReadFrom(int fd, char *szBuffer, int nBufferSizeMax);
Do not use 0 as the error value. In many cases reading 0 bytes is not an error but a normal state: read on a file returns 0 at end of file, and a zero-length datagram received on a socket also reads as 0 bytes. Neither means that the call itself failed.

Let's look at the two methods. The first has no choice but to add an extra pass-by-reference parameter so that the function can report whether the read succeeded, leaving the upper layer to decide how to handle it. The second explicitly defines one return value as the error, a value that must never be interpreted as a valid result; it is withdrawn from normal use and exists only to signal failure. Look at the Win32 API: many of its functions use the second method.

Either way, you will find it troublesome. The first method needs at least one more parameter, so the programmer has more to learn. Learning one function is easy, but across 100 or 1000 functions that bSuccessFlag becomes tiresome. The second method is cruder. First, it restricts what the function can express: at the very least, the maximum value of unsigned int is reserved for failure and can no longer be returned as a real result. Second, once the macro is added, the unsigned int return value must be special-cased by every caller. That is tedious to use.

Now look at the Win32 socket functions. The SOCKET type uses the maximum value an unsigned integer can hold as the marker for an invalid socket (INVALID_SOCKET). In other words, the API is designed according to the second method above. There is no way around it: SOCKET was defined as an unsigned integer, so the designers brought this on themselves. The same goes for functions such as _beginthreadex and CreateWindowEx: look at how HANDLE is defined and you will see why their parameter design is so complicated. Very often, when an API is awkward to use, the reason can be found in details like these.
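The trade-offs among the three designs can be sketched in C. This is only an illustration under stated assumptions, not the author's library code: the names `ReadFromSigned`, `ReadFromWithFlag` and `ReadFromSentinel` are hypothetical, the success flag is passed as a `bool *` pointer instead of a C++ reference so that the sketch compiles as C, the functions simply wrap the POSIX `read` call, and `READ_FAIL` assumes a 32-bit unsigned int.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <unistd.h>

/* Sentinel for the unsigned design; assumes unsigned int is 32 bits wide. */
#define READ_FAIL 0xFFFFFFFFu

/* Unix style: a return value >= 0 is the byte count, -1 means failure. */
int ReadFromSigned(int fd, char *szBuffer, int nBufferSizeMax)
{
    if (szBuffer == NULL || nBufferSizeMax < 0)
        return -1;
    ssize_t n = read(fd, szBuffer, (size_t)nBufferSizeMax);
    return (n < 0) ? -1 : (int)n;
}

/* Method 1: unsigned byte count plus an extra out-parameter for the status. */
unsigned int ReadFromWithFlag(int fd, char *szBuffer, int nBufferSizeMax,
                              bool *pbSuccessFlag)
{
    assert(pbSuccessFlag != NULL);
    int n = ReadFromSigned(fd, szBuffer, nBufferSizeMax);
    *pbSuccessFlag = (n >= 0);
    return (n >= 0) ? (unsigned int)n : 0u;
}

/* Method 2: unsigned byte count with one value reserved as the error marker. */
unsigned int ReadFromSentinel(int fd, char *szBuffer, int nBufferSizeMax)
{
    int n = ReadFromSigned(fd, szBuffer, nBufferSizeMax);
    return (n >= 0) ? (unsigned int)n : READ_FAIL;
}
```

At the call site, the signed version needs a single comparison against -1, the flag version needs an extra variable that must be checked before the count is trusted, and the sentinel version needs a comparison against READ_FAIL on every call.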
Precisely because so many API functions in Unix and Linux return int, -1 has become the recognized sign of failure, and it is easy to use. Generally, when I see a Unix API function, I can guess how its return value works: it is an int, -1 means failure, and 0 or a positive value means success. Windows is more trouble. It can take half a day of reading MSDN, because everything depends on the particular data type and the corresponding interface definition.

So although Windows has put in a lot of effort and done a lot of friendly work for programmers, in my view it is sometimes really less convenient than the Unix approach of leaving everything to the programmer. On the other hand, if the application programmer is not a computer professional, for example an engineer from another field who needs to write a short program to solve a problem, Windows helps prevent errors. Why? If a data type is used incorrectly, compilation fails, and the programmer goes and checks MSDN.

So, in the end, I still think this is a cultural issue. The development cultures of the two platforms have led to today's situation, and it cannot be said that one is good and the other bad.

Personally, I like Windows-style Hungarian notation and rigorous naming. However, I do not like having so many data types; I still prefer int. The APIs I design often look a little odd for exactly this reason: the naming follows Windows, while the design of the function parameters leans toward Unix, taking something from both sides. In 0bug-C/C++ Road to Commercial Engineering, my engineering library follows these habits and uses both styles, because I believe that it does not matter whether a cat is white or black, a cat that catches mice is a good cat. I use whichever method fits the situation, without partisan bias.

None of this is absolute. It is just my personal habit. You should choose the approach that suits you; there is no need to do it my way.

This article comes from the "Xiao blog" blog. Please be sure to keep this source: http://tonyxiaohome.blog.51cto.com/925273/309032
