"C Traps and Pitfalls" study notes

Last Update:2016-04-29 Source: Internet

Author: User

Tags case statement microsoft c


Introduction to this book

This book grew out of a paper the author published at Bell Labs in 1985, expanded with his own working experience into a classic of lasting value to C programmers. Its purpose is not to criticize the C language but to help C programmers steer around the traps and obstacles they meet while programming.
The book has 8 chapters, which analyze the problems that can arise in C programming from the angles of lexical analysis, syntax and semantics, linkage, library functions, the preprocessor, and portability. The final chapter offers a number of practical suggestions.
The book suits C programmers who already have some experience; even if you are an expert C programmer, it deserves a place on your desk.

Preface

I read this book N years ago, but I skimmed it, and with the passage of time I have forgotten 90% of it. Yesterday, while sorting my books, I came across it again; it is short, but it is a classic. I am taking the time to reread it and to take notes along the way, recording the essentials.

Chapter 1: Lexical "traps"

1.1 = is different from ==

"=" is the assignment operator in C.
"==" is a relational operator that compares two values.

1.2 & and | are different from && and ||

1.3 The "greedy method" in lexical analysis

The rule is: if the input stream has been broken into tokens up to a given character, the next token is the longest string of characters that could possibly form a token.
Example: == is parsed by the compiler as a single comparison operator, whereas = = (with a space in between) is parsed as two assignment operators.
a=b/*p; was meant as a=b/(*p), but the compiler reads /* as the start of a comment. Write either a=b/ *p or a=b/(*p).
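
The greedy rule can be tried out with a small sketch. The function names below are made up for illustration; the first shows how a---b is tokenized, the second shows a division by a dereferenced pointer written so that no comment is started by accident.

```c
/* Greedy tokenization: "a---b" is read as "a-- - b", never "a - --b". */
int greedy_minus(int a, int b)
{
    return a---b;   /* (a--) - b: the subtraction uses the old value of a */
}

/* Divide b by the value p points to. Writing the slash and the star with
   nothing between them would open a comment, so keep a space (or use
   parentheses) between the two tokens. */
int divide_by_pointee(int b, const int *p)
{
    return b / *p;  /* same as b/(*p) */
}
```

With these definitions, greedy_minus(5, 2) yields 3, and divide_by_pointee(12, &v) yields 3 when v is 4.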

1.4 Integer constants

In C and C++, an octal integer constant is written with a leading 0, such as 0122.
Do not pad decimal constants with a leading 0 to align text, because the padded constant is then read as octal.

1.5 Characters and strings

A sequence of characters in single quotes is treated by the compiler as an integer, no matter how many characters it contains; for example, 'a' and 'abcd' are both integers.

unsigned int value1 = 'tag1';
unsigned int value2 = 'cd';
char value3 = 'abcd';

Compiled with VS2013, the variables end up with the following values:

value1 = 't'<<24 | 'a'<<16 | 'g'<<8 | '1' = 0x74616731
value2 = 'c'<<8 | 'd' = 0x00006364
value3 = 'd' = 0x64

value3 can be assigned a constant with more than one character, but it keeps only the value of the last character.
For value1 and value2, the constant may contain at most four characters; with more than four, the compiler reports an error. The value of a multi-character constant is implementation-defined, so other compilers may give different results.

Chapter 2: Syntactic "traps"

2.1 Understanding function declarations

This section focuses on the declaration, definition, and use of function pointers.
void (*pfunc)();  declares pfunc as a pointer to a function returning void.
typedef void (*PFUNC)();  defines a data type PFUNC, a function pointer type equivalent to void (*)();
PFUNC f;
f can be called in the following way:
(*f)();  ANSI C also allows this to be abbreviated to f();
Note that the call cannot be written in any of the following ways:
*f();  Because the function-call parentheses bind more tightly than *, this actually means *(f());
*f;  This only evaluates an expression; no function call is made;
f;  This only evaluates an expression; no function call is made;
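
As a minimal sketch of the two call forms, the following uses a hypothetical typedef BINOP and helper add, neither of which comes from the book:

```c
typedef int (*BINOP)(int, int);   /* hypothetical function pointer type */

static int add(int a, int b)
{
    return a + b;
}

/* Calls the same function through a pointer in both accepted forms. */
int call_both_ways(int x, int y)
{
    BINOP f = add;          /* a function name converts to a pointer */
    int r1 = (*f)(x, y);    /* explicit dereference, then call */
    int r2 = f(x, y);       /* ANSI C shorthand for the same call */
    return r1 + r2;
}
```

Both forms reach add, so call_both_ways(1, 2) returns 6.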

2.2 Operator precedence

In programming, pay special attention to the precedence of operators.
Operator precedence can be memorized with a mnemonic that lists the groups from highest to lowest: "unary, arithmetic, shift, relational, AND, XOR, OR, logical, conditional, assignment".
"Unary" denotes the unary operators: logical NOT (!), bitwise NOT (~), increment (++), decrement (--), address-of (&), dereference (*);
"Arithmetic" denotes the arithmetic operators: multiply, divide, and remainder (*, /, %) rank higher than add and subtract (+, -);
"Shift" denotes bitwise left shift (<<) and bitwise right shift (>>);
"Relational" denotes the relational operators: the ordering comparisons (>, >=, <, <=) rank higher than the equality comparisons (==, !=);
"AND" denotes bitwise AND (&);
"XOR" denotes bitwise exclusive OR (^);
"OR" denotes bitwise OR (|);
"Logical" denotes the logical operators: logical AND (&&) ranks higher than logical OR (||);
"Conditional" denotes the conditional operator (?:);
"Assignment" denotes the assignment operators (=, +=, -=, *=, /=, %=, >>=, <<=, &=, ^=, |=);
The comma operator (,) has the lowest precedence and does not appear in the mnemonic.
(), [], -> and . are not really operators in the usual sense; they bind most tightly of all.

Note: the unary operators, the conditional operator, and the assignment operators associate from right to left;
the remaining operators associate from left to right.
For example:
*p++ is interpreted as *(p++) rather than (*p)++, because the unary operators associate from right to left.
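
Two consequences of these rules are sketched below with made-up function names. The first function spells out, with explicit parentheses, how the compiler reads the common mistake flags & mask != 0; the second shows the intended test; the third relies on *p++ meaning *(p++).

```c
/* What "flags & mask != 0" really means: != binds tighter than &. */
int has_flag_as_parsed(unsigned flags, unsigned mask)
{
    return flags & (mask != 0);
}

/* The intended test needs its own parentheses. */
int has_flag(unsigned flags, unsigned mask)
{
    return (flags & mask) != 0;
}

/* *p++ fetches the current element and then advances p. */
int sum_array(const int *p, int n)
{
    int total = 0;
    while (n-- > 0)
        total += *p++;
    return total;
}
```

For flags 4 and mask 4, has_flag_as_parsed returns 0 (4 & 1) while has_flag returns 1.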

2.3 Watch the semicolon that ends a statement

In most cases an extra semicolon draws no complaint from the compiler and the program still runs correctly, while a missing semicolon causes a compile error.
However, either mistake can sometimes lead to a serious bug.
For example:

An extra semicolon

if(x==y);
    a=b;

Here a=b is not controlled by the if statement; it always executes.

A missing semicolon

if(x==y)
    return
x=5;

Without the semicolon after return, the code is read as return x=5;, so the function returns an integer instead of returning nothing. In most cases the compiler detects this, but if the function's return type happens to be an integer type, the result is a very serious bug.

struct ab{
    int x;
    int y;
}
main(){}

The semicolon at the end of the struct definition is missing, so the return type of main becomes the struct type.

2.4 The switch statement

Do not forget the break at the end of each case in a switch. Where a break is left out on purpose, say so in an explicit comment to make the code easier to maintain.
As a C beginner I was tripped up by exactly this: a program's output did not match what I expected, and after three days of debugging I finally found that a break was missing.
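
The effect of a missing break can be seen in the following sketch, which simply counts how many case bodies run. The function names and the case values are arbitrary choices for illustration.

```c
/* Without break, control falls from one case into the next. */
int bodies_run_without_break(int color)
{
    int n = 0;
    switch (color) {
    case 1:
        n++;
        /* falls through */
    case 2:
        n++;
        /* falls through */
    case 3:
        n++;
    }
    return n;
}

/* With break, only the matching case body runs. */
int bodies_run_with_break(int color)
{
    int n = 0;
    switch (color) {
    case 1:
        n++;
        break;
    case 2:
        n++;
        break;
    case 3:
        n++;
        break;
    }
    return n;
}
```

For an input of 1, the first function runs three bodies and the second runs one. The "falls through" comments mark the omissions as deliberate, as recommended above.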

2.5 Function calls

2.6 The "dangling" else problem

An else matches the nearest unmatched if.
For example:
if (x==y)
    if (a==b)
        printf("a==b");
else
    printf("x!=y");

Despite the indentation, the else above belongs to if (a==b). Braces attach it to the outer if:

if (x==y) {
    if (a==b)
        printf("a==b");
} else
    printf("x!=y");

The two versions behave differently.
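
To compare the two layouts, the sketch below returns a string instead of printing. The first function adds the braces that the compiler effectively supplies for the unbraced layout; the function names and the returned strings are illustrative only.

```c
/* The unbraced layout as the compiler reads it:
   the else pairs with the inner if. */
const char *as_compiler_reads_it(int x, int y, int a, int b)
{
    if (x == y) {
        if (a == b)
            return "a==b";
        else
            return "inner else";
    }
    return "no output";
}

/* Braces around the inner if hand the else to the outer if. */
const char *else_on_outer_if(int x, int y, int a, int b)
{
    if (x == y) {
        if (a == b)
            return "a==b";
    } else {
        return "x!=y";
    }
    return "no output";
}
```

When x differs from y, only the second function reports "x!=y"; the first produces nothing at all.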

Chapter 3: Semantic "traps"

3.1 Pointers and arrays

C has only one-dimensional arrays; a multidimensional array is really a one-dimensional array whose elements are themselves arrays.
An array subscript operation is equivalent to the corresponding pointer operation. Both a[i] and i[a] compile and run correctly. p=&a[0] is the same as p=a.
Except when a is the operand of sizeof, in all other cases the array name a stands for a pointer to element 0 of the array a.
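
These equivalences can be checked directly. The following is a small sketch with illustrative names and array contents:

```c
#include <stddef.h>

/* All four expressions name the same element of the array. */
int same_element(void)
{
    int a[5] = {10, 20, 30, 40, 50};
    int *p = a;                      /* same as p = &a[0] */
    return a[2] == 2[a] && a[2] == *(a + 2) && a[2] == p[2];
}

/* sizeof is the exception: here a is the whole array, not a pointer. */
size_t element_count(void)
{
    int a[5] = {0};
    return sizeof a / sizeof a[0];
}
```

same_element() returns 1 and element_count() returns 5.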

3.2 Pointers are not arrays

char *r, *s, *t;
r = malloc(strlen(s) + strlen(t));
strcpy(r, s);
strcat(r, t);

This code has three problems:
1. The return value of malloc is not checked for NULL.
2. Note the difference between sizeof and strlen: strlen counts the characters in the string and does not include the terminating '\0', so the allocation should be strlen(s)+strlen(t)+1 bytes.
3. The allocated memory should be freed promptly to avoid a memory leak.
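
A corrected version might look like the sketch below. The function name concat and its ownership convention (the caller frees the result) are assumptions made for this example.

```c
#include <stdlib.h>
#include <string.h>

/* Returns a newly allocated string holding s followed by t,
   or NULL if the allocation fails. The caller must free() it. */
char *concat(const char *s, const char *t)
{
    size_t len_s = strlen(s);
    size_t len_t = strlen(t);
    char *r = malloc(len_s + len_t + 1);   /* +1 for the terminating '\0' */

    if (r == NULL)
        return NULL;
    memcpy(r, s, len_s);
    memcpy(r + len_s, t, len_t + 1);       /* copies t's '\0' as well */
    return r;
}
```

A caller checks the result for NULL, uses the string, and then calls free() on it.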

3.3 Array declarations as parameters

C automatically converts an array declaration in a formal parameter into the corresponding pointer declaration.

3.4 Avoid "synecdoche"

char *p, *q;
p = "xyz";
q = p;

In this example, p and q point to the same piece of memory; copying a pointer does not copy the data it points to.

3.5 A null pointer is not an empty string

When programming, keep in mind the difference between a null pointer and an empty string.
For example:

char *p, *q;
p = NULL;
q = malloc(10);
*q = 0;

In this example, p is a null pointer; q is not a null pointer, and it points to an empty string.
The following code checks that p is neither a null pointer nor an empty string:

char *p;
if(p!=NULL && *p){}
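
The same test can be wrapped in a helper. This is a minimal sketch, and has_text is a hypothetical name:

```c
#include <stddef.h>

/* Returns 1 if p points to a string with at least one character,
   0 if p is a null pointer or points to an empty string. */
int has_text(const char *p)
{
    return p != NULL && *p != '\0';   /* the NULL test must come first */
}
```

has_text(NULL) and has_text("") both return 0, while has_text("x") returns 1.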

3.6 Boundary calculations and asymmetric bounds

Array subscripts in C start at 0.
Code that processes array elements in a loop very easily runs past the array bounds and needs special care.

3.7 Order of evaluation

In C, the order of evaluation is defined only for the operators &&, ||, ?: and , (comma).

For && and ||, C requires the left operand to be evaluated first; the right operand is evaluated only if needed.
Programs can rely on this short-circuit behavior:
a&&b: when a is false, the result is false and b is not evaluated; b is evaluated only when a is true.
a||b: when a is true, the result is true and b is not evaluated; b is evaluated only when a is false.
For example:

if(count!=0 && (a/count)>2){}

The left expression is evaluated first to check whether count is 0, and a/count is computed only when count is not 0. Here the order of evaluation is essential; otherwise a division-by-zero error would occur.
Another example:

char *p;
if(p!=NULL && *p){}

This first checks whether p is a null pointer, and only then whether it points to an empty string.

For the conditional operator ?:, as in a?b:c:

Operand a is evaluated first; then, depending on the value of a, either operand b or operand c is evaluated, never both.

The comma operator evaluates its left operand first, discards that value, and then evaluates its right operand.

The order in which all other C operators evaluate their operands is unspecified. In particular, the assignment operator guarantees no order of evaluation.
For example:

i=0;
while(i<n){
    y[i++]=x[i];
}

In this example, the i++ in the left operand may be performed either before or after x[i] is read.
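
One way to remove the ambiguity from the loop above is to change i in exactly one place. The following sketch uses a hypothetical function name:

```c
/* Copies n elements of x into y with a well-defined order of events. */
void copy_ints(int *y, const int *x, int n)
{
    int i;
    for (i = 0; i < n; i++)
        y[i] = x[i];      /* i changes only in the loop header */
}
```

Because the increment no longer shares an expression with another use of i, every compiler must produce the same result.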

3.8 The operators &&, || and !

Note the difference between logical AND (&&) and bitwise AND (&), and between logical OR (||) and bitwise OR (|).
The logical operators short-circuit; the bitwise operators do not.

3.9 Integer overflow

Arithmetic code should take the effect of integer overflow into account. If an overflow would affect the result, test for it.
One way to test, for non-negative a and b, is:

// treat both a and b as unsigned numbers
if((unsigned int)a+(unsigned int)b>INT_MAX){}

The limits.h header file contains the related macro definitions:

#define INT_MIN     (-2147483647 - 1) /* minimum (signed) int value */
#define INT_MAX       2147483647    /* maximum (signed) int value */
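
The unsigned comparison above is meant for non-negative operands. As an alternative sketch (not the book's method), the check below compares against INT_MAX and INT_MIN before adding, so it also covers negative values; the function name is hypothetical.

```c
#include <limits.h>

/* Returns 1 if a + b would overflow an int, 0 otherwise.
   The overflowing addition itself is never performed. */
int add_would_overflow(int a, int b)
{
    if (b > 0)
        return a > INT_MAX - b;   /* sum would exceed INT_MAX */
    return a < INT_MIN - b;       /* sum would drop below INT_MIN */
}
```

A caller tests add_would_overflow(a, b) first and computes a + b only when it returns 0.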

3.10 Provide a return value for main

If main is declared without a return type, it returns int by default. The return value tells the operating system the result of execution: 0 indicates success and a nonzero value indicates failure.
Omitting the return value causes no problem in most cases. However, if the system pays attention to the result of execution, main must explicitly return a meaningful value.

Chapter 4: Linkage

4.1 What is a linker
4.2 Declarations and definitions
4.3 Name conflicts and the static modifier
4.4 Formal parameters, actual arguments, and return values
4.5 Checking external types
4.6 Header files

Chapter 4 is mainly about the naming conflicts that arise when multiple files are linked together.
To solve this problem, apply the static modifier to functions and global variables: a function or global variable declared static is visible only within the source file that defines it and cannot be accessed from other source files.
If the same global variable must be shared among several source files, the best practice is to declare extern int a; in a header file and to define int a; in exactly one source file.
Also, when a global variable is used in several source files, make sure its type is the same everywhere.
Consider the following program, in which one file contains the definition:
char filename[] = "/etc/passwd";
and another file contains the declaration:
extern char* filename;
In many contexts the array name filename decays to a pointer when it is used, so the code compiles, but the program is still wrong.
For example, enter the following code in ConsoleApplication1.cpp:

#include "stdafx.h"
#include <stdlib.h>
#include <stdio.h>
#include <string.h>

char filename[] = "abcdefgh";
unsigned int myfunc();

int _tmain(int argc, _TCHAR* argv[])
{
    unsigned int i;
    for (i = 0; i < strlen(filename); i++)
    {
        printf("%c", i[filename]);
    }
    myfunc();
    system("pause");
    return 0;
}

Enter the following code in Source.cpp:

#include <stdlib.h>
#include <stdio.h>
#include <string.h>

extern char *filename;

unsigned int myfunc()
{
    unsigned int i = 0;
    char b = *filename;
    return b;
}

Compiled with VS2013 and run, the program fails with an invalid memory access.
char b = *filename; compiles to the following assembly code:
001314B5 mov eax,dword ptr ds:[00138000h]
001314BA mov cl,byte ptr [eax]
001314BC mov byte ptr [b],cl
As you can see, the compiler treats filename as a pointer variable: it loads the first four bytes stored at filename (which are really characters of the array) as an address and then dereferences that address.
If Source.cpp declares extern char filename[]; instead, the problem does not occur.

Chapter 5: Library functions

5.1 getchar returns an integer

The return value of getchar() is an integer type, not a character type.
Besides the characters typed at the terminal, getchar also returns EOF (usually defined as -1 in the library) when it meets Ctrl+D (on Linux) or the end of a file, so its return type must be able to hold EOF in addition to every character value.

char c;
while((c=getchar())!=EOF){}

This code is likely to go wrong, because the return value of getchar is truncated when it is assigned to c.
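
A corrected loop keeps the result in an int. The sketch below takes a FILE * so that it can be tried on any stream, and uses getc, since getc(stdin) behaves like getchar(); the function name is hypothetical.

```c
#include <stdio.h>

/* Counts the bytes remaining in fp. c is an int so that every byte
   value, including 0xFF, stays distinct from EOF. */
long count_bytes(FILE *fp)
{
    int c;
    long n = 0;

    while ((c = getc(fp)) != EOF)
        n++;
    return n;
}
```

With a char instead of an int, a byte of value 0xFF could be mistaken for EOF on platforms where char is signed, and on platforms where char is unsigned the loop might never end.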

5.2 Updating a sequential file

When fread and fwrite are called alternately on the same file, call fseek once between the two calls.

5.3 Buffered output and memory allocation

int main()
{
    int c;
    char buf[512];
    setbuf(stdout, buf);
    while ((c = getchar()) != EOF)
        putchar(c);
}

The program calls the library function setbuf to tell the I/O library to collect output in buf first. When main returns, buf is released; but before control goes back to the operating system, the C runtime library flushes the buffer as part of its cleanup and therefore refers to the already released buf. Declaring buf as static (or global), with a size of BUFSIZ, avoids the problem.

5.4 Using errno to detect errors

When using errno to detect errors, first establish from the function's return value that the call has actually failed, and only then examine errno.
For example, when fopen is called to open a file, errno may be set even if the file is opened successfully.
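
The order of the two checks can be sketched as follows. The function name and its return convention are assumptions for this example, and the C standard does not require fopen to set errno at all, which is why the code falls back to -1.

```c
#include <errno.h>
#include <stdio.h>

/* Tries to open path for reading. Returns 0 on success; on failure
   returns the errno value, or -1 if the library left errno at 0. */
int try_open(const char *path)
{
    FILE *fp;
    int saved;

    errno = 0;
    fp = fopen(path, "r");
    if (fp == NULL) {            /* the return value says the call failed */
        saved = errno;           /* only now is errno worth reading */
        return saved != 0 ? saved : -1;
    }
    fclose(fp);                  /* success: errno is deliberately ignored */
    return 0;
}
```

The function never looks at errno when fopen succeeds.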

5.5 The library function signal

Chapter 6: The preprocessor

6.1 Spaces in macro definitions cannot be ignored
6.2 Macros are not functions
6.3 Macros are not statements
6.4 Macros are not type definitions

This chapter is mainly about macro definitions:
In essence, a C macro is text substitution performed before compilation. For that reason, a macro definition should normally wrap each parameter, and the whole body, in parentheses, so that operator precedence after substitution does not produce an unexpected order of operations.
Avoid using the ++ and -- operators, or anything else that changes the value of a pointer, in the arguments of a macro. For example:

#define toupper(c) ((c)>='a' && (c)<='z'? (c)+'A'-'a':(c))
toupper(*p++);

The result will be a surprise, because *p++ appears, and may be evaluated, more than once in the expansion.
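
How far the pointer really moves can be measured. In this sketch the macro is renamed TOUPPER_MACRO to avoid clashing with the standard library, the helper names are made up, and an ASCII character set is assumed.

```c
#define TOUPPER_MACRO(c) ((c) >= 'a' && (c) <= 'z' ? (c) + 'A' - 'a' : (c))

/* Function version: the argument is evaluated exactly once. */
static int toupper_func(int c)
{
    return c >= 'a' && c <= 'z' ? c + 'A' - 'a' : c;
}

/* How far does p advance when *p++ is passed to the macro? */
int macro_advance(const char *s)
{
    const char *p = s;
    (void)TOUPPER_MACRO(*p++);     /* *p++ is expanded into several places */
    return (int)(p - s);
}

/* How far does p advance when *p++ is passed to the function? */
int func_advance(const char *s)
{
    const char *p = s;
    (void)toupper_func(*p++);      /* *p++ runs once */
    return (int)(p - s);
}
```

For the string "abcd" the macro advances p by three characters (both comparisons and the conversion each consume one), while the function advances it by one.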

Chapter 7: Portability pitfalls

7.1 Coping with changes in the C language standard

7.2 Identifier names

The first character of an identifier name cannot be a digit (that is, it must be an underscore or an uppercase or lowercase letter). ANSI guarantees only 6 significant characters in an external identifier name and 31 significant characters in an internal (within a function) identifier name. An external identifier is one that takes part in linking, including the names of functions and global variables shared between files.
An internal identifier is one that appears only in the file that defines it.
The following is excerpted from MSDN:
Although ANSI allows 6 significant characters in external identifier names and 31 in internal (within a function) identifier names, the Microsoft C compiler allows 247 characters in an internal or external identifier name. If you are not concerned about ANSI compatibility, you can change this default to a smaller or larger number with the /H (restrict length of external names) option. External identifiers (declared at global scope or with storage class extern) may be subject to additional naming restrictions, because these identifiers must be processed by other software such as linkers.

7.3 Size of integers

The size of an integer differs between compilers: it may be 2 bytes, 4 bytes, or even 8 bytes. Keep this in mind when porting code.

7.4 Are characters signed or unsigned integers?

When a char variable is converted to an int, the compiler sign-extends it if char is a signed type (as it is in the example below). Take special care when the highest bit of the char variable is 1.
unsigned int a = 0;
int b = 0;
int d = 0;
char c = 0x80;
a = c;  // c is first converted to int; because of sign extension, a=0xffffff80
b = c;  // c is first converted to int; because of sign extension, b=0xffffff80
d = (unsigned char)c;  // the cast makes c unsigned, so no sign extension is performed, d=0x00000080
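
The same behavior can be reproduced on any compiler by spelling out the signedness. This sketch uses explicit signed char parameters and hypothetical function names, and assumes 8-bit chars.

```c
/* Widening a signed char copies its sign bit into the upper bits. */
int widen_signed(signed char c)
{
    return c;
}

/* Converting to unsigned char first keeps the value in 0..255. */
int widen_as_byte(signed char c)
{
    return (unsigned char)c;
}
```

For the bit pattern 0x80 (the value -128 as a signed char), widen_signed returns -128 and widen_as_byte returns 128.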

7.5 Shift Operators

Assembly language has the logical shifts SHL and SHR and the arithmetic shifts SAL and SAR. SHL and SAL produce the same result; SAR treats the operand as a signed number, so its result depends on the sign bit, whereas SHR treats the operand as an unsigned number.
C has only the left shift << and the right shift >>.
When the operand is unsigned, >> is implemented with SHR; when the operand is signed, the choice is left to the implementation, and compilers typically use SAR.

In C, the shift count must be greater than or equal to 0 and strictly less than the number of bits in the operand being shifted (32 for a 32-bit int); otherwise the result is undefined.
Right-shifting a negative signed number is not equivalent to dividing it by a power of 2.
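
The difference shows up with odd negative numbers. The following is a minimal sketch with hypothetical function names:

```c
/* Integer division truncates toward zero (guaranteed since C99). */
int half_by_division(int n)
{
    return n / 2;
}

/* Right-shifting a negative value is implementation-defined; with an
   arithmetic shift it rounds toward minus infinity instead. */
int half_by_shift(int n)
{
    return n >> 1;
}
```

half_by_division(-7) is -3, whereas on a typical two's-complement compiler that uses an arithmetic shift half_by_shift(-7) is -4. For non-negative values the two functions agree.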

7.6 Memory Location 0

Location 0 is readable on some systems and unreadable on others (such as Windows).

7.7 How division truncates
7.8 The size of random numbers
7.9 Case conversion
7.10 Free first, then reallocate
7.11 An example of portability problems

Chapter 8: Advice and answers

8.1 Recommendations
8.2 Answers


This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only.
