In-depth understanding of C language-02-Data Encoding

Last Update:2014-03-11 Source: Internet

Author: User


In information system modeling, the first step is information encoding, that is, deciding how information is stored in a computer.

To keep hardware design simple, computers are built from binary circuits. In addition, because of technical limitations, the length of a data word is limited.

For example, most computers have a 32-bit or 64-bit data bus. On a 32-bit system, a single word can take 2 to the power of 32, that is, 4294967296, distinct values.

Obviously, this is a finite set, whereas real-world (analog) information usually forms an infinite set.

Information encoding therefore means establishing a mapping function: f(information) = encoding of the information in the computer.

Designing an encoding involves choosing a data size. In real projects, we usually pick a compromise between current requirements and future extension.
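As a rough illustration of such a mapping function, the sketch below encodes a temperature reading into a single byte. The range (-40.0 to 87.5 degrees), the 0.5-degree step, and the names encode_temp and decode_temp are assumptions chosen for this example; they do not come from the original article.

```c
#include <stdint.h>

/* Hypothetical encoding: -40.0 .. 87.5 degrees in 0.5-degree steps -> 0 .. 255. */
#define TEMP_MIN  (-40.0)
#define TEMP_STEP (0.5)

/* Map a real-valued temperature to its 8-bit code, clamping out-of-range input. */
uint8_t encode_temp(double celsius)
{
    double steps = (celsius - TEMP_MIN) / TEMP_STEP;
    if (steps < 0.0)
        return 0;
    if (steps > 255.0)
        return 255;
    return (uint8_t)(steps + 0.5);   /* round to the nearest step */
}

/* Recover the quantized temperature from its code. */
double decode_temp(uint8_t code)
{
    return TEMP_MIN + code * TEMP_STEP;
}
```

Choosing one byte here is the kind of compromise described above: it limits both the range and the resolution, and widening either later means changing the encoding.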

For the C language, we mainly consider the following three aspects:

Data type size

Byte order

Data Alignment

Differences in data type sizes are a major portability problem in C.

Therefore, before working on a new system, consult the chip manual (datasheet) or run the following code to check.

/* sizeof yields a size_t, so the matching format is %zu (C99) */
printf("char[%zu] short[%zu] int[%zu] long[%zu] long long[%zu] float[%zu] double[%zu]\n",
       sizeof(char), sizeof(short), sizeof(int), sizeof(long), sizeof(long long),
       sizeof(float), sizeof(double));

On a 32-bit system, the output is typically:

char[1] short[2] int[4] long[4] long long[8] float[4] double[8]
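When the exact width matters, one common way to sidestep these differences is to use the fixed-width types from stdint.h and to check assumptions at compile time. The following is a minimal sketch assuming a C11 compiler; the struct sample_record and the function record_payload_bytes are hypothetical names used only for illustration.

```c
#include <limits.h>
#include <stdint.h>

/* Compile-time checks (C11): the build fails if an assumption does not hold. */
_Static_assert(CHAR_BIT == 8, "this code assumes 8-bit bytes");
_Static_assert(sizeof(int32_t) == 4, "int32_t must be exactly 4 bytes");
_Static_assert(sizeof(int64_t) == 8, "int64_t must be exactly 8 bytes");

/* Hypothetical record whose field widths do not depend on the platform's int or long. */
struct sample_record {
    uint8_t  kind;
    uint16_t length;
    uint32_t id;
    int64_t  timestamp;
};

/* Sum of the field sizes, not counting any padding the compiler inserts. */
unsigned record_payload_bytes(void)
{
    return (unsigned)(sizeof(uint8_t) + sizeof(uint16_t) +
                      sizeof(uint32_t) + sizeof(int64_t));
}
```

The field widths are now the same on every platform that provides these types, although the padding between fields can still differ.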

Byte order

Generally, the x86 architecture is little-endian, while MIPS systems are commonly configured as big-endian.

For example, on x86 the value 0x12345678 is stored in memory as the byte sequence 0x78 0x56 0x34 0x12,

while on a big-endian MIPS system it is stored as 0x12 0x34 0x56 0x78.

You can write the following code to test the endianness:

#include <stdio.h>

int checkEndian(void)
{
    int x = 0x12345678;
    /* Inspect the byte stored at the lowest address of x. */
    if (*(char *)&x == 0x78) {
        printf("Little-Endian\n");
        return 0;
    } else {
        printf("Big-Endian\n");
        return 1;
    }
}
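When data has to be exchanged between machines with different byte orders, a runtime check is often unnecessary: the bytes can be written in a fixed order with shifts. The sketch below stores a 32-bit value most-significant byte first; put_be32 and get_be32 are hypothetical helper names, not functions from the original article or from a standard library.

```c
#include <stdint.h>

/* Store a 32-bit value most-significant byte first, regardless of host byte order. */
void put_be32(uint8_t out[4], uint32_t v)
{
    out[0] = (uint8_t)(v >> 24);
    out[1] = (uint8_t)(v >> 16);
    out[2] = (uint8_t)(v >> 8);
    out[3] = (uint8_t)v;
}

/* Rebuild the value from four bytes stored most-significant byte first. */
uint32_t get_be32(const uint8_t in[4])
{
    return ((uint32_t)in[0] << 24) | ((uint32_t)in[1] << 16) |
           ((uint32_t)in[2] << 8)  |  (uint32_t)in[3];
}
```

Because shifts operate on values rather than on memory, the same code produces the same byte sequence on both little-endian and big-endian hosts.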

Data alignment

Why align data? Partly for performance, and partly because of limitations in chip design.

For example, our company's MIPS chip requires data addresses to be 4-byte aligned; otherwise a bus error is raised.
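On such a chip, casting an arbitrary byte pointer to int * and dereferencing it is what typically triggers the fault. A common workaround is to copy the bytes into a properly aligned variable with memcpy. This is a minimal sketch; load_u32 is a hypothetical helper name.

```c
#include <stdint.h>
#include <string.h>

/* Read a 32-bit value from an arbitrary, possibly unaligned, address.
   memcpy is defined in terms of bytes, so the code never performs a
   misaligned 32-bit access itself. The result is in host byte order. */
uint32_t load_u32(const void *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof v);
    return v;
}
```

For example, load_u32(buf + 1) is safe even though buf + 1 is not 4-byte aligned.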

Of course, given the usual trade-off between time and space, alignment inevitably wastes some memory. When the data is stored in E2PROM (EEPROM), this increases hardware cost.

How can space be saved? One way is to compress the data at the expense of time. A more convenient way is to control alignment yourself.

Of course, the best design considers alignment as a whole and orders the data members sensibly.
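Ordering the members sensibly often means placing the largest members first. The sketch below compares two orderings of the same three members; the struct names loose and tight and the function bytes_saved are hypothetical, and the actual sizes depend on the platform ABI.

```c
#include <stddef.h>

/* The same three members in two different orders. */
struct loose { char ch; int x; short y; };   /* padding after ch and after y */
struct tight { int x; short y; char ch; };   /* largest member first */

/* Number of bytes the reordering saves on the current platform. */
size_t bytes_saved(void)
{
    return sizeof(struct loose) - sizeof(struct tight);
}
```

On a typical ABI with a 4-byte int and a 2-byte short, struct loose occupies 12 bytes and struct tight occupies 8, with no compiler-specific directives involved.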

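The method for controlling alignment depends on the compiler. Generally:

GCC uses:

__attribute__((aligned(n)))    (on a struct, this can only increase the alignment)
__attribute__((packed))        (removes the padding between members)

VC uses:

#pragma pack(n)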

You can write the following code for testing:

VC:

typedef struct stTest1
{
    char ch;
    int x;
    short y;
} Test1;

#pragma pack(1)     /* pack the following structs on 1-byte boundaries */
typedef struct stTest2
{
    char ch;
    int x;
    short y;
} Test2;
#pragma pack()      /* restore the default packing */

GCC:

typedef struct stTest1
{
    char ch;
    int x;
    short y;
} Test1;

/* aligned(1) alone cannot shrink a struct; packed removes the padding. */
typedef struct stTest2
{
    char ch;
    int x;
    short y;
} __attribute__((packed)) Test2;

Logging code:

printf("sizeof(Test1) = %zu, sizeof(Test2) = %zu\n", sizeof(Test1), sizeof(Test2));
printf("__alignof(Test1) = %zu, __alignof(Test2) = %zu\n", __alignof(Test1), __alignof(Test2));

Output:

sizeof(Test1) = 12, sizeof(Test2) = 7
__alignof(Test1) = 4, __alignof(Test2) = 1
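To see where the padding in an unpacked struct actually sits, the standard offsetof macro from stddef.h can be used. The following sketch assumes a struct with the same members as Test1; the names Layout and print_layout are hypothetical.

```c
#include <stddef.h>
#include <stdio.h>

/* Same members, in the same order, as Test1. */
typedef struct { char ch; int x; short y; } Layout;

/* Print where each member starts; gaps between members are padding. */
void print_layout(void)
{
    printf("ch@%zu x@%zu y@%zu size=%zu\n",
           offsetof(Layout, ch), offsetof(Layout, x),
           offsetof(Layout, y), sizeof(Layout));
}
```

Any difference between a member's offset and the end of the previous member is padding inserted for alignment.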

Next is data representation.

We usually use the following methods:

1> Use individual bits, designing the storage position and size of each item.

For example, a deck of playing cards has four suits with 13 ranks each, so we can encode a card as follows:

One byte has 8 bits. Bits 0-3 (4 bits) hold the rank A to K, bits 4-5 (2 bits) hold the suit, and the two jokers get special encodings.

Bit: 7 6 5 4 3 2 1 0

Rank (bits 0-3): 0001 -> A, 0010 -> 2, ..., 1101 -> K

Suit (bits 4-5): 00 -> hearts, 01 -> spades, 10 -> clubs, 11 -> diamonds

Jokers (special encodings): 000000 and 111111
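The card layout described above can be sketched with masks and shifts. The function and macro names are hypothetical, and the English suit names attached to the codes 0 to 3 are an assumed reading of the suit list above; the joker encodings are left out.

```c
#include <stdint.h>

/* Suit codes 0..3 in the order listed above (English names are an assumption). */
enum suit { HEARTS = 0, SPADES = 1, CLUBS = 2, DIAMONDS = 3 };

#define CARD_RANK_MASK  0x0Fu   /* bits 0-3: rank, 1 (A) .. 13 (K) */
#define CARD_SUIT_SHIFT 4       /* bits 4-5: suit */
#define CARD_SUIT_MASK  0x03u

/* Pack a suit and a rank into one byte. */
uint8_t card_encode(enum suit s, unsigned rank)
{
    return (uint8_t)((((unsigned)s & CARD_SUIT_MASK) << CARD_SUIT_SHIFT) |
                     (rank & CARD_RANK_MASK));
}

/* Extract the rank from bits 0-3. */
unsigned card_rank(uint8_t card)
{
    return card & CARD_RANK_MASK;
}

/* Extract the suit from bits 4-5. */
enum suit card_suit(uint8_t card)
{
    return (enum suit)((card >> CARD_SUIT_SHIFT) & CARD_SUIT_MASK);
}
```

For example, card_encode(DIAMONDS, 13) yields 0x3D: suit bits 11 followed by rank bits 1101.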

2> Use the basic data types provided by C (char, short, int, unsigned int, float, and double), and build complex data structures from arrays and structs.

Non-linear structures also require pointers.

C's design philosophy is that whoever uses a feature is responsible for it, so before using the basic types you must understand the following:

1> The representable range of each type

2> The binary representation of a value

3> Whether precision can be lost or the value can overflow, and how to detect it
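For the third question, one way to detect signed integer overflow is to compare against the limits before performing the operation. This is a minimal sketch for int addition; add_int_checked is a hypothetical helper name.

```c
#include <limits.h>
#include <stdbool.h>

/* Add two ints. Returns false, leaving *out untouched, if the sum would overflow.
   The check comes before the addition because signed overflow is undefined behavior. */
bool add_int_checked(int a, int b, int *out)
{
    if ((b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b))
        return false;
    *out = a + b;
    return true;
}
```

Checking the result after the addition is not reliable for signed types, since the compiler may assume that overflow never happens.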

2.1 Representable range

Taking unsigned integers as an example, on a 32-bit system:

unsigned char, 8 bits: 0 ~ 255

unsigned short, 16 bits: 0 ~ 65535

unsigned long, 32 bits: 0 ~ 4294967295

unsigned long long, 64 bits: 0 ~ 18446744073709551615

You do not need to memorize the exact values; knowing the order of magnitude is enough.
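The upper bounds listed above all follow one pattern: an unsigned type with n value bits can hold 0 to 2^n - 1. The sketch below computes that bound; umax_for_bits is a hypothetical helper, and it assumes unsigned long long is 64 bits wide.

```c
#include <limits.h>

/* Largest value of an unsigned type with the given number of value bits: 2^bits - 1.
   Intended for 1 <= bits <= 64, assuming unsigned long long is 64 bits wide. */
unsigned long long umax_for_bits(unsigned bits)
{
    if (bits >= 64)
        return ULLONG_MAX;          /* shifting by 64 would be undefined */
    return (1ULL << bits) - 1ULL;
}
```

For example, umax_for_bits(16) returns 65535, which matches the unsigned short row.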

The exact values can be obtained with the following code:

#include <limits.h>
#include <stdio.h>

int testLimit(void)
{
    printf("min of char: %d\n", SCHAR_MIN);
    printf("max of char: %d\n", SCHAR_MAX);
    printf("min of short: %d\n", SHRT_MIN);
    printf("max of short: %d\n", SHRT_MAX);
    printf("min of int: %d\n", INT_MIN);
    printf("max of int: %d\n", INT_MAX);
    printf("min of long: %ld\n", LONG_MIN);
    printf("max of long: %ld\n", LONG_MAX);
    printf("min of long long: %lld\n", LLONG_MIN);
    printf("max of long long: %lld\n", LLONG_MAX);

    printf("max of unsigned char: %d\n", UCHAR_MAX);
    printf("max of unsigned short: %d\n", USHRT_MAX);
    printf("max of unsigned int: %u\n", UINT_MAX);
    printf("max of unsigned long: %lu\n", ULONG_MAX);
    printf("max of unsigned long long: %llu\n", ULLONG_MAX);
    return 0;
}

Output on a 32-bit system:

min of char: -128
max of char: 127
min of short: -32768
max of short: 32767
min of int: -2147483648
max of int: 2147483647
min of long: -2147483648
max of long: 2147483647
min of long long: -9223372036854775808
max of long long: 9223372036854775807
max of unsigned char: 255
max of unsigned short: 65535
max of unsigned int: 4294967295
max of unsigned long: 4294967295
max of unsigned long long: 18446744073709551615

For floating-point numbers, see the float.h header file. The following summary is taken from Baidu Library:

Double:

DBL_DIG: number of decimal digits of precision of a double
DBL_EPSILON: difference between 1.0 and the next representable double (1.0 + DBL_EPSILON != 1.0)
DBL_MANT_DIG: number of digits in the mantissa
DBL_MAX: maximum value
DBL_MAX_10_EXP: maximum base-10 exponent
DBL_MAX_EXP: maximum base-2 exponent
DBL_MIN: minimum normalized positive value
DBL_MIN_10_EXP: minimum base-10 exponent
DBL_MIN_EXP: minimum base-2 exponent

Float:

FLT_DIG: number of decimal digits of precision of a float
FLT_EPSILON: difference between 1.0 and the next representable float (1.0 + FLT_EPSILON != 1.0)
FLT_MANT_DIG: number of digits in the mantissa
FLT_MAX: maximum value
FLT_MAX_10_EXP: maximum base-10 exponent
FLT_MAX_EXP: maximum base-2 exponent
FLT_MIN: minimum normalized positive value
FLT_MIN_10_EXP: minimum base-10 exponent
FLT_MIN_EXP: minimum base-2 exponent
FLT_RADIX: radix (base) of the exponent
FLT_ROUNDS: rounding mode for floating-point addition

Long double:

LDBL_DIG: number of decimal digits of precision of a long double
LDBL_EPSILON: difference between 1.0 and the next representable long double (1.0 + LDBL_EPSILON != 1.0)
LDBL_MANT_DIG: number of digits in the mantissa
LDBL_MAX: maximum value
LDBL_MAX_10_EXP: maximum base-10 exponent
LDBL_MAX_EXP: maximum base-2 exponent
LDBL_MIN: minimum normalized positive value
LDBL_MIN_10_EXP: minimum base-10 exponent
LDBL_MIN_EXP: minimum base-2 exponent
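The meaning of the epsilon constants can be checked directly. The sketch below tests whether adding a small value to 1.0 changes it; changes_one is a hypothetical helper name, and the behavior described in the note assumes the default round-to-nearest mode.

```c
#include <float.h>
#include <stdbool.h>

/* True if adding eps to 1.0 yields a value different from 1.0.
   The volatile store forces the sum to be rounded to double before the
   comparison, even on targets that compute in extended precision. */
bool changes_one(double eps)
{
    volatile double sum = 1.0 + eps;
    return sum != 1.0;
}
```

With this helper, changes_one(DBL_EPSILON) is true, while changes_one(DBL_EPSILON / 4) is false because the increment is lost to rounding. This is the kind of precision loss the questions above ask about.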

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
