[Character encoding]
In a computer system, all data is encoded. Several encoding methods are in use; the most common are:
1. Unsigned encoding: the conventional binary representation, for numbers greater than or equal to 0.
2. Two's complement encoding: the most common representation of signed integers, which can be negative, zero, or positive.
3. Floating-point encoding: a base-2 version of scientific notation for representing real numbers.
A floating-point number can be thought of as mantissa * 2 ^ exponent. This expression already suggests how a floating-point number is stored in the computer.
For example, consider a simplified 16-bit layout for a real number like a C float (on most platforms the actual C float is 32 bits wide):
bit15 ...... bit9 bit8 ...... bit0
Bits 15 to 9 can hold the mantissa, while bits 8 to 0 hold the power of 2. A base conversion then turns the binary form into the decimal value:
float x (binary) --> float x (decimal)
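To make the mantissa and exponent split concrete, here is a minimal sketch that pulls the bit fields out of a real C float. It assumes the platform uses the 32-bit IEEE 754 layout (1 sign bit, 8 exponent bits, 23 fraction bits), which differs from the simplified 16-bit picture above. The names float_fields and decode_float are made up for this illustration.

```c
#include <stdint.h>
#include <string.h>

/* Bit fields of a 32-bit float. Assumption: IEEE 754 binary32 layout. */
struct float_fields {
    uint32_t sign;      /* 0 for positive, 1 for negative */
    uint32_t exponent;  /* biased exponent; subtract 127 to get the power of 2 */
    uint32_t fraction;  /* low 23 bits of the mantissa (the leading 1 is implicit) */
};

/* Hypothetical helper: split a float into its three bit fields. */
struct float_fields decode_float(float x)
{
    uint32_t bits;
    struct float_fields f;

    /* memcpy copies the raw bytes without converting the value. */
    memcpy(&bits, &x, sizeof bits);

    f.sign     = bits >> 31;
    f.exponent = (bits >> 23) & 0xFFu;
    f.fraction = bits & 0x7FFFFFu;
    return f;
}
```

For 6.0f, which is 1.5 * 2 ^ 2, the biased exponent field is 129 (2 + 127) and the fraction field is 0x400000 (the .5 part of the mantissa).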
[Encoding attributes]
To design an encoding system, you must define several attributes:
1. Value range
2. Properties of the arithmetic operations, that is, which operations can be implemented
3. Bit-level representation in the system, that is, the storage layout
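As a small illustration of how the value range depends on the encoding, the sketch below interprets the same 8-bit pattern once as an unsigned number and once as a two's complement number. The function names are hypothetical and chosen only for this example.

```c
#include <stdint.h>

/* Interpret an 8-bit pattern as an unsigned value: range 0..255. */
unsigned as_unsigned8(uint8_t bits)
{
    return bits;
}

/* Interpret the same pattern as two's complement: range -128..127.
   The top bit carries weight -128, so patterns with that bit set
   are 256 less than their unsigned reading. */
int as_twos_complement8(uint8_t bits)
{
    return (bits & 0x80u) ? (int)bits - 256 : (int)bits;
}
```

The pattern 0xFF reads as 255 unsigned but as -1 in two's complement, while 0x7F reads as 127 under both encodings.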
[Number bases]
Common number bases include binary, octal, decimal, and hexadecimal. Decimal is the one used in daily life; binary and hexadecimal are the ones used most often in computer systems.
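The same value can be written in any of these bases. The following sketch uses the standard library function strtoul to read a digit string in a given base; parse_in_base is a hypothetical wrapper name used only here.

```c
#include <stdlib.h>

/* Parse the digits in `text` as a number written in the given base (2..36). */
unsigned long parse_in_base(const char *text, int base)
{
    return strtoul(text, NULL, base);
}
```

With this helper, "11111111" in base 2, "377" in base 8, "255" in base 10, and "FF" in base 16 all yield the same value, 255.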
[Word]
In a computer system, the word size indicates the nominal size of integer and pointer data. The word size is generally tied to the width of the data bus and the address bus.
A computer with an n-bit word size can access the address range 0 to 2 ^ n - 1, so a program can access at most 2 ^ n bytes.
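One rough way to see the word size of the machine you compile for is to look at the width of a pointer. This is only a sketch: it assumes the pointer width reflects the word size and that the optional type uintptr_t is available, which holds on common platforms but is not guaranteed by the C standard. The function names are hypothetical.

```c
#include <limits.h>
#include <stdint.h>

/* Pointer width in bits on the machine this is compiled for. */
unsigned pointer_bits(void)
{
    return (unsigned)(sizeof(void *) * CHAR_BIT);
}

/* Largest value of a pointer-sized unsigned integer, i.e. 2^n - 1
   when uintptr_t is n bits wide. */
uintptr_t highest_address(void)
{
    return UINTPTR_MAX;
}
```

On a typical 64-bit system pointer_bits() gives 64, and on a typical 32-bit system it gives 32.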
Note:
When I studied microcomputer principles, the concept of word length was introduced as follows:
Title: Microcomputer Principles and Interface Technology (Third Edition), Zhou Heqin, China University of Science and Technology Press
"In-depth Understanding of Computer Systems" presents it like this:
Title: In-depth Understanding of Computer Systems, Randal E. Bryant & David R. O'Hallaron (US), scanned edition
These two definitions are inconsistent, which confused me. Which one should we use?
The answer is to judge from the context.
[Data type size]
short int: generally two bytes.
int: generally four bytes.
long int: generally the full word size, that is, sizeof(long int) matches the width of the system address lines. This also depends on the OS and compiler design.
float: generally four bytes, single precision.
double: generally eight bytes, double precision.
char *: the full word size.
The C standard specifies a lower bound for the range of each data type but no upper bound. In general, the range of int is no smaller than that of short int, and the range of long int is no smaller than that of int.
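The lower bounds and the ordering of the ranges can be checked against the constants in limits.h. The following is a minimal sketch; the two function names are hypothetical, and the minimum values are the ones the C standard requires for short, int, and long.

```c
#include <limits.h>

/* Returns 1 if this platform meets the minimum ranges required by the C standard. */
int meets_c_minimums(void)
{
    return SHRT_MAX >= 32767
        && INT_MAX  >= 32767
        && LONG_MAX >= 2147483647L;
}

/* Returns 1 if the ranges are ordered short <= int <= long. */
int ranges_are_ordered(void)
{
    return SHRT_MAX <= INT_MAX && INT_MAX <= LONG_MAX;
}
```

Both functions should return 1 on any conforming implementation; the actual sizes can be printed with sizeof if you want to see what your compiler chose.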
[Addressing and byte order]
For an object accessed by a program, we care about its address, the byte order of a multi-byte object, and the stored content. For a single-byte object, only the address and the content matter.
For example, for a char data object we generally care about its address and content, not about how its bytes are ordered.
A multi-byte object is generally stored as a contiguous sequence of bytes, and the lowest of those byte addresses is the address of the object.
For example, take an int (32-bit) data object:
int ntest;
ntest occupies 4 bytes. For example:
the data object ntest occupies the consecutive bytes 0x100, 0x101, 0x102, and 0x103, and its address is 0x100.
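The relationship between an object's address and the addresses of its bytes can be sketched in C by viewing the object through an unsigned char pointer. The helper name byte_address is hypothetical; the real addresses will of course differ from the 0x100 used in the text.

```c
#include <stddef.h>

/* Return the address of byte `i` of the int object at `p`.
   Byte 0 is the lowest-addressed byte, which is also the object's address. */
const unsigned char *byte_address(const int *p, size_t i)
{
    return (const unsigned char *)p + i;
}
```

For an int ntest, byte_address(&ntest, 0) is the same address as &ntest, and each following byte sits exactly one address higher.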
This brings us to a basic concept we cannot avoid: little-endian and big-endian byte order.
Suppose we have a w-bit object, written in binary as [Xw-1, Xw-2, Xw-3, ..., X3, X2, X1, X0], where w is a multiple of eight. When the object is stored, these bits are divided into groups of 8 bits each: [Xw-1, Xw-2, ..., Xw-8] is the most significant byte (MSB), and [X7, ..., X2, X1, X0] is the least significant byte (LSB).
The basic storage unit is 8 bits. The bytes from the MSB to the LSB can be laid out in memory either front to back or back to front, which gives two possible orderings, referred to below as method 1 and method 2.
Little-endian: when the value is written in hexadecimal, the digits with greater weight are placed at the higher addresses (method 2).
Big-endian: when the value is written in hexadecimal, the digits with greater weight are placed at the lower addresses (method 1). Big-endian order matches the way numbers are normally written, that is, the number is stored in the computer in the same order as it is written down.
Byte order generally does not affect how a program reads, writes, or displays its own data. However, when blocks of bytes are exchanged between machines, it affects the value that the sender and the receiver each interpret. It also becomes visible when the individual bytes of an object are accessed after a forced type conversion (cast).
Example: 0xFBCA13B4
Little-endian storage order: B4 13 CA FB
Big-endian storage order: FB CA 13 B4
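The example value can be checked on a real machine by copying its bytes out in memory order. This is a minimal sketch with hypothetical function names; it assumes the machine is either little-endian or big-endian and does not handle rarer mixed orderings.

```c
#include <stdint.h>
#include <string.h>

/* Copy the 4 bytes of `value` into `out` in the order they sit in memory. */
void bytes_in_memory(uint32_t value, unsigned char out[4])
{
    memcpy(out, &value, 4);
}

/* Returns 1 if the least significant byte is stored at the lowest address. */
int is_little_endian(void)
{
    unsigned char b[4];
    bytes_in_memory(0xFBCA13B4u, b);
    return b[0] == 0xB4;
}
```

On a little-endian machine the output array holds B4 13 CA FB; on a big-endian machine it holds FB CA 13 B4.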
Currently, most machines from IBM, Motorola, and Sun Microsystems use big-endian storage, while Intel and PC-compatible computers use little-endian storage. Some computers can work in either mode; which one is used depends on the byte order selected when the system powers up.
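Because sender and receiver may use different byte orders, code that exchanges blocks of bytes usually fixes one order for the data in transit. The sketch below writes and reads a 32-bit value in big-endian order using shifts, so the result does not depend on the host's own byte order. The names put_be32 and get_be32 are hypothetical, and big-endian is chosen here only as an example convention.

```c
#include <stdint.h>

/* Store `value` into `buf` most significant byte first (big-endian),
   regardless of the host's own byte order. */
void put_be32(unsigned char buf[4], uint32_t value)
{
    buf[0] = (unsigned char)(value >> 24);
    buf[1] = (unsigned char)(value >> 16);
    buf[2] = (unsigned char)(value >> 8);
    buf[3] = (unsigned char)value;
}

/* Rebuild the value from 4 bytes stored most significant byte first. */
uint32_t get_be32(const unsigned char buf[4])
{
    return ((uint32_t)buf[0] << 24) | ((uint32_t)buf[1] << 16)
         | ((uint32_t)buf[2] << 8)  |  (uint32_t)buf[3];
}
```

Shifts operate on the numeric value rather than on memory, which is why the same code gives the same bytes on both kinds of machine.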
[String]
In the C language standard, a string is a contiguous sequence of characters terminated by the null character '\0'. The length of the string does not include the terminating '\0'. In many cases, NUL is used to denote '\0'.
For this, you can define a macro:
#define NUL ('\0') // note that NUL and NULL have different properties, although they are numerically equal.
Special definition:
#define NULL ((void *)0) // here 0 has pointer type
Technically, a string can be regarded as a character array. It is stored contiguously in memory, and the lowest address of the storage area holds the first character of the string.
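The difference between the length of a string and the storage it occupies can be sketched with a hand-written length function. my_strlen is a hypothetical name; it does the same job as the standard strlen and is spelled out here only to show the role of the terminating '\0'.

```c
#include <stddef.h>

/* Count the characters before the terminating '\0'. */
size_t my_strlen(const char *s)
{
    size_t n = 0;
    while (s[n] != '\0')
        n++;
    return n;
}
```

For "ABC" the length is 3, while sizeof("ABC") is 4 because the array also stores the terminator.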
The most common character encoding method is ASCII encoding.
Tip: the digit characters 0, 1, 2, ..., 9 correspond to the hexadecimal codes 0x30, 0x31, ..., 0x39; that is, the ASCII code of the digit 0 is 48 in decimal.
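Since the digit characters have consecutive codes, subtracting '0' from a digit character gives its numeric value. A minimal sketch, with the hypothetical helper name digit_value:

```c
/* Convert a digit character to its numeric value, or -1 if it is not a digit. */
int digit_value(char c)
{
    return (c >= '0' && c <= '9') ? c - '0' : -1;
}
```

On an ASCII system '7' has the code 0x37, and digit_value('7') gives 7.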
A string literal is written in double quotation marks, for example the string "ABC" or the empty string "".
Adjacent string literals are concatenated. For example, "abcdefg" "hijk" denotes the string "abcdefghijk".
A string literal evaluates to the address of its first character. For example, after char * strtest = "ABC"; we can use strtest to reference the string and the individual characters in it.
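Both points, literal concatenation and access through the address of the first character, can be sketched as follows. The function names joined, joined_length, and char_at are hypothetical and exist only for this illustration.

```c
#include <string.h>

/* Adjacent string literals are joined by the compiler into one string. */
const char *joined(void)
{
    return "abcdefg" "hijk";
}

/* Length of the joined string, not counting the terminating '\0'. */
size_t joined_length(void)
{
    return strlen(joined());
}

/* Return the character at position `i` of `s` using pointer arithmetic. */
char char_at(const char *s, size_t i)
{
    return *(s + i);
}
```

With const char *strtest = "ABC";, char_at(strtest, 0) is 'A', and strtest[1] is the equivalent array-style way to reach 'B'.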