Basic file data knowledge

Source: Internet
Author: User

1) In a computer with an IA-32 architecture, the smallest unit of data storage is the bit, which can represent two states: 0 and 1.

CPUs are classified as 8-bit, 16-bit, 32-bit, or 64-bit according to the number of bits they process at a time.

2) The CPU is designed to be good at processing data whose width in bits is a power of 2. Processing data whose width is not a power of 2 tends to disrupt the pipeline and makes instruction execution inefficient, so data is generally not stored in widths that are not a power of 2.

3) Generally, to achieve high processing efficiency, the minimum unit of data the CPU processes is 8 bits (one byte). To work with less than one byte of data, the CPU usually has to use shift and logical instructions to extract the individual bits, which is less efficient. In addition, memory is addressed in bytes.
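
As a minimal sketch of the shift-and-mask approach described in point 3 (written in Python for brevity; the function name extract_bits and the sample byte are illustrative and do not come from the original text), the following pulls small bit fields out of a single byte:

```python
def extract_bits(byte_value, start, width):
    """Return `width` bits of `byte_value`, starting at bit `start` (bit 0 is the least significant)."""
    if not 0 <= byte_value <= 0xFF:
        raise ValueError("byte_value must fit in one byte")
    if width < 1 or start < 0 or start + width > 8:
        raise ValueError("the bit field must lie inside the byte")
    mask = (1 << width) - 1              # width=3 gives 0b111
    return (byte_value >> start) & mask  # shift the field down, then mask off the rest


packed = 0b10110100                      # hypothetical byte holding several small fields
low_two = extract_bits(packed, 0, 2)     # bits 0-1 -> 0b00
mid_three = extract_bits(packed, 2, 3)   # bits 2-4 -> 0b101
top_bit = extract_bits(packed, 7, 1)     # bit 7    -> 1
```

Each extraction costs a shift plus an AND on top of the byte load, which is the extra work the paragraph above refers to.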

4) Different bytes of data organized together form a file. The byte is the smallest unit that the file system can process, so file format analysis is done mainly at the byte level.
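
To illustrate what byte-level analysis looks like in practice, here is a small sketch that lists the contents of a file one byte at a time. It assumes an in-memory io.BytesIO object as a stand-in for a real file, and the helper name hex_dump and the sample bytes are made up for the example:

```python
import io


def hex_dump(data, per_line=8):
    """Return one text line per `per_line` bytes: the offset followed by each byte in hex."""
    lines = []
    for offset in range(0, len(data), per_line):
        chunk = data[offset:offset + per_line]
        hex_part = " ".join(f"{b:02X}" for b in chunk)
        lines.append(f"{offset:04X}: {hex_part}")
    return lines


fake_file = io.BytesIO(b"AB\x01\x02\xff")  # stand-in for open(path, "rb")
content = fake_file.read()                 # a file is read back as a sequence of bytes
dump = hex_dump(content)                   # ["0000: 41 42 01 02 FF"]
```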

5) The number of bytes occupied by int (integer), char (character), and long (long integer) when compiled with Visual C++ 6.0 are 4, 1, and 4, respectively; the 2-byte integer type is short.
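
When parsing a binary file it is usually safer to state field widths explicitly than to rely on a particular compiler's defaults. As a rough sketch, Python's struct module offers "standard" sizes (selected with the "<" prefix) that do not depend on the host platform and that happen to coincide with the sizes used by 32-bit Windows compilers such as Visual C++ 6.0. The dictionary name SIZES is only illustrative:

```python
import struct

# "<" selects the struct module's standard sizes, independent of the host platform.
SIZES = {
    "char": struct.calcsize("<c"),   # 1 byte
    "short": struct.calcsize("<h"),  # 2 bytes
    "int": struct.calcsize("<i"),    # 4 bytes
    "long": struct.calcsize("<l"),   # 4 bytes
}
```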

6) A string has two main attributes: its length and its character data.

There are two common string storage formats:

6.1) Length + character data;

This format is typically used by Pascal and Delphi. The advantage is that the length of the string is known before the string is processed. The disadvantage is that more than 1 byte is usually needed to store the length: one byte can only hold values from 0 to 255, so if a single byte stores the length, the string can be at most 255 bytes long.

6.2) character data + Terminator;

This is the most widely used string format; C and C++ both use it, with a zero byte as the terminator. (Java strings, by contrast, keep their length separately and are not terminated.) The advantage is that no matter how long the string is, only one byte is required to store the terminator. The disadvantage is that the length of the string is known only after scanning the character data up to the terminator.
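
The two storage formats can be compared side by side with a short sketch. It assumes ASCII text, a one-byte length prefix for the first format, and a zero byte as the terminator for the second; the four function names are hypothetical helpers written for this example:

```python
def encode_pascal(text):
    """Length + character data, with the length held in a single byte."""
    data = text.encode("ascii")
    if len(data) > 255:
        raise ValueError("a one-byte length prefix can describe at most 255 bytes")
    return bytes([len(data)]) + data


def decode_pascal(buf):
    length = buf[0]                       # the length is known before reading the data
    return buf[1:1 + length].decode("ascii")


def encode_c(text):
    """Character data + terminator (a zero byte)."""
    data = text.encode("ascii")
    if b"\x00" in data:
        raise ValueError("the data itself must not contain the terminator")
    return data + b"\x00"


def decode_c(buf):
    end = buf.index(b"\x00")              # must scan the data to find where it ends
    return buf[:end].decode("ascii")
```

For example, encode_pascal("abc") gives the bytes 03 61 62 63, while encode_c("abc") gives 61 62 63 00. Both cost one extra byte here, but only the first format limits the string to 255 bytes, and only the second requires a scan to learn the length.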

7) Finally, let's talk about the storage order of file data, that is, the so-called "big-endian" and "little-endian" byte orders. By habit, people write the high-order part of a number on the left and the low-order part on the right. Intel CPU designers chose the opposite storage order: the low-order byte comes first (at the lower address) and the high-order byte comes last. This storage order is called little-endian, while the order that matches the conventional way of writing numbers is called big-endian.

Languages that compile to native code, such as C, C++, VB, and Delphi, process data in the byte order of the host CPU, which is little-endian on Intel machines. Some languages, such as Java, read and write binary data in big-endian order regardless of the hardware, because Java uses its own virtual machine to process data.

A single byte has no big-endian or little-endian form, because computer memory is addressed in bytes: the byte is the minimum unit of storage, and byte order only describes how the bytes of a multi-byte value are arranged.
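
A quick way to see the two byte orders is to serialize the same value both ways. This sketch uses Python's built-in int.to_bytes; the sample value 0x12345678 is arbitrary:

```python
value = 0x12345678

little = value.to_bytes(4, byteorder="little")  # low-order byte first:  78 56 34 12
big = value.to_bytes(4, byteorder="big")        # high-order byte first: 12 34 56 78

# Reading little-endian bytes with the wrong byte order yields a different number.
misread = int.from_bytes(little, byteorder="big")  # 0x78563412

# A single byte looks the same in either order.
one_little = (0x7F).to_bytes(1, byteorder="little")
one_big = (0x7F).to_bytes(1, byteorder="big")
```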

Unlike Intel CPUs, Motorola's PowerPC series CPUs store data in big-endian order.

In addition, network protocols transmit data in big-endian order (network byte order). Therefore, when transmitting data between different computers, remember to convert the byte order when sending and receiving data.
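
The conversion can be sketched with Python's struct module, where the "!" prefix means network (big-endian) byte order. The six-byte header layout used here (a 4-byte length followed by a 2-byte message type) is a made-up example, not part of any real protocol, and the function names are hypothetical:

```python
import struct

HEADER_FORMAT = "!IH"  # network byte order: 4-byte unsigned length, 2-byte unsigned type


def pack_header(length, msg_type):
    """Convert host integers to bytes in network byte order before sending."""
    return struct.pack(HEADER_FORMAT, length, msg_type)


def unpack_header(buf):
    """Convert the first six received bytes back to host integers."""
    return struct.unpack(HEADER_FORMAT, buf[:struct.calcsize(HEADER_FORMAT)])


wire = pack_header(258, 1)             # 258 = 0x00000102 -> bytes 00 00 01 02 00 01
length, msg_type = unpack_header(wire)

# Skipping the conversion and reading the same bytes as little-endian gives wrong values.
wrong_length, wrong_type = struct.unpack("<IH", wire)
```

Because both sides agree on "!" for the wire format, the code behaves the same on little-endian and big-endian hosts.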

 

Appendix:

IA-32: Intel Architecture, 32-bit (Intel's 32-bit architecture)
IA-64: Intel Architecture, 64-bit (Intel's 64-bit Itanium architecture)


i386: Intel 386 (the old 386 machines; also used to refer to IA-32 CPUs in general)
i486: Intel 486
i586: Intel 586 (Pentium and K6-class CPUs)
i686: Intel 686 (Pentium II, Pentium III, Pentium 4, and K7-class CPUs)

The names above that end in 86 are collectively called x86. Generally, x86 also refers to CPUs of the IA-32 architecture.
x86 is the general name for this Intel processor series and identifies a common instruction set.

 

 
