Friends often ask me about reading, writing, and converting binary files. Here is my own understanding of the subject.
I). General issues
Binary files are fundamentally different from the text files we usually work with. The difference is hard to put into words, but much easier to understand once you see it for yourself. I therefore recommend a hexadecimal editor to anyone learning to read and write binary files. There are many such editors. In the integrated development environment that comes with CVF, you can simply drag a binary file onto the IDE window and drop it. Visual Studio 2005 can do the same (although there you need to go through File > Open > File in the File menu). A more widely used program I recommend is UltraEdit (hereinafter UE). It is a very good text editor and can also be used as a hex editor.
Why use hexadecimal rather than binary? Because a binary digit carries too little: written in binary, the data becomes very long and hard to read. Our computers use 8 bits as one byte, and 2**8 = 256 = 16**2, so a byte of 8 binary digits can be written as exactly 2 hexadecimal digits, which is more intuitive and convenient.
II). File formats
Broadly speaking, all files fall into two categories: text files and binary files.
1). Text files
Text files can be opened with a text editor such as Notepad, and we can read the information in them directly, so they are very widely used. A text file is usually divided into many lines, and when it stores data there is also the notion of columns. On the hard disk or other media, however, the contents are stored as one continuous sequence: columns are separated by spaces or tabs, and lines by a carriage return and line feed. Taking an ANSI-encoded text file (the most common case) as an example, suppose we store the following:
10
11
12
The space required is: 3 lines x 2 characters per line + 2 carriage returns + 2 line feeds = 10 bytes.
Data stored in a text file has no data type. For example, nothing says whether the value 10 is an integer, a real, or a string. It only has a length, which is 2 characters, i.e. two bytes. What the computer actually stores are the ASCII codes 31h 30h (in hexadecimal). The carriage return is 0Dh and the line feed is 0Ah. The data is therefore stored like this:
31 30 0D 0A 31 31 0D 0A 31 32
Here each 0D 0A pair is a carriage return and line feed, 31h 30h is "10", 31h 31h is "11", and 31h 32h is "12". In this sense we can regard a text file as a special kind of binary file.
2). Binary files
Binary files are unformatted, but their data has types. Take the three numbers 10 11 12 above again. A binary file has no concept of a line, so we store the values compactly, one after another (padding bytes may also be added in between).
Among the data types, consider integers first. If 10 11 12 are treated as 2-byte integers, 10 is represented as 0Ah 00h: 0Ah corresponds to decimal 10, and the trailing 00h is a padding byte. A 2-byte integer that does not exceed FFh (that is, 255) needs this padding byte. Similarly, 11 is represented as 0Bh 00h and 12 as 0Ch 00h. When the integer exceeds 255, both bytes are needed to store it. For example, 2748 (ABCh) is represented as BCh 0Ah.
The low byte (BCh) is written first and the high byte (0Ah) after it. When the integer exceeds 65535, we need 4 bytes to store it. For example, 439041101 (1A2B3C4Dh) is represented as 4Dh 3Ch 2Bh 1Ah. For even larger values we need 8 bytes.
Real data in a binary file also comes in different byte lengths, such as 4 bytes and 8 bytes. For reals, however, the length determines not only the range that can be represented but also the precision, which is why 8-byte reals are also called double precision. There are several sets of rules for storing real data in binary form, and the IEEE standard floating-point format is the one in wide use today. I am still studying these rules myself and they are somewhat involved, so I will not go into them. There is no need to understand them here.
Binary files can also store character data, in the same way as text files: using ASCII encoding. That is why we can see some meaningful strings when we open certain binary files with Notepad. (The meaningless garbage is data we can take to be integers or reals, which Notepad interprets as characters, producing the garbled text.)
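The byte order described above can be checked directly. The following is only a minimal sketch, not part of the original article: it assumes a little-endian machine (such as an x86 PC) and a compiler that provides the `int8`, `int16` and `int32` kinds from `iso_fortran_env`. The program name is made up for illustration.

```fortran
program show_integer_bytes
  use, intrinsic :: iso_fortran_env, only: int8, int16, int32
  implicit none
  integer(int8) :: b2(2), b4(4)

  ! Reinterpret the in-memory bytes of each integer as 1-byte integers.
  b2 = transfer(2748_int16, b2)        ! 2748 = 0ABCh
  b4 = transfer(439041101_int32, b4)   ! 439041101 = 1A2B3C4Dh

  ! Z2.2 prints each byte as two hexadecimal digits, in storage order.
  write (*, '(2(z2.2, 1x))') b2
  write (*, '(4(z2.2, 1x))') b4
end program show_integer_bytes
```

On a little-endian machine the two lines should show the low byte first, matching BC 0A and 4D 3C 2B 1A from the text. A big-endian machine would show the bytes in the opposite order.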
III). The benefits of using binary files
Why use binary files? There are roughly three reasons.
First, binary files save space. For character data there is no difference, but for numbers, and especially reals, binary storage is more compact. For example, to store the real*4 value 3.1415927, a text file needs 9 bytes, holding the ASCII codes of the 9 characters 3 . 1 4 1 5 9 2 7, while a binary file needs only 4 bytes (DB 0F 49 40).
Second, the data that takes part in calculations in memory is already held in unformatted binary form, so saving it to a binary file is faster. Storing it as text requires a conversion. When the amount of data is large, the difference in speed between the two is noticeable.
Third, for data that must be kept exact, binary storage does not lose any significant digits.
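The size comparison for 3.1415927 can be reproduced with a short program. This is a sketch under the same assumptions as before (little-endian machine, IEEE single precision for `real32`); the variable names are illustrative only.

```fortran
program real_text_vs_binary
  use, intrinsic :: iso_fortran_env, only: int8, real32
  implicit none
  real(real32)      :: pi4
  integer(int8)     :: bytes(4)
  character(len=32) :: text

  pi4 = 3.1415927_real32

  ! Text form: one digit, the decimal point, and seven decimals.
  write (text, '(f9.7)') pi4

  ! Binary form: the four bytes that make up the real*4 value.
  bytes = transfer(pi4, bytes)

  write (*, '(a, i0, a)') 'as text:   ', len_trim(text), ' bytes'
  write (*, '(a, i0, a, 4(1x, z2.2))') 'as binary: ', &
        storage_size(pi4) / 8, ' bytes:', bytes
end program real_text_vs_binary
```

The text form takes 9 characters and the binary form 4 bytes, which on a little-endian IEEE machine should be the DB 0F 49 40 quoted above.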
IV). How binary files are stored
Consider the following binary file:
00000000h: 0F 01 00 00 0F 03 00 00 12 53 21 45 58 62 35 34 ; .........S!EXb54
00000010h: 41 42 43 44 45 46 47 48 49 47 4B 4C 4D 4E 4F 50 ; ABCDEFGHIGKLMNOP
This is what you see in UltraEdit (UE). Only the hexadecimal bytes in the middle are the actual file content. The part in front is the offset that UE adds to each row, and the part at the end is UE's attempt to interpret the bytes as characters, for reference.
The file is 32 bytes long in total. It is displayed as two rows of 16 bytes each, but that is only how UE displays it. The real file is not split into rows.
Knowing only the contents of this file, without any explanation, we cannot extract any useful information from it. So here is the explanation:
The first 4 bytes are a 4-byte integer (0F 01 00 00, hexadecimal 10Fh, decimal 271).
The next 4 bytes are another 4-byte integer (0F 03 00 00, hexadecimal 30Fh, decimal 783).
The next 4 bytes (12 53 21 45) are a 4-byte real: 2.5811919E+3.
The next 4 bytes (58 62 35 34) are another 4-byte real: 1.6892716E-7.
The final 16 bytes (41 42 43 44 45 46 47 48 49 47 4B 4C 4D 4E 4F 50) are a 16-byte string (ABCDEFGHIGKLMNOP).
In fact, a binary file simply stores the data and does not record any data types. For example, the 9th to 16th bytes above (12 53 21 45 58 62 35 34), which we took to be two 4-byte reals, could equally be read as an 8-byte character string (.S!EXb54). Likewise the 16-byte string (ABCDEFGHIGKLMNOP) could be read as two 8-byte integers, four 4-byte integers, two 8-byte reals, four 4-byte reals, and so on.
So when we are faced with a binary file, we cannot know exactly what it means. We need a description of how the data is stored. This description tells us which bytes hold which type of data and what the stored data means. Without it we can only guess, or give up.
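To see that the same bytes really do yield different values depending on the assumed type, the 9th to 16th bytes of the sample file can be reinterpreted in memory. This is a sketch, assuming a little-endian machine with IEEE single precision; the array `raw` simply hard-codes the eight bytes from the listing (given in decimal), and all names are illustrative.

```fortran
program reinterpret_bytes
  use, intrinsic :: iso_fortran_env, only: int8, int32, real32
  implicit none
  ! Bytes 9 to 16 of the sample file: 12 53 21 45 58 62 35 34 (hex).
  integer(int8), parameter :: raw(8) = &
        int([18, 83, 33, 69, 88, 98, 53, 52], int8)
  real(real32)     :: r(2)
  integer(int32)   :: i(2)
  character(len=8) :: c

  r = transfer(raw, r)   ! read the 8 bytes as two 4-byte reals
  i = transfer(raw, i)   ! read the same bytes as two 4-byte integers
  c = transfer(raw, c)   ! read the same bytes as eight characters

  write (*, *) 'as two reals:      ', r
  write (*, *) 'as two integers:   ', i
  ! The first byte (12h) is a control character, so print only 2 to 8.
  write (*, *) 'characters 2 to 8: ', c(2:8)
end program reinterpret_bytes
```

Read as reals, the bytes should give approximately 2581.192 and 1.6892716E-07, the values stated above. Read as characters, positions 2 to 8 are S!EXb54. Nothing in the bytes themselves says which reading is the intended one.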
In Fortran, a binary file like this can be opened for direct access as follows:
Open( 12 , File = 'Testbin.bin' , Access = 'Direct' , Form = 'Unformatted' , RecL = 4 )
Here Access = 'Direct' selects direct access, and Form = 'Unformatted' selects unformatted storage. The more important item is RecL. Data is read in units of records: every read or write transfers one record. The record length is fixed by the Open statement and cannot be changed afterwards. If you need a different length, you have to Close the file and Open it again.
With some compilers the record length is given in multiples of 4 bytes, so a value of 4 means a record length of 16 bytes. With other compilers it is given directly in bytes, so a value of 4 means a record length of 4 bytes. Check the compiler manual on this point. In the Visual Fortran (VF) series, the value has the former meaning. You can switch to the latter meaning in the project properties under Fortran > Data > Use Bytes as RECL= Unit for Unformatted Files. On the command line, the corresponding compiler option is /assume:byterecl.
Choosing the value of RecL is up to us. In general it should be neither too large nor too small, and it has to be considered together with how the data is laid out. If it is too small, we have to execute more read and write statements. If it is too large, it becomes awkward to work with small pieces of data. Sometimes we even read the same file several times, with a different RecL each time.
For the Testbin.bin file above things are fairly simple. I will demonstrate two ways of reading it, with a record length of 16 bytes and of 8 bytes. You could even read all 32 bytes in one go.
(1) RecL = 4 (record length 16 bytes)
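Program www_fcode_cn
  Implicit None
  Integer*4 :: iVar1 , iVar2
  Real*4 :: rVar1 , rVar2
  Character(Len=16) :: cStr
  ! RecL = 4 means 16 bytes here (RecL counted in 4-byte units)
  Open( 12 , File = 'Testbin.bin' , Access = 'Direct' , Form = 'Unformatted' , RecL = 4 )
  ! Second record: bytes 17 to 32, the string
  Read( 12 , Rec = 2 ) cStr
  ! First record: bytes 1 to 16, two integers and two reals
  Read( 12 , Rec = 1 ) iVar1 , iVar2 , rVar1 , rVar2
  Write( * , * ) cStr
  Write( * , * ) iVar1 , iVar2 , rVar1 , rVar2
  Close( 12 )
End Program www_fcode_cn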
RecL = 4 (record length 16 bytes) is specified in the Open statement. The first Read statement reads the second record directly (that is, the 17th to the 32nd byte), giving cStr = "ABCDEFGHIGKLMNOP". The second Read statement goes back and reads the first record (that is, the first 16 bytes) into four 4-byte variables (the first two are integers, the last two are reals).
The output is:
 ABCDEFGHIGKLMNOP
 271 783 2581.192 1.6892716E-07
Seeing this result means we have succeeded. Note that the first Read statement jumped straight to the second record without reading the first one. This is the convenience of direct access. Sometimes we do not need part of the data at all, and in that case we can jump directly to the record we want. The record number can even be a variable computed beforehand. For example:
iRec = ( a + b ) / c
Read( 12 , Rec = iRec ) cStr
Suppose we have stored 100 days of data and need only the data for day 21. What do we do? With sequential access we might declare an array of 100 elements, or loop through 20 dummy reads. With direct access we only need to execute one Read with Rec = 21. How convenient.
(Direct and sequential access are not inherently tied to binary and text files, but text files are usually read sequentially and binary files are usually read directly. This follows from their nature.)
(2) RecL = 2 (record length 8 bytes)
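Program www_fcode_cn
  Implicit None
  Integer*4 :: iVar1 , iVar2
  Real*4 :: rVar1 , rVar2
  Character(Len=16) :: cStr
  ! RecL = 2 means 8 bytes here (RecL counted in 4-byte units)
  Open( 12 , File = 'Testbin.bin' , Access = 'Direct' , Form = 'Unformatted' , RecL = 2 )
  ! The string spans records 3 and 4, so read it in two halves
  Read( 12 , Rec = 4 ) cStr(9:16)
  Read( 12 , Rec = 3 ) cStr(1:8)
  ! Record 1 holds the two integers, record 2 the two reals
  Read( 12 , Rec = 1 ) iVar1 , iVar2
  Read( 12 , Rec = 2 ) rVar1 , rVar2
  Write( * , * ) cStr
  Write( * , * ) iVar1 , iVar2 , rVar1 , rVar2
  Close( 12 )
End Program www_fcode_cn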
Here RecL = 2, which means a record is 8 bytes long. We therefore cannot read the 16-byte string cStr in one go and have to read it in two steps. The first Read takes the 4th record and places it in the second half of the string. The second Read takes the 3rd record and places it in the first half. (The two statements can be swapped.) We then read the two integer variables from the first record and the two real variables from the second record.
The output is the same as with method (1).
(3) Writing binary files
Writing a binary file also requires thinking about RecL. Here we take RecL = 4 as an example.
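Program www_fcode_cn
  Implicit None
  ! RecL = 4 means 16 bytes here (RecL counted in 4-byte units)
  Open( 12 , File = 'Testbin.bin' , Access = 'Direct' , Form = 'Unformatted' , RecL = 4 )
  ! First record: two 4-byte integers and two 4-byte reals
  Write( 12 , Rec = 1 ) 271 , 783 , 2581.192_4 , 1.6892716E-07
  ! Second record: the 16-byte string
  Write( 12 , Rec = 2 ) "ABCDEFGHIGKLMNOP"
  Close( 12 )
End Program www_fcode_cn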
Writing a binary file is much the same as reading one, so I will not explain it further. One thing to note: if you write record N directly while the file contains only M records (M < N), records M+1 through N-1 are filled with zeros. In other words, a binary file has no gaps in it.
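Writing and direct reading can be combined into a small version of the 100-day example mentioned earlier. The following is only a sketch: the file layout (four reals per day), the scratch file, and all names are assumptions made for illustration. It uses `inquire(iolength=...)` so that the record length is correct whichever unit the compiler uses for RecL, and `newunit=`, which requires a Fortran 2008 compiler.

```fortran
program read_day_21
  implicit none
  integer, parameter :: n_days = 100, n_per_day = 4
  real    :: day_data(n_per_day)
  integer :: rec_len, day, u

  ! Ask the compiler for the record length of one day's data.
  inquire (iolength=rec_len) day_data

  ! A scratch file stands in for the real data file.
  open (newunit=u, status='scratch', access='direct', &
        form='unformatted', recl=rec_len)

  ! Write one record per day; each value is just the day number.
  do day = 1, n_days
    day_data = real(day)
    write (u, rec=day) day_data
  end do

  ! Jump straight to day 21 without reading days 1 to 20.
  day = 21
  read (u, rec=day) day_data
  write (*, *) 'day', day, ':', day_data

  close (u)
end program read_day_21
```

The single read with rec=day replaces the 20 dummy reads that sequential access would need.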
Reading and writing binary files is quite flexible. Which approach to use in a real application should be designed around your own situation, including how to choose a suitable record length RecL and how to lay out the data efficiently.
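Because the unit of RecL differs between compilers and compiler options, one way to avoid hard-coding it is to let the compiler report the length of a given output list. The sketch below uses the standard `inquire(iolength=...)` statement; the variable names are illustrative and mirror the earlier examples.

```fortran
program recl_units
  use, intrinsic :: iso_fortran_env, only: int32, real32
  implicit none
  integer        :: len_one, len_rec
  integer(int32) :: iVar1, iVar2
  real(real32)   :: rVar1, rVar2

  ! Length of a single 4-byte integer, in this compiler's RecL units.
  inquire (iolength=len_one) iVar1

  ! Length of the whole 16-byte record used for Testbin.bin.
  inquire (iolength=len_rec) iVar1, iVar2, rVar1, rVar2

  write (*, *) 'RecL for one 4-byte integer: ', len_one
  write (*, *) 'RecL for the 16-byte record: ', len_rec
end program recl_units
```

If the first value is 1, RecL is counted in 4-byte units. If it is 4, RecL is counted in bytes. Passing `len_rec` to RecL in the Open statement should give a 16-byte record in either case.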
Reprinted from: http://fcode.cn/content-10-4-1.html
[Reprint:] Fortran binary file read/write