In programming contests, when the input is large, reading the file becomes the bottleneck of a program's running time, and a faster way to read is needed. Almost every C++ learner has probably been tripped up by how slow cin is, and has sworn never to read data with cin again. Some also say that Pascal's read statement is faster than scanf in C/C++, leaving C++ contestants to fret. Is C++ really inferior to Pascal? The answer is obviously no. A more advanced approach is to read the entire input in one go and then convert the string yourself. This method is spoken of as something of a legend, but I had never actually tried it. So today I tested every way of reading data I could think of, and the results were surprising.
The most common input task in contests is reading a large number of integers, so I wrote a program that generates 10 million random numbers into data.txt, 55 MB in total. Then I wrote a skeleton program that measures run time; the code is as follows:
    #include <cstdio>
    #include <ctime>
    int main() {
        clock_t start = clock();
        // DO SOMETHING
        printf("%.3lf\n", double(clock() - start) / CLOCKS_PER_SEC);
    }
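The generator for data.txt is not shown in this article. A minimal sketch of what it might look like follows. The function name generate_data, the value range 0 to 99999, the fixed seed, and the single-space separator are all assumptions made for illustration, not the author's actual program.

```cpp
#include <cstdio>
#include <random>

// Writes `count` non-negative integers to `path`, separated by single spaces.
// Returns true on success.
// Assumptions: the value range [0, 99999] and the fixed default seed are
// illustrative choices, not taken from the article.
bool generate_data(const char *path, int count, unsigned seed = 12345) {
    FILE *out = std::fopen(path, "w");
    if (!out) return false;
    std::mt19937 rng(seed);
    std::uniform_int_distribution<int> dist(0, 99999);
    for (int i = 0; i < count; i++) {
        if (i > 0) std::fputc(' ', out);  // separator between numbers only
        std::fprintf(out, "%d", dist(rng));
    }
    return std::fclose(out) == 0;
}
```

Calling generate_data("data.txt", 10000000) would produce a file of roughly the size described above, although the exact size depends on the assumed value range.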
The simplest approach is to call scanf in a loop; the code is as follows:
    const int MAXN = 10000000;
    int numbers[MAXN];
    void scanf_read() {
        freopen("data.txt", "r", stdin);
        for (int i = 0; i < MAXN; i++)
            scanf("%d", &numbers[i]);
    }
But how efficient is it? On my computer, running Linux, the test took 2.01 seconds. Next is cin; the code is as follows:
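    const int MAXN = 10000000;
    int numbers[MAXN];
    void cin_read() {
        freopen("data.txt", "r", stdin);
        for (int i = 0; i < MAXN; i++)
            std::cin >> numbers[i];
    }

cin took 6.38 seconds (see the table below), more than three times as long as scanf. By default cin is kept synchronized with stdio; the synchronization can be turned off with std::ios::sync_with_stdio(false):

    const int MAXN = 10000000;
    int numbers[MAXN];
    void cin_read_nosync() {
        freopen("data.txt", "r", stdin);
        std::ios::sync_with_stdio(false);
        for (int i = 0; i < MAXN; i++)
            std::cin >> numbers[i];
    }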
How efficient is it with synchronization turned off? The run time dropped to 2.05 seconds, about the same as scanf. With this setting, you can use cin and cout without worry.
Next, let's test reading the entire file at once. First we need a function that parses the string into the array; the code is as follows:
    const int MAXS = 60 * 1024 * 1024;
    char buf[MAXS];
    void analyse(char *buf, int len = MAXS) {
        int i;
        numbers[i = 0] = 0;
        // check the bound before dereferencing p
        for (char *p = buf; p - buf < len && *p; p++)
            if (*p == ' ')  // separator assumed to be a single space
                numbers[++i] = 0;
            else
                numbers[i] = numbers[i] * 10 + *p - '0';
    }
The most common way to read an entire file into a string is fread; the code is as follows:
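    const int MAXN = 10000000;
    const int MAXS = 60 * 1024 * 1024;
    int numbers[MAXN];
    char buf[MAXS];
    void fread_analyse() {
        freopen("data.txt", "rb", stdin);
        // read at most MAXS - 1 bytes so the terminator always fits
        int len = fread(buf, 1, MAXS - 1, stdin);
        buf[len] = '\0';
        analyse(buf, len);
    }

With fread the run time dropped to 0.29 seconds (see the table below). The read function is the lower-level system call; the code is as follows:

    #include <fcntl.h>
    #include <unistd.h>
    const int MAXN = 10000000;
    const int MAXS = 60 * 1024 * 1024;
    int numbers[MAXN];
    char buf[MAXS];
    void read_analyse() {
        int fd = open("data.txt", O_RDONLY);
        int len = read(fd, buf, MAXS - 1);
        buf[len] = '\0';
        analyse(buf, len);
    }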
Testing shows that read also takes 0.29 seconds, so read has no particular advantage over fread. Is this the end? No: Linux also provides the low-level mmap function, which maps a file directly into memory. What happens if we use mmap directly? The code is as follows:
    #include <fcntl.h>
    #include <sys/mman.h>
    #include <unistd.h>
    const int MAXN = 10000000;
    const int MAXS = 60 * 1024 * 1024;
    int numbers[MAXN];
    char buf[MAXS];
    void mmap_analyse() {
        int fd = open("data.txt", O_RDONLY);
        int len = lseek(fd, 0, SEEK_END);
        char *mbuf = (char *) mmap(NULL, len, PROT_READ, MAP_PRIVATE, fd, 0);
        analyse(mbuf, len);
    }
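The mmap snippet above omits error handling and never releases the mapping. A fuller sketch for a POSIX system might look like the following. The function name mmap_sum is hypothetical, and summing the numbers (rather than storing them in an array) is only a way to keep the example short.

```cpp
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

// Maps the file at `path` read-only and returns the sum of the non-negative
// integers it contains, or -1 if the file cannot be opened, is empty, or
// cannot be mapped. Any non-digit byte is treated as a separator.
long long mmap_sum(const char *path) {
    int fd = open(path, O_RDONLY);
    if (fd < 0) return -1;
    struct stat st;
    if (fstat(fd, &st) != 0 || st.st_size == 0) {
        close(fd);
        return -1;
    }
    size_t len = static_cast<size_t>(st.st_size);
    void *map = mmap(nullptr, len, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);  // the mapping stays valid after the descriptor is closed
    if (map == MAP_FAILED) return -1;

    const char *p = static_cast<const char *>(map);
    long long sum = 0, value = 0;
    for (size_t i = 0; i < len; i++) {
        if (p[i] >= '0' && p[i] <= '9') {
            value = value * 10 + (p[i] - '0');
        } else {  // separator: finish the current number
            sum += value;
            value = 0;
        }
    }
    sum += value;  // last number may not be followed by a separator
    munmap(map, len);
    return sum;
}
```

Because the loop is bounded by len and never looks for a terminating '\0', it does not read past the end of the mapping.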
With mmap, the run time dropped to 0.25 seconds, a further improvement of about 14%. So far I have found no way to read the file any faster. Coming back to Pascal: how fast is it really? The result was surprising: it took a full 2.16 seconds. The program is as follows:
    const MAXN = 10000000;
    var numbers: array[0..MAXN] of longint;
        i: longint;
    begin
        assign(input, 'data.txt');
        reset(input);
        for i := 0 to MAXN - 1 do
            read(numbers[i]);
    end.
To be thorough, I also ran the tests on Windows. The results are in the following table (times in seconds):
| Method \ Platform (time in seconds) | Linux GCC | Windows MinGW | Windows VC2008 |
| --- | --- | --- | --- |
| scanf | 2.010 | 3.704 | 3.425 |
| cin | 6.380 | 64.003 | 19.208 |
| cin with synchronization off | 2.050 | 6.004 | 19.616 |
| fread | 0.290 | 0.241 | 0.304 |
| read | 0.290 | 0.398 | Not supported |
| mmap | 0.250 | Not supported | Not supported |
| Pascal read | 2.160 | 4.668 | |
A few observations can be drawn from the table.
- Running programs on Linux platforms is generally faster than on Windows.
- Programs compiled with VC on Windows generally run faster than those compiled with MinGW (Minimalist GNU for Windows).
- VC is insensitive to whether cin synchronization is turned off; the run time is about the same either way. MinGW, in contrast, is very sensitive: the difference is more than 10 times (64.003 seconds versus 6.004 seconds).
- read is a Linux system function; MinGW probably provides some kind of emulation, and there read is slower than fread.
- The running speed of the Pascal program is really nothing to boast about.
Exploring the fastest way to read files in C++