Exploring the fastest way to read files in C++

Source: Internet
Author: User
Tags: fread

In programming contests, when the input data is large, reading the file becomes the bottleneck of the program's running time, and a faster reading method is needed. Almost every C++ learner has probably been tripped up by how slow cin is, and has vowed never to read data with cin again. Some also say that scanf in C/C++ cannot match the speed of Pascal's read statement, and that C++ contestants can only look on anxiously. Is C++ really inferior to Pascal? The answer is self-evident. A more advanced approach is to read all the data in one go and then parse the resulting string. This method is rumored to be very good, but I had never actually tried it, so today I simply tested every way of reading data I could think of. The results are surprising.

The most common input task in contests is reading a large number of integers, so I wrote a program that writes 10 million random numbers to data.txt, 55 MB in total. Then I wrote a skeleton program that measures the run time. The code is as follows:

#include <cstdio>
#include <ctime>
int main() {
    int start = clock();
    // do something
    printf("%.3lf\n", double(clock() - start) / CLOCKS_PER_SEC);
}
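The generator program is mentioned above but not shown. A minimal sketch of what it might look like follows. The function name `write_random_ints`, the value range 0..32767, the fixed seed and the single-space separator are all assumptions made for illustration, not the author's code.

```cpp
#include <cstdio>
#include <random>

// Hypothetical helper: writes `count` random non-negative integers to `path`,
// separated by single spaces (no trailing separator).
// Returns false if the file cannot be opened or closed cleanly.
bool write_random_ints(const char *path, int count, unsigned seed = 12345) {
    std::FILE *out = std::fopen(path, "w");
    if (!out) return false;
    std::mt19937 rng(seed);                              // fixed seed: reproducible data
    std::uniform_int_distribution<int> dist(0, 32767);   // assumed value range
    for (int i = 0; i < count; i++) {
        if (i > 0) std::fputc(' ', out);
        std::fprintf(out, "%d", dist(rng));
    }
    return std::fclose(out) == 0;
}
```

Calling `write_random_ints("data.txt", 10000000)` would produce a file of roughly the size described in the article; a much smaller count is handy for a quick check.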

The simplest method is to call scanf in a loop. The code is as follows:

const int MAXN = 10000000;
int numbers[MAXN];
void scanf_read() {
    freopen("data.txt", "r", stdin);
    for (int i = 0; i < MAXN; i++)
        scanf("%d", &numbers[i]);
}

But how efficient is it? On my computer, running Linux, the test took 2.01 seconds. Next is cin. The code is as follows:

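#include <cstdio>
#include <iostream>
const int MAXN = 10000000;
int numbers[MAXN];
void cin_read() {
    freopen("data.txt", "r", stdin);
    for (int i = 0; i < MAXN; i++)
        std::cin >> numbers[i];
}

This is much slower: 6.38 seconds on Linux, as the results table further down shows. By default cin is kept synchronized with C stdio, and that synchronization can be switched off with std::ios::sync_with_stdio(false):

const int MAXN = 10000000;
int numbers[MAXN];
void cin_read_nosync() {
    freopen("data.txt", "r", stdin);
    std::ios::sync_with_stdio(false);
    for (int i = 0; i < MAXN; i++)
        std::cin >> numbers[i];
}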

How efficient is it with synchronization turned off? The run time dropped to 2.05 seconds, about the same as scanf. With this setting you can use cin and cout with confidence.

Next, let's test reading the entire file in one go and then parsing it. First we need a function that parses the string into the array. The code is as follows:

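const int MAXS = 60 * 1024 * 1024;
char buf[MAXS];
// Parses non-negative integers from buf into numbers[].
void analyse(char *buf, int len = MAXS) {
    int i;
    numbers[i = 0] = 0;
    // Check the bound before dereferencing, so nothing past len is read.
    for (char *p = buf; p - buf < len && *p; p++)
        if (*p == ' ')  // separator assumed to be a single space
            numbers[++i] = 0;
        else
            numbers[i] = numbers[i] * 10 + *p - '0';
}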
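The parsing function above handles only non-negative integers with a single separator character between them. A slightly more defensive variant is sketched below. The name `parse_ints` is hypothetical, and the handling of newlines, repeated separators and a leading minus sign is an addition that was not part of the original test.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical variant of analyse: parses decimal integers from buf[0..len).
// Any run of non-digit characters acts as a separator; a '-' directly before
// a digit makes the number negative. Overflow is not checked.
std::vector<int> parse_ints(const char *buf, std::size_t len) {
    std::vector<int> out;
    std::size_t i = 0;
    while (i < len) {
        bool negative = false;
        if (buf[i] == '-' && i + 1 < len && buf[i + 1] >= '0' && buf[i + 1] <= '9') {
            negative = true;
            i++;  // now positioned on the first digit
        }
        if (buf[i] < '0' || buf[i] > '9') {  // separator: skip it
            i++;
            continue;
        }
        int value = 0;
        while (i < len && buf[i] >= '0' && buf[i] <= '9')
            value = value * 10 + (buf[i++] - '0');
        out.push_back(negative ? -value : value);
    }
    return out;
}
```

Returning a `std::vector` instead of filling a global array costs some allocation time, so this version trades a little speed for safety; whether that matters would have to be measured.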

The most common way to read an entire file into a string is fread. The code is as follows:

const int MAXN = 10000000;
const int MAXS = 60 * 1024 * 1024;
int numbers[MAXN];
char buf[MAXS];
void fread_analyse() {
    freopen("data.txt", "rb", stdin);
    int len = fread(buf, 1, MAXS, stdin);
    buf[len] = '\0';
    analyse(buf, len);
}

This ran in 0.29 seconds on Linux (see the results table further down), roughly seven times faster than scanf. Another option is the lower-level read function, which needs <fcntl.h> and <unistd.h>:

const int MAXN = 10000000;
const int MAXS = 60 * 1024 * 1024;
int numbers[MAXN];
char buf[MAXS];
void read_analyse() {
    int fd = open("data.txt", O_RDONLY);
    int len = read(fd, buf, MAXS);
    buf[len] = '\0';
    analyse(buf, len);
}

The test shows the run time is still 0.29 seconds, so read has no particular advantage over fread. Is that the end of it? No. Linux also provides the lower-level function mmap, which maps a file into the process's memory so that its contents can be accessed like an ordinary array, without first copying them into a buffer. What happens if we use mmap directly? The code is as follows:

// Needs <fcntl.h>, <sys/mman.h> and <unistd.h>.
const int MAXN = 10000000;
const int MAXS = 60 * 1024 * 1024;
int numbers[MAXN];
char buf[MAXS];
void mmap_analyse() {
    int fd = open("data.txt", O_RDONLY);
    int len = lseek(fd, 0, SEEK_END);
    char *mbuf = (char *) mmap(NULL, len, PROT_READ, MAP_PRIVATE, fd, 0);
    analyse(mbuf, len);
}
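The mmap snippet above omits error handling and never unmaps the file. A sketch with those pieces added is shown below. The helper name `sum_ints_mmap` is hypothetical, it sums the numbers rather than storing them to keep the example short, and it uses `fstat` instead of `lseek` to get the file size, which is a choice made here and not the article's method.

```cpp
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

// Hypothetical helper: maps `path` read-only and returns the sum of the
// non-negative integers in it (any non-digit byte is a separator),
// or -1 if the file cannot be opened, inspected or mapped.
long long sum_ints_mmap(const char *path) {
    int fd = open(path, O_RDONLY);
    if (fd < 0) return -1;

    struct stat st;
    if (fstat(fd, &st) != 0) { close(fd); return -1; }
    if (st.st_size == 0) { close(fd); return 0; }  // mmap rejects a zero length

    void *mapped = mmap(nullptr, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);  // the mapping stays valid after the descriptor is closed
    if (mapped == MAP_FAILED) return -1;

    const char *p = static_cast<const char *>(mapped);
    const char *end = p + st.st_size;  // never read at or past `end`
    long long sum = 0, current = 0;
    for (; p < end; p++) {
        if (*p >= '0' && *p <= '9') {
            current = current * 10 + (*p - '0');
        } else {
            sum += current;
            current = 0;
        }
    }
    sum += current;  // the last number may have no trailing separator

    munmap(mapped, st.st_size);
    return sum;
}
```

The loop bounds itself with the mapped length instead of looking for a terminating zero byte, because a mapped file is not guaranteed to be followed by one.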

In testing, the run time dropped to 0.25 seconds, a further improvement of about 14%. So far I have found no better way to speed up file reading any further. Coming back to Pascal, how fast is it really? The result is surprising: it took as long as 2.16 seconds. The program is as follows:

const
    MAXN = 10000000;
var
    numbers: array[0..MAXN] of longint;
    i: longint;
begin
    assign(input, 'data.txt');
    reset(input);
    for i := 0 to MAXN - 1 do
        read(numbers[i]);
end.

To make sure the results were reliable, I also switched to Windows and repeated the tests. The results are shown in the following table:

Run time in seconds, by method and platform

Method                Linux GCC   Windows MinGW   Windows VC2008
scanf                 2.010       3.704           3.425
cin                   6.380       64.003          19.208
cin (sync disabled)   2.050       6.004           19.616
fread                 0.290       0.241           0.304
read                  0.290       0.398           Not supported
mmap                  0.250       Not supported   Not supported
Pascal read           2.160       4.668

Several things can be seen from the table above.

    1. Programs generally run faster on Linux than on Windows.
    2. Programs compiled with VC on Windows generally run faster than those compiled with MinGW (Minimalist GNU for Windows).
    3. VC is insensitive to whether cin synchronization is disabled; the run time is about the same either way. MinGW, by contrast, is very sensitive to it: the difference is roughly tenfold (64.003 versus 6.004 seconds).
    4. read is a Linux system function; MinGW presumably emulates it in some way, and there read is slower than fread.
    5. The running speed of the Pascal program is really nothing to boast about.
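The ratios behind these observations can be checked directly from the table. The short sketch below does the arithmetic; `slowdown` and `print_ratios` are hypothetical names, and the timings are copied from the table rather than measured again.

```cpp
#include <cstdio>

// Ratio of two timings: how many times slower `slow` is than `fast`.
double slowdown(double slow, double fast) {
    return slow / fast;
}

// Prints a few ratios computed from the timings in the table.
void print_ratios() {
    std::printf("MinGW cin, synced vs unsynced: %.1fx\n", slowdown(64.003, 6.004));
    std::printf("Linux scanf vs fread:          %.1fx\n", slowdown(2.010, 0.290));
    std::printf("Linux fread vs mmap:           %.2fx\n", slowdown(0.290, 0.250));
}
```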
