Golang: performance comparison of reading large files line by line

Source: Internet
Author: User

Objective

What is bufio?
bufio is a package in the Go standard library.

In short:

    • The bufio package implements buffered I/O

    • It wraps an io.Reader or io.Writer object

    • The wrapper (a bufio.Reader or bufio.Writer) adds a buffer and some helper functions for reading and writing text

This article compares the performance of the ReadString and ReadLine methods of bufio.Reader.

Note: the test code discards the content it reads and omits error handling.

ReadString function

ReadString Code:

func ReadString(filename string) {
    f, _ := os.Open(filename)
    defer f.Close()
    r := bufio.NewReader(f)
    for {
        _, err := r.ReadString('\n')
        if err != nil {
            break
        }
    }
}

ReadLine function

ReadLine Code:

func ReadLine(filename string) {
    f, _ := os.Open(filename)
    defer f.Close()
    r := bufio.NewReader(f)
    for {
        _, err := readLine(r)
        if err != nil {
            break
        }
    }
}

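The readLine helper below handles lines longer than 4096 bytes, the default buffer size of a bufio.Reader. ReadLine returns such a line in several pieces and sets isPrefix to true for every piece except the last.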

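func readLine(r *bufio.Reader) (string, error) {
    line, isPrefix, err := r.ReadLine()
    if !isPrefix {
        return string(line), err
    }
    // line points into the reader's internal buffer and is only valid
    // until the next read, so copy it before reading the remaining pieces.
    buf := append([]byte(nil), line...)
    for isPrefix && err == nil {
        var bs []byte
        bs, isPrefix, err = r.ReadLine()
        buf = append(buf, bs...)
    }
    return string(buf), err
}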

Note: every line in the test log file is longer than 4096 bytes.

Performance comparison

The time each approach took to read 10 GB, 20 GB, and 30 GB files is as follows:

Time to read 10G files

readstring:30.717832767s
readline:27.358268244s

Time to read 20G files

readstring:59.937901346s
readline:54.871384854s

Time to read 30G files

readstring:1m21.657831495s
readline:1m13.222376352s

Conclusion

ReadLine read the files roughly 8 to 11 percent faster in these runs. The reason is that ReadString calls ReadBytes internally, and ReadBytes copies the data several times with copy, which takes a significant amount of time.

The test code is as follows:

package main

import (
    "bufio"
    "fmt"
    "os"
    "time"
)

func main() {
    filename := "./log"
    s := time.Now()
    ReadString(filename)
    e1 := time.Now()
    fmt.Printf("readstring:%v\n", e1.Sub(s))
    ReadLine(filename)
    e2 := time.Now()
    fmt.Printf("readline:%v\n", e2.Sub(e1))
}

func ReadString(filename string) {
    f, _ := os.Open(filename)
    defer f.Close()
    r := bufio.NewReader(f)
    for {
        _, err := r.ReadString('\n') // ignore the content
        if err != nil {
            break
        }
    }
}

func ReadLine(filename string) {
    f, _ := os.Open(filename)
    defer f.Close()
    r := bufio.NewReader(f)
    for {
        _, err := readLine(r)
        if err != nil {
            break
        }
    }
}

func readLine(r *bufio.Reader) (string, error) {
    line, isPrefix, err := r.ReadLine()
    if !isPrefix {
        return string(line), err
    }
    // line points into the reader's internal buffer and is only valid
    // until the next read, so copy it before reading the remaining pieces.
    buf := append([]byte(nil), line...)
    for isPrefix && err == nil {
        var bs []byte
        bs, isPrefix, err = r.ReadLine()
        buf = append(buf, bs...)
    }
    return string(buf), err
}

