A Classic Application of Python Generators

Source: Internet
Author: User
Keywords python generator python generator application

Building a Python iterator by hand is powerful but often inconvenient. A generator is simpler: it computes each value on demand while you loop over it, instead of producing all the values up front.
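
As a minimal illustration of "calculating while looping", here is a small generator function. The name `squares` is made up for this sketch; the point is only that each value is produced when the loop asks for it.

```python
def squares(n):
    """Yield the squares of 0..n-1 one at a time instead of building a list."""
    for k in range(n):
        yield k * k  # execution pauses here until the next value is requested


gen = squares(5)
first = next(gen)   # runs the body up to the first yield
rest = list(gen)    # consumes the remaining values
print(first, rest)  # 0 [1, 4, 9, 16]
```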


First, the scenario.

We have a file of about 500 GB that consists of a single line. The records inside it are separated by a custom separator rather than by newline characters. We need to read the records one at a time and write each one to a database.

Some readers will suggest opening the file with open() and iterating over its lines with a for loop.

Like this:

with open("file") as f:
    for i in f.readlines():
        print(i)


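Because the file has only one line, this reads all of the data into memory at once. Nobody has 500 GB of memory to spare, so this approach does not work. Iterating over f directly does not help either, because the single line is still read in one piece.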

Note the detail in the description: the records are separated by a separator. That is our way in.

First, a look at the file.read() method:
1. read() does not have to read the whole file at once. You can pass an integer argument, which is the maximum number of characters to read (for a file opened in text mode).
2. Successive calls continue from where the previous call stopped, so the file is read piece by piece.
With this, our problem can be solved.
An example:

file_phth = "C:/Users/PycharmProjects/test1/test.txt"

with open(file_phth, "r") as f:
    a = f.read(20)  # the first 20 characters
    b = f.read(20)  # the next 20 characters
    print(a, b)


Print results:

Ten ,wang
i lov e you, tu ran hao
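
The same behaviour can be checked without a file on disk. The sketch below uses `io.StringIO` as a stand-in for a file opened in text mode; the sample string is made up for the illustration.

```python
import io

# io.StringIO stands in for a text file opened with open(..., "r").
f = io.StringIO("abcdefghij")

a = f.read(4)  # the first 4 characters
b = f.read(4)  # continues where the previous call stopped
c = f.read(4)  # only 2 characters are left
d = f.read(4)  # at the end of the data, read() returns an empty string
print(repr(a), repr(b), repr(c), repr(d))  # 'abcd' 'efgh' 'ij' ''
```

The empty string returned at the end is what a reading loop can use to detect the end of the file.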


With this method we can read large files piece by piece. Here is the example I wrote:

file_phth = "C:/Users//PycharmProjects/test1/test.txt"

def Myread(f, newline):
    buf = ""  # data that has been read but not yet yielded
    while True:
        while newline in buf:  # the buffer holds at least one complete record
            pos = buf.index(newline)  # index of the first separator
            yield buf[:pos]  # yield the record before the separator
            buf = buf[pos + len(newline):]  # drop that record and its separator
        chunk = f.read(200)  # read up to 200 characters at a time
        if not chunk:  # an empty string means the end of the file
            yield buf  # yield whatever follows the last separator
            break
        buf = buf + chunk  # append the new chunk to the leftover data

with open(file_phth, "r") as f:
    for i in Myread(f, newline="{|}"):
        print(i)

Let me explain how it works. This is a classic use of a generator.

The inner while loop yields every complete record currently in the buffer, that is, everything before each separator. Whatever follows the last separator stays in the buffer and is joined with the next 200-character chunk, so a record that spans two reads is still returned whole.

The purpose of the if statement is:
When read() returns nothing, the end of the file has been reached. The generator yields whatever follows the last separator and then ends the loop.
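
To see the buffering and the end-of-file branch in action, the sketch below is a variant of the function above with two assumptions of mine: it is renamed `read_records`, and the chunk size is a parameter so that a tiny value can force a separator to be cut in half. `io.StringIO` again stands in for the real file.

```python
import io


def read_records(f, sep, chunk_size=200):
    """Yield sep-delimited records from f, reading chunk_size characters at a time."""
    buf = ""
    while True:
        while sep in buf:
            pos = buf.index(sep)
            yield buf[:pos]
            buf = buf[pos + len(sep):]
        chunk = f.read(chunk_size)
        if not chunk:
            yield buf  # text after the last separator (may be empty)
            break
        buf += chunk


data = "alpha{|}beta{|}gamma"
# chunk_size=6 makes the first read return "alpha{", cutting the separator in two.
records = list(read_records(io.StringIO(data), "{|}", chunk_size=6))
print(records)  # ['alpha', 'beta', 'gamma']

# A trailing separator produces a final empty record.
trailing = list(read_records(io.StringIO("a{|}b{|}"), "{|}", chunk_size=3))
print(trailing)  # ['a', 'b', '']
```

Note the second case: if the data ends with a separator, the last value yielded is an empty string, so the caller may want to skip empty records before writing to the database.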

The final for loop iterates over the generator, and each value it produces can be inserted directly into the database.
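
The article does not say which database is used, so the following is only a sketch under assumptions: it uses the standard-library `sqlite3` module with an in-memory database, a made-up table named `records`, and a hypothetical helper `insert_records`. A small generator expression stands in for the record generator, and rows are written in batches so that only one batch is held in memory at a time.

```python
import sqlite3
from itertools import islice


def insert_records(conn, records, batch_size=1000):
    """Insert records from any iterable in batches; return the number inserted."""
    it = iter(records)
    total = 0
    while True:
        # Pull at most batch_size records from the iterable.
        batch = [(r,) for r in islice(it, batch_size)]
        if not batch:
            break
        conn.executemany("INSERT INTO records (value) VALUES (?)", batch)
        conn.commit()
        total += len(batch)
    return total


conn = sqlite3.connect(":memory:")  # in-memory database, for illustration only
conn.execute("CREATE TABLE records (value TEXT)")

# Stand-in for the generator that reads records from the large file.
source = (f"row{i}" for i in range(5))
inserted = insert_records(conn, source, batch_size=2)

rows = [row[0] for row in conn.execute("SELECT value FROM records ORDER BY rowid")]
print(inserted, rows)  # 5 ['row0', 'row1', 'row2', 'row3', 'row4']
```

Because `insert_records` accepts any iterable, the generator that reads the file can be passed to it directly, and the file is never fully loaded.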

Takeaway: a large file cannot be loaded into memory in one go. Read it in chunks to keep memory usage low.