The data we used previously was typed into the console by hand. When processing data in Python, we usually need to work with data stored in files, so we need to be able to read a file. Let's look at how to read a file in Python.
First, prepare a file to read. Here I put the file in D:\python\file under the name sketch.txt. The file content is as follows:
Man: Is this the right room for an argument?
Other Man: I've told you once.
Man: No you haven't!
Other Man: Yes I have.
Man: When?
Other Man: Just now.
Man: No you didn't!
Other Man: Yes I did!
Man: You didn't!
Other Man: I'm telling you, I did!
Man: You did not!
Other Man: Oh I'm sorry, is this a five minute argument, or the full half hour?
Man: Ah! (taking out his wallet and paying) Just the five minutes.
Other Man: Just the five minutes. Thank you.
Other Man: Anyway, I did.
Man: You most certainly did not!
Other Man: Now let's get one thing quite clear: I most definitely told you!
Man: Oh no you didn't!
Other Man: Oh yes I did!
Man: Oh no you didn't!
Other Man: Oh yes I did!
Man: Oh look, this isn't an argument!
(pause)
Other Man: Yes it is!
Man: No it isn't!
(pause)
Man: It's just contradiction!
Other Man: No it isn't!
Man: It IS!
Other Man: It is NOT!
Man: You just contradicted me!
Other Man: No I didn't!
Man: You DID!
Other Man: No no no!
Man: You did just then!
Other Man: Nonsense!
Man: (exasperated) Oh, this is futile!!
(pause)
Other Man: No it isn't!
Man: Yes it is!
Read files
The first task is to point the Python shell at the folder that contains our file, so that the file can be found by name. We can do this with the following commands:
>>> import os
>>> os.getcwd()
'D:\\Python33'
>>> os.chdir('d:\\python\\file')
>>> os.getcwd()
'd:\\python\\file'
>>>
Here we first import the os module, then call os.chdir() to switch to the folder we need, and finally call os.getcwd() to confirm the change. In my case the folder is d:\python\file.
Then we can try to read the content of the file. Here we first read one line at a time, as shown below:
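The same steps can be tried on any machine with a short script. This is a minimal sketch that uses a temporary directory as a stand-in for d:\python\file (the names work_dir and sketch_path are only illustrative). It also shows an alternative that avoids changing the working directory at all: building the full path to the file with os.path.join.

```python
import os
import tempfile

# Hypothetical stand-in for d:\python\file so the sketch runs anywhere.
work_dir = tempfile.mkdtemp()

original_dir = os.getcwd()      # remember where we started
os.chdir(work_dir)              # switch the current working directory
current_dir = os.getcwd()       # confirm the switch

# Alternative that leaves the working directory alone:
# build the full path and pass it to open() later.
sketch_path = os.path.join(work_dir, 'sketch.txt')

os.chdir(original_dir)          # switch back when done
```

Changing the directory is convenient in an interactive shell, while a full path is often easier to follow inside a script.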
>>> data = open('sketch.txt')
>>> print(data.readline(),end='')
Man: Is this the right room for an argument?
>>> print(data.readline(),end='')
Other Man: I've told you once.
>>>
Here we read the first two lines. Python file objects provide other methods as well, such as seek(), which can return to the beginning of the file:
>>> data.seek(0)
0
>>>
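To see how readline() and seek() interact, here is a small self-contained sketch. It assumes nothing about your disk layout: it writes a two-line stand-in for sketch.txt into a temporary directory first (the variable names are only illustrative).

```python
import os
import tempfile

# Write a two-line stand-in for sketch.txt into a temporary directory.
path = os.path.join(tempfile.mkdtemp(), 'sketch.txt')
with open(path, 'w') as f:
    f.write("Man: Is this the right room for an argument?\n"
            "Other Man: I've told you once.\n")

data = open(path)
first = data.readline()         # reads up to and including the first newline
second = data.readline()        # continues where the previous call stopped
position = data.seek(0)         # back to the start; returns the new position
first_again = data.readline()   # the first line is read again
data.close()
```

Because each line returned by readline() still ends with a newline character, the console examples pass end='' to print() to avoid printing a blank line after each one.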
Then we can output all the content in the file with a for loop:
>>> for each_line in data:
    print(each_line,end='')

Man: Is this the right room for an argument?
Other Man: I've told you once.
Man: No you haven't!
Other Man: Yes I have.
Man: When?
Other Man: Just now.
Man: No you didn't!
Other Man: Yes I did!
Man: You didn't!
Other Man: I'm telling you, I did!
Man: You did not!
Other Man: Oh I'm sorry, is this a five minute argument, or the full half hour?
Man: Ah! (taking out his wallet and paying) Just the five minutes.
Other Man: Just the five minutes. Thank you.
Other Man: Anyway, I did.
Man: You most certainly did not!
Other Man: Now let's get one thing quite clear: I most definitely told you!
Man: Oh no you didn't!
Other Man: Oh yes I did!
Man: Oh no you didn't!
Other Man: Oh yes I did!
Man: Oh look, this isn't an argument!
(pause)
Other Man: Yes it is!
Man: No it isn't!
(pause)
Man: It's just contradiction!
Other Man: No it isn't!
Man: It IS!
Other Man: It is NOT!
Man: You just contradicted me!
Other Man: No I didn't!
Man: You DID!
Other Man: No no no!
Man: You did just then!
Other Man: Nonsense!
Man: (exasperated) Oh, this is futile!!
(pause)
Other Man: No it isn't!
Man: Yes it is!
>>>
In this way, all the content in the file is read and processed. Finally, don't forget to close the currently opened file:
>>> data.close()
>>>
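Forgetting close() is easy, so Python also offers the with statement, which closes the file automatically when the block ends. This is not what the walkthrough above uses; it is shown here only as a common alternative, again with a temporary stand-in file.

```python
import os
import tempfile

# Temporary stand-in for sketch.txt.
path = os.path.join(tempfile.mkdtemp(), 'sketch.txt')
with open(path, 'w') as f:
    f.write("Man: When?\nOther Man: Just now.\n")

lines = []
with open(path) as data:        # the file is closed when this block ends
    for each_line in data:
        lines.append(each_line)

closed_after_block = data.closed   # True: no explicit close() was needed
```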
Data Processing
After reading the data, we can do some simple processing on it. The data above is a conversation between two people, in the following format:
Man: Is this the right room for an argument?
Each line starts with the speaker's name, followed by a colon and then what was said. We can use this colon to separate the speaker from the speech with the split method, as follows:
each_line.split(":")
This splits a line into two parts at the colon. split returns a list, and we unpack its items into two variables, as shown below:
(role,line_spoke) = each_line.split(":")
In this way, the speaker is saved to role and the speech is saved to line_spoke. We rewrite the for loop above as follows:
>>> data = open("sketch.txt")
>>> for each_line in data:
    (role,line_spoke) = each_line.split(":")
    print(role,end='')
    print(' said: ',end='')
    print(line_spoke,end='')

Man said: Is this the right room for an argument?
Other Man said: I've told you once.
Man said: No you haven't!
Other Man said: Yes I have.
Man said: When?
Other Man said: Just now.
Man said: No you didn't!
Other Man said: Yes I did!
Man said: You didn't!
Other Man said: I'm telling you, I did!
Man said: You did not!
Other Man said: Oh I'm sorry, is this a five minute argument, or the full half hour?
Man said: Ah! (taking out his wallet and paying) Just the five minutes.
Other Man said: Just the five minutes. Thank you.
Other Man said: Anyway, I did.
Man said: You most certainly did not!
Traceback (most recent call last):
  File "", line 2, in <module>
    (role,line_spoke) = each_line.split(":")
ValueError: too many values to unpack (expected 2)
>>>
Here we can see that an error occurred while reading the data. The last line printed is Man said: You most certainly did not!, so the error was raised by the line that follows it. Let's look at that line in the file:
Other Man: Now let's get one thing quite clear: I most definitely told you!
We can see that this line contains two colons. Splitting it at every colon produces three parts, but we only provide two variables to unpack into, so the error too many values to unpack (expected 2) is reported. How can we fix it?
What we want is to split only at the first colon, the one after the speaker's name. How do we express that? Let's check the built-in documentation of split to see whether it offers a solution:
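The failure can be reproduced on that single line without opening the file. A minimal sketch (the try/except is only there so that the error message can be inspected instead of stopping the script):

```python
each_line = "Other Man: Now let's get one thing quite clear: I most definitely told you!"

parts = each_line.split(":")    # two colons, so the list has three items

try:
    (role, line_spoke) = parts  # three items cannot fit into two variables
    error_message = None
except ValueError as err:
    error_message = str(err)    # the "too many values to unpack" message
```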
>>> help(each_line.split)
Help on built-in function split:

split(...)
    S.split(sep=None, maxsplit=-1) -> list of strings

    Return a list of the words in S, using sep as the
    delimiter string. If maxsplit is given, at most maxsplit
    splits are done. If sep is not specified or is None, any
    whitespace string is a separator and empty strings are
    removed from the result.

>>>
Here we can see that there is a maxsplit parameter, which limits the number of splits performed. Setting it to 1 splits the string at most once, at the first colon, which yields at most two parts. The code above is modified as follows:
data = open('sketch.txt')

for each_line in data:
    (role,line_spoke) = each_line.split(":",1)
    print(role,end='')
    print(' said: ',end='')
    print(line_spoke,end='')
data.close()
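The effect of maxsplit can also be checked on a single string, independently of the file. A small sketch using the line that caused the earlier error:

```python
each_line = "Other Man: Now let's get one thing quite clear: I most definitely told you!"

# With maxsplit=1 only the first colon is used as a separator,
# so the second colon stays inside the spoken text.
(role, line_spoke) = each_line.split(":", 1)
part_count = len(each_line.split(":", 1))
```

Note that line_spoke keeps the space that followed the first colon, because split removes only the separator itself.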
Running the modified script gives the following result:
>>> ================================ RESTART ================================
>>> 
Man said: Is this the right room for an argument?
Other Man said: I've told you once.
Man said: No you haven't!
Other Man said: Yes I have.
Man said: When?
Other Man said: Just now.
Man said: No you didn't!
Other Man said: Yes I did!
Man said: You didn't!
Other Man said: I'm telling you, I did!
Man said: You did not!
Other Man said: Oh I'm sorry, is this a five minute argument, or the full half hour?
Man said: Ah! (taking out his wallet and paying) Just the five minutes.
Other Man said: Just the five minutes. Thank you.
Other Man said: Anyway, I did.
Man said: You most certainly did not!
Other Man said: Now let's get one thing quite clear: I most definitely told you!
Man said: Oh no you didn't!
Other Man said: Oh yes I did!
Man said: Oh no you didn't!
Other Man said: Oh yes I did!
Man said: Oh look, this isn't an argument!
Traceback (most recent call last):
  File "D:\python\file\sketch.py", line 4, in <module>
    (role,line_spoke) = each_line.split(":",1)
ValueError: need more than 1 value to unpack
>>>
This time an error occurs again, but it is a different one. It occurs after the line Man said: Oh look, this isn't an argument! In the file, the line after that one is:
(pause)
This line contains no colon, so split returns a single item and unpacking it into two variables fails. Therefore, before splitting, we should first check whether the current line contains a colon. We can do this with the string method find, which is used as follows:
>>> each_line = "Hello World"
>>> each_line.find(":")
-1
>>> each_line = "Man:Hello!"
>>> each_line.find(":")
3
>>>
If the colon is not found, -1 is returned; otherwise, the index of the first colon is returned. With this, we can modify the previous code as follows:
data = open('sketch.txt')

for each_line in data:
    if not each_line.find(":")==-1:
        (role,line_spoke) = each_line.split(":",1)
        print(role,end='')
        print(' said: ',end='')
        print(line_spoke,end='')
data.close()
In the if condition, we use not to negate the comparison, so the body runs only when find does not return -1, that is, when the line contains a colon. The result is as follows:
>>> ================================ RESTART ================================
>>> 
Man said: Is this the right room for an argument?
Other Man said: I've told you once.
Man said: No you haven't!
Other Man said: Yes I have.
Man said: When?
Other Man said: Just now.
Man said: No you didn't!
Other Man said: Yes I did!
Man said: You didn't!
Other Man said: I'm telling you, I did!
Man said: You did not!
Other Man said: Oh I'm sorry, is this a five minute argument, or the full half hour?
Man said: Ah! (taking out his wallet and paying) Just the five minutes.
Other Man said: Just the five minutes. Thank you.
Other Man said: Anyway, I did.
Man said: You most certainly did not!
Other Man said: Now let's get one thing quite clear: I most definitely told you!
Man said: Oh no you didn't!
Other Man said: Oh yes I did!
Man said: Oh no you didn't!
Other Man said: Oh yes I did!
Man said: Oh look, this isn't an argument!
Other Man said: Yes it is!
Man said: No it isn't!
Man said: It's just contradiction!
Other Man said: No it isn't!
Man said: It IS!
Other Man said: It is NOT!
Man said: You just contradicted me!
Other Man said: No I didn't!
Man said: You DID!
Other Man said: No no no!
Man said: You did just then!
Other Man said: Nonsense!
Man said: (exasperated) Oh, this is futile!!
Other Man said: No it isn't!
Man said: Yes it is!
>>>
We can see that all data is displayed according to our rules.
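As a possible refinement, the colon check and the split can be wrapped in a small helper. The sketch below is only one way to do it, not part of the script above: parse_line is a hypothetical name, the check uses the in operator instead of find, and strip() is an added step that removes the leading space and trailing newline from the speech.

```python
def parse_line(each_line):
    """Return (role, line_spoke), or None if the line has no colon."""
    if ':' not in each_line:            # same test as find(":") == -1
        return None
    role, line_spoke = each_line.split(':', 1)
    # strip() is an addition here: it drops the space after the colon
    # and the newline at the end of the line.
    return role, line_spoke.strip()

# Three sample lines taken from sketch.txt.
sample = [
    "Man: Oh look, this isn't an argument!\n",
    "(pause)\n",
    "Other Man: Now let's get one thing quite clear: I most definitely told you!\n",
]
parsed = [parse_line(line) for line in sample]
```

Returning None for lines such as (pause) leaves it to the caller to decide whether to skip them or handle them in some other way.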