While working through the book Machine Learning in Action and typing in its programs, I ran into many errors. This note summarizes the errors in the code on page 66 of the book.
wordList = textParse(open('email/ham/%d.txt' % i).read())
Error raised: UnicodeDecodeError: 'gbk' codec can't decode byte 0xae in position 199: illegal multibyte sequence
Searching Baidu for the cause turned up the same report: the line wordList = textParse(open('email/ham/%d.txt' % i).read())
fails when reading the file under Python 3 with UnicodeDecodeError: 'gbk' codec can't decode byte 0xae in position 199: illegal multibyte sequence
Most of the material online says this is a file encoding problem, so I tried UTF-8, GBK, ASCII and various other encodings one after another, but none of them solved the problem.
Then I read the error message carefully. According to "decode byte 0xae in position 199", a single byte in the file cannot be decoded, so the real problem is that the file contains an illegal character.
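To make the "byte 0xae in position 199" reasoning concrete, here is a minimal sketch of how one might locate the offending byte. The helper name find_bad_byte and the sample bytes are made up for illustration and are not from the book; for a real file you would pass in the result of open(path, 'rb').read().

```python
def find_bad_byte(data, encoding="gbk"):
    """Return (position, byte_value) of the first undecodable byte, or None."""
    try:
        data.decode(encoding)
    except UnicodeDecodeError as err:
        # err.start is the index of the first byte the codec rejected
        return err.start, data[err.start]
    return None


# Hypothetical sample: plain ASCII text with one stray 0xae byte in it
sample = b"SciFinance\xae is a tool"
result = find_bad_byte(sample)
if result is not None:
    position, value = result
    print("byte %s at position %d" % (hex(value), position))
```

Reading the file in binary mode avoids the decode step altogether, so the position reported here can be looked up directly in the raw bytes.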
One post said: "When I opened the file, the second line did have a "" character mixed in, where there should have been normal text. I don't know why it changed after I put the file into Eclipse, but after I deleted it, everything was fine."
However, I searched for a long time and could not find the "" character in my copy. The solution I eventually found through Baidu is as follows:
Solution: open email\ham\23.txt, find "SciFinance", and replace the character there that cannot be decoded with a space.
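As an alternative to editing the data file by hand, one could tell open() how to handle undecodable bytes. This is only a sketch and not the fix described above: the helper name read_text_lenient is hypothetical, the temporary file merely stands in for email\ham\23.txt, and errors="ignore" silently drops any byte that does not decode, which is acceptable only if losing such characters does not matter.

```python
import os
import tempfile


def read_text_lenient(path, encoding="utf-8"):
    """Read a text file, dropping any bytes that cannot be decoded."""
    with open(path, encoding=encoding, errors="ignore") as f:
        return f.read()


# Stand-in for the problem file: ASCII text containing one stray 0xae byte
with tempfile.NamedTemporaryFile(delete=False, suffix=".txt") as tmp:
    tmp.write(b"SciFinance\xae is a tool")
    demo_path = tmp.name

text = read_text_lenient(demo_path)
os.remove(demo_path)
print(text)
```

Passing an explicit encoding also means the result no longer depends on the platform default, which is where the 'gbk' in the error message comes from on a Chinese-locale Windows system.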
But then another error appeared:
del(trainingSet[randIndex])
TypeError: 'range' object doesn't support item deletion
Offending line: del(trainingSet[randIndex])
The reason is that in Python 3, range() no longer returns a list; it returns a range object, which is immutable and does not support deleting items.
Fix:
Replace trainingSet = range(50) with trainingSet = list(range(50))
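The effect of wrapping range in list can be seen in a small sketch of a hold-out split like the one this line belongs to. The function name split_indices, its parameters and the seed are assumptions for illustration, and random.randrange is used here for picking the index, which may differ from how the book's code picks it.

```python
import random


def split_indices(n=50, n_test=10, seed=None):
    """Randomly move n_test indices out of range(n) into a test set."""
    rng = random.Random(seed)
    training_set = list(range(n))  # a list supports del; a bare range does not
    test_set = []
    for _ in range(n_test):
        rand_index = rng.randrange(len(training_set))
        test_set.append(training_set[rand_index])
        del training_set[rand_index]
    return training_set, test_set


train, test = split_indices(seed=0)
print(len(train), len(test))
```

With the default arguments, 10 of the 50 indices end up in the test set and the remaining 40 stay in the training set.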