Correct code:
import urllib.request
url = "71025372?utm_source=itdadao&utm_medium=referral"
request = urllib.request.Request(url=url, method="GET")
response = urllib.request.urlopen(request)
html = response.read().decode("utf8")  # read() returns bytes; decode them to str
f = open("x.html", "w", encoding="utf8")  # explicit encoding instead of the platform default
f.write(html)
f.close()
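1: urllib.request.urlopen(url).read() returns the content as bytes by default, so it has to be decoded to get a str.
2: open(filename, mode, encoding=...) opens a file. In text mode, if encoding is omitted, Python uses the platform's default encoding, which is typically GBK on a Chinese-locale Windows system.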
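The two notes above can be checked without a network connection. This is only a minimal sketch: the byte string raw is a made-up stand-in for a real response body, and the default encoding it prints depends on the platform and locale.

```python
import locale

# Hypothetical stand-in for what urlopen(url).read() returns: a bytes object.
raw = "caf\u00e9\u00a0page".encode("utf8")
print(type(raw))

# decode() turns the bytes into a str using the named encoding.
html = raw.decode("utf8")
print(type(html))

# Encoding that text-mode open() normally falls back to when encoding= is omitted.
# The value varies by platform and locale.
default_encoding = locale.getpreferredencoding(False)
print(default_encoding)
```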
Error message:
UnicodeEncodeError: 'gbk' codec can't encode character '\xa0' in position 23869: illegal multibyte sequence
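The error can be reproduced without writing a file at all, since it comes from encoding the text as GBK. A small sketch, assuming a made-up sample string that contains the same '\xa0' character (a no-break space, which often appears in HTML pages):

```python
text = "price:\xa0100"  # sample string; \xa0 is a no-break space

# GBK cannot represent \xa0, so encoding fails the same way the file write does.
try:
    text.encode("gbk")
    gbk_ok = True
except UnicodeEncodeError as err:
    gbk_ok = False
    print(err)

# UTF-8 can represent every Unicode character, so this succeeds.
utf8_bytes = text.encode("utf8")
print(utf8_bytes)
```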
How to resolve:
f = open("x.html", "w", encoding="utf8") specifies the encoding when opening the file. Alternatively, f = open("x.html", "wb") opens the file for binary writing, in which case you write bytes (for example the undecoded result of read()) instead of a str.
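Both options can be compared side by side. The sketch below is illustrative only: the sample string, the temporary directory, and the file names x_text.html and x_binary.html are assumptions made for the example, not part of the original code.

```python
import os
import tempfile

html = "<p>a\xa0b</p>"        # sample text containing a no-break space
workdir = tempfile.mkdtemp()  # temporary directory, used only for this sketch

# Option 1: text mode with an explicit encoding; Python encodes the str for you.
text_path = os.path.join(workdir, "x_text.html")
with open(text_path, "w", encoding="utf8") as f:
    f.write(html)

# Option 2: binary mode; encode the str to bytes yourself before writing.
binary_path = os.path.join(workdir, "x_binary.html")
with open(binary_path, "wb") as f:
    f.write(html.encode("utf8"))

# For text without newlines, both files contain the same bytes.
with open(text_path, "rb") as f:
    text_bytes = f.read()
with open(binary_path, "rb") as f:
    binary_bytes = f.read()
print(text_bytes == binary_bytes)
```

Using a with block also closes the file automatically, so no explicit f.close() is needed.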
Try to keep the encoding consistent everywhere when working with files: decode, write, and read back with the same encoding.
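To see why consistency matters, the following sketch writes a string as UTF-8 and reads it back twice, once with the matching encoding and once with a different one. The sample string and the file name sample.txt are made up for the illustration.

```python
import os
import tempfile

text = "a\xa0b"  # sample text containing a no-break space
path = os.path.join(tempfile.mkdtemp(), "sample.txt")

# Write with an explicit encoding.
with open(path, "w", encoding="utf8") as f:
    f.write(text)

# Reading with the same encoding gives back the original text.
with open(path, "r", encoding="utf8") as f:
    same = f.read()

# Reading with a different encoding raises no error here, but the text is wrong.
with open(path, "r", encoding="latin-1") as f:
    mismatched = f.read()

print(same == text)
print(mismatched == text)
```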