Problem: Node.js currently cannot decode non-UTF-8 text on its own
In the CNodeJS user group, some people ran into garbled text when fetching Baidu pages.
Buffer.toString(encoding) handles UTF-8 and a few other built-in encodings, but not Chinese encodings such as GBK. An additional module is therefore required to handle this problem.
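To make the problem concrete, here is a small sketch of what happens when GBK bytes are decoded as UTF-8. The four byte values are the GBK encoding of "百度"; they are sample data chosen for illustration, not taken from the original page.

```javascript
// GBK bytes for the two characters "百度" (illustrative sample data).
var gbkBytes = Buffer.from([0xb0, 0xd9, 0xb6, 0xc8]);

// Buffer has no built-in GBK decoder.
var gbkSupported = Buffer.isEncoding('gbk');

// Decoding the bytes as UTF-8 yields garbage rather than the original text.
var garbled = gbkBytes.toString('utf8');

console.log(gbkSupported);        // false
console.log(garbled === '百度');  // false
```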
Solution: node-iconv Module
Installation:
$ npm install iconv
Example
var http = require('http');
var options = {
  host: 'www.baidu.com',
  port: 80,
  path: '/s?wd=nodejs'
};
var Iconv = require('iconv').Iconv;

http.get(options, function (res) {
  console.log("Got response: " + res.statusCode, res.headers);
  var buffers = [], size = 0;
  res.on('data', function (buffer) {
    buffers.push(buffer);
    size += buffer.length;
  });
  res.on('end', function () {
    var buffer = new Buffer(size), pos = 0;
    for (var i = 0, l = buffers.length; i …
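Since the listing above breaks off partway through, here is a minimal sketch of how the remaining steps might look: collect the chunks, join them, and convert the result to UTF-8. The helper name collectAndConvert, its converter parameter, and the use of Buffer.concat in place of a manual copy loop are assumptions made for illustration, not the author's code. The GBK source charset in the usage comment is likewise assumed.

```javascript
// Hypothetical helper: gathers the body of a response stream and converts it.
// `converter` is any object with a convert(buffer) method that returns a
// Buffer, such as an Iconv instance from the iconv module.
function collectAndConvert(res, converter, callback) {
  var chunks = [];

  res.on('data', function (chunk) {
    chunks.push(chunk);
  });

  res.on('end', function () {
    var text;
    try {
      // Join all chunks first so multi-byte characters split across
      // chunk boundaries are converted intact.
      text = converter.convert(Buffer.concat(chunks)).toString('utf8');
    } catch (err) {
      // Conversion can fail on bytes that are invalid in the source charset.
      return callback(err);
    }
    callback(null, text);
  });
}

// Possible usage (assumes `npm install iconv` and that the page is GBK):
//   var Iconv = require('iconv').Iconv;
//   http.get(options, function (res) {
//     collectAndConvert(res, new Iconv('GBK', 'UTF-8'), function (err, html) {
//       if (err) throw err;
//       console.log(html);
//     });
//   });
```

Passing the converter in as a parameter keeps the helper independent of any particular charset, so the same function can be reused once the page encoding is known.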
The actual page encoding can be determined from res.headers['content-type'].
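As a rough sketch of reading the charset from that header, the function below extracts the charset parameter with a regular expression. The name charsetFromContentType and the sample header values are hypothetical.

```javascript
// Hypothetical helper: pulls the charset parameter out of a Content-Type value.
// Returns the charset in lower case, or null when none is declared.
function charsetFromContentType(contentType) {
  if (!contentType) return null;
  var match = /charset\s*=\s*"?([^\s;"]+)/i.exec(contentType);
  return match ? match[1].toLowerCase() : null;
}

console.log(charsetFromContentType('text/html; charset=GB2312')); // gb2312
console.log(charsetFromContentType('text/html'));                 // null
```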
If res.headers['content-type'] is not available, you need to inspect the Content-Type meta tag in the HTML to determine the charset:
<meta http-equiv="Content-Type" content="text/html; charset=XXXX" />
For other URL-related requests, consider using the urllib library.
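For the meta tag fallback described above, a minimal sketch might scan the start of the raw HTML for a charset declaration before any conversion takes place. The name charsetFromMeta and the 1024-byte scan window are assumptions, and the approach presumes an ASCII-compatible page encoding (it would not work for UTF-16).

```javascript
// Hypothetical helper: looks for a charset declaration in the first bytes of
// an HTML document. For ASCII-compatible encodings the markup itself is plain
// ASCII, so decoding the head as latin1 is enough to search it before the
// real encoding is known.
function charsetFromMeta(body) {
  var head = body.subarray(0, 1024).toString('latin1');
  var match = /<meta[^>]*charset\s*=\s*["']?([^\s"'\/>;]+)/i.exec(head);
  return match ? match[1].toLowerCase() : null;
}

var sample = Buffer.from(
  '<html><head><meta http-equiv="Content-Type" ' +
  'content="text/html; charset=gb2312" /></head></html>'
);
console.log(charsetFromMeta(sample)); // gb2312
```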
Hope this article will be useful to you ^_^