Recently, while crawling data from the Huodongshu (Activity Tree) site (http://www.huodongshu.com/html/find.html), I found that after entering Chinese in the search box and clicking Search, PhantomJS could not retrieve any data, whereas the IE driver could. Only later did I find out why.
For example, with the URL http://www.huodongshu.com/html/find_search.html?search_keyword=数字 (数字 means "numbers"), the URL that PhantomJS actually loaded became http://www.huodongshu.com/html/find_search.html?search_keyword= with the Chinese keyword lost, so the search returned 0 results, that is, nothing was found.
Entering English in the search box works fine; strangely, Chinese input turns into ??. Later, on the Huodongxing (Activity Line) site (http://www.huodongxing.com/), I entered 数字 directly and saw it become %E6%95%B0%E5%AD%97 in the URL.
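To see where a string like %E6%95%B0%E5%AD%97 comes from, here is a minimal Python 3 sketch, assuming the keyword is encoded as UTF-8 (which matches the bytes observed above). The helper name percent_encode_utf8 is made up for illustration; unlike a real URL-quoting function, it escapes every byte, including plain ASCII.

```python
def percent_encode_utf8(text):
    """Escape every UTF-8 byte of text as %XX (illustrative helper, not a library API)."""
    return ''.join('%{:02X}'.format(byte) for byte in text.encode('utf-8'))

keyword = '数字'
# Each of the two characters occupies three bytes in UTF-8.
print(keyword.encode('utf-8'))
# Each byte is then written as a percent sign followed by two hex digits.
print(percent_encode_utf8(keyword))
```

Each Chinese character here becomes three %XX groups, which is why the two-character keyword turns into six groups in the URL.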
That made me think: if the Chinese keyword were converted to a percent-encoded form such as %E6%95%B0%E5%AD%97, PhantomJS might be able to find it. For example:
url = 'http://www.huodongshu.com/html/find_search.html?search_keyword=%E6%95%B0%E5%AD%97'
A test confirmed that the search returns results. So when crawling data with PhantomJS, URL-encoding the Chinese search keyword solves the problem.
There are two specific methods:
Method 1 (Python 2):
    # -*- coding: utf-8 -*-
    import urllib
    s = '数字'
    # quote() percent-encodes the UTF-8 bytes of the keyword
    print urllib.quote(s)
The result is: %E6%95%B0%E5%AD%97
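For readers on Python 3, where quote lives in urllib.parse, the same idea might look like the sketch below. BASE_URL and build_search_url are names introduced here for illustration and are not from the original post; the URL itself is the one used above.

```python
from urllib.parse import quote, unquote

# Search endpoint taken from the example above.
BASE_URL = 'http://www.huodongshu.com/html/find_search.html'

def build_search_url(keyword):
    """Return the search URL with the keyword percent-encoded as UTF-8 (hypothetical helper)."""
    return BASE_URL + '?search_keyword=' + quote(keyword)

url = build_search_url('数字')
print(url)

# Decoding the query value again recovers the original keyword.
print(unquote(url.split('=', 1)[1]))
```

The resulting string can then be handed to the browser driver's get() call in place of the raw Chinese URL.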