1. The target page is http://www.baicmotor.com/dealer.php
2. Inspecting the page in Firefox shows that the site does not load its data as JSON; the dealer list is plain HTML in the page itself.
3. Use PyQuery from the pyquery library to parse the HTML.
Page Style:
Code:
css_select =
page =
page =
page = page.replace(,
page = page.replace(,
d =
dealer_list =
dealer_div
p = dealer_div.findall(
dealer =
len(p)==1
len(p)==6
strp =
dealer[Constant.CITY] = p[1
strc = p[2
dealer[Constant.PROVINCE] =
dealer[Constant.CITY] = p[1
dealer[Constant.NAME] = p[2
dealer[Constant.ADDRESSTYPE] = p[3
dealer[Constant.ADDRESS] = p[4
dealer[Constant.TELPHONE] = p[5
len(p)==5
p[0].text.strip() != u
dealer[Constant.PROVINCE] =
dealer[Constant.CITY] =
dealer[Constant.NAME] = p[1
dealer[Constant.ADDRESSTYPE] = p[2
dealer[Constant.ADDRESS] = p[3
dealer[Constant.TELPHONE] = p[4
len(p)==3
self.saver.commit()
4. The final code runs successfully: the dealer data is extracted and saved to Excel.
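
The row-by-row mapping in the code above can be sketched roughly as follows. This is only an illustration under several assumptions, not the original script: the sample markup is invented, the lowercase field names are hypothetical stand-ins for the Constant.* keys, and the reading that a 6-column row starts a new province while a 5-column row inherits it (and inherits the city too when its first cell is empty) is a guess based on the len(p) checks. To keep it runnable without extra packages it uses the standard library's xml.etree.ElementTree on a small well-formed fragment instead of PyQuery; the findall and .text calls are the same element API that the original code uses on the elements PyQuery yields.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-ins for the Constant.* keys in the original code.
PROVINCE, CITY, NAME = "province", "city", "name"
ADDRESSTYPE, ADDRESS, TELPHONE = "address_type", "address", "telephone"

# Invented sample markup; the real page layout is an assumption here.
SAMPLE = """
<div>
  <div><p>Dealer list</p></div>
  <div><p>Hebei</p><p>Shijiazhuang</p><p>Dealer A</p><p>4S</p><p>Road 1</p><p>0311-0000</p></div>
  <div><p>Tangshan</p><p>Dealer B</p><p>4S</p><p>Road 2</p><p>0315-0000</p></div>
  <div><p></p><p>Dealer C</p><p>4S</p><p>Road 3</p><p>0315-1111</p></div>
</div>
"""


def text(element):
    """Return the stripped text of an element, or '' when it is empty."""
    return (element.text or "").strip()


def parse_dealers(markup):
    """Turn each dealer row into a dict, carrying province/city forward."""
    root = ET.fromstring(markup.strip())
    dealers = []
    province = city = ""
    for row in root.findall("div"):
        p = row.findall("p")
        if len(p) == 6:
            # Assumed: a full row names both the province and the city.
            province, city = text(p[0]), text(p[1])
            rest = p[2:]
        elif len(p) == 5:
            # Assumed: the province is inherited; an empty first cell
            # means the city is inherited as well.
            if text(p[0]) != "":
                city = text(p[0])
            rest = p[1:]
        else:
            continue  # rows with any other column count hold no dealer
        name, address_type, address, phone = (text(cell) for cell in rest)
        dealers.append({
            PROVINCE: province, CITY: city, NAME: name,
            ADDRESSTYPE: address_type, ADDRESS: address, TELPHONE: phone,
        })
    return dealers


if __name__ == "__main__":
    for dealer in parse_dealers(SAMPLE):
        print(dealer)
```

Carrying the province and city forward in local variables keeps each output row complete, which is convenient when the rows are later written to a spreadsheet one line at a time.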