This case may not be especially clever. A friend told me that his company had asked him to scrape hotel price information from Ctrip. I took a look and found Ctrip troublesome to scrape. In the search form the city is required and the hotel name is optional, and the URL you are redirected to carries a number after the city. I could not work out the rule for which number represents which city, so I could only target one specific city or simulate a browser, and both felt like a lot of work. The hotel page itself also contains a lot of things that are a headache to sort through. I told him it was troublesome and that the analysis had already taken a long time. He said that his company enters the URLs of the hotel price detail pages into a database by hand and then fetches the price data directly from each of those pages.
# coding=utf-8
# Written for Python 2 and a Selenium release that still ships the PhantomJS driver.
import sys
reload(sys)
sys.setdefaultencoding("utf-8")
from selenium import webdriver

# Assume there is a whole list of hotel detail URLs here
urls = ['http://hotels.ctrip.com/hotel/848702.html#ctm_ref=hod_sr_lst_dl_n_2_1']

class xc():
    def pc(self):
        results = []
        for url in urls:
            driver = webdriver.PhantomJS()
            try:
                driver.get(url)
                # The first line of the first room block is the room type
                fangx_1 = driver.find_element_by_class_name('room_unfold').text.split('\n')[0]
                jiage_1 = driver.find_element_by_class_name('base_price').text
                results.append(fangx_1 + '|' + jiage_1)  # room type and corresponding price
            finally:
                driver.quit()  # always shut the browser down, even on errors
        return results

s = xc()
for line in s.pc():
    print line
Results: Single room (no window) |¥237
The code above is only a simple example. Getting every room type and its price this way would mean analyzing each one separately, which is too much trouble. Then I found a JSON blob near the bottom of the page source whose content is exactly the room types, prices and so on, so I changed the code:
# coding=utf-8
# Written for Python 2 and a Selenium release that still ships the PhantomJS driver.
import sys
reload(sys)
sys.setdefaultencoding("utf-8")
from selenium import webdriver

urls = ['http://hotels.ctrip.com/hotel/848702.html#ctm_ref=hod_sr_lst_dl_n_2_1']

class xc():
    def pc(self):
        results = []
        for url in urls:
            driver = webdriver.PhantomJS()
            try:
                driver.get(url)
                # fangx_1 = driver.find_element_by_class_name('room_unfold').text.split('\n')[0]
                # jiage_1 = driver.find_element_by_class_name('base_price').text
                # The room list is stored in the value attribute of this element
                data = driver.find_element_by_xpath('//*[@id="htl_detail_htl_hotel"]').get_attribute('value')
                results.append(data)
            finally:
                driver.quit()  # always shut the browser down, even on errors
        return results

s = xc()
for line in s.pc():
    print line
Results:
pageid=102003;ht=848702;checkin=2016-05-09;checkout=2016-05-10;rmlist=[
{"rm": "30665921", "Shadowid": "0", "RPFQ": "0.0", "RPFH": "219", "PT": "FG", "MT": "0.0", "PN": "0.0", "Promotiontype": "0", "iscomfirm": "F", "Bedtype": "Big Bed", "breakfast": "0", "policy": "Free Cancellation", "Guaranteetype": "F", "BK": "T", "Isgift": "F", "Isgroup": "F"},
{"rm": "30265080", "Shadowid": "0", "RPFQ": "0.0", "RPFH": "263", "PT": "FG", "MT": "0.0", "PN": "0.0", "Promotiontype": "0", "iscomfirm": "F", "Bedtype": "Big Bed", "breakfast": "0", "policy": "Non-cancellation", "Guaranteetype": "T", "BK": "T", "Isgift": "F", "Isgroup": "F"},
{"rm": "24125027", "Shadowid": "0", "RPFQ": "0.0", "RPFH": "294", "PT": "FG", "MT": "0.0", "PN": "0.0", "Promotiontype": "0", "iscomfirm": "F", "Bedtype": "Queen bed", "breakfast": "0", "policy": "Free Cancellation", "Guaranteetype": "F", "BK": "T", "Isgift": "F", "Isgroup": "F"},
{"rm": "8684722", "Shadowid": "0", "RPFQ": "0.0", "RPFH": "294", "PT": "FG", "MT": "0.0", "PN": "0.0", "Promotiontype": "0", "iscomfirm": "F", "Bedtype": "Big Bed", "breakfast": "0", "policy": "Non-cancellation", "Guaranteetype": "T", "BK": "F", "Isgift": "F", "Isgroup": "F"},
{"rm": "30265081", "Shadowid": "0", "RPFQ": "0.0", "RPFH": "219", "PT": "FG", "MT": "0.0", "PN": "0.0", "Promotiontype": "0", "iscomfirm": "F", "Bedtype": "Twin", "breakfast": "0", "policy": "Non-cancellation", "Guaranteetype": "T", "BK": "T", "Isgift": "F", "Isgroup": "F"},
{"rm": "8684723", "Shadowid": "0", "RPFQ": "0.0", "RPFH": "294", "PT": "FG", "MT": "0.0", "PN": "0.0", "Promotiontype": "0", "iscomfirm": "F", "Bedtype": "Twin", "breakfast": "0", "policy": "Non-cancellation", "Guaranteetype": "T", "BK": "F", "Isgift": "F", "Isgroup": "F"},
{"rm": "30265075", "Shadowid": "0", "RPFQ": "0.0", "RPFH": "237", "PT": "FG", "MT": "0.0", "PN": "0.0", "Promotiontype": "0", "iscomfirm": "F", "Bedtype": "Single Bed", "breakfast": "0", "policy": "Non-cancellation", "Guaranteetype": "T", "BK": "T", "Isgift": "F", "Isgroup": "F"},
{"rm": "24125024", "Shadowid": "0", "RPFQ": "0.0", "RPFH": "265", "PT": "FG", "MT": "0.0", "PN": "0.0", "Promotiontype": "0", "iscomfirm": "F", "Bedtype": "Single Bed", "breakfast": "0", "policy": "Free Cancellation", "Guaranteetype": "F", "BK": "T", "Isgift": "F", "Isgroup": "F"},
{"rm": "2890470", "Shadowid": "0", "RPFQ": "0.0", "RPFH": "265", "PT": "FG", "MT": "0.0", "PN": "0.0", "Promotiontype": "0", "iscomfirm": "F", "Bedtype": "Single Bed", "breakfast": "0", "policy": "Non-cancellation", "Guaranteetype": "T", "BK": "F", "Isgift": "F", "Isgroup": "F"},
{"rm": "30265074", "Shadowid": "0", "RPFQ": "0.0", "RPFH": "254", "PT": "FG", "MT": "0.0", "PN": "0.0", "Promotiontype": "0", "iscomfirm": "F", "Bedtype": "Big Bed", "breakfast": "0", "policy": "Non-cancellation", "Guaranteetype": "T", "BK": "T", "Isgift": "F", "Isgroup": "F"},
{"rm": "24125041", "Shadowid": "0", "RPFQ": "0.0", "RPFH": "284", "PT": "FG", "MT": "0.0", "PN": "0.0", "Promotiontype": "0", "iscomfirm": "F", "Bedtype": "Big Bed", "breakfast": "0", "policy": "Free Cancellation", "Guaranteetype": "F", "BK": "T", "Isgift": "F", "Isgroup": "F"},
{"rm": "2890480", "Shadowid": "0", "RPFQ": "0.0", "RPFH": "284", "PT": "FG", "MT": "0.0", "PN": "0.0", "Promotiontype": "0", "iscomfirm": "F", "Bedtype": "Big Bed", "breakfast": "0", "policy": "Non-cancellation", "Guaranteetype": "T", "BK": "F", "Isgift": "F", "Isgroup": "F"},
{"rm": "30265072", "Shadowid": "0", "RPFQ": "0.0", "RPFH": "280", "PT": "FG", "MT": "0.0", "PN": "0.0", "Promotiontype": "0", "iscomfirm": "F", "Bedtype": "Twin Bed", "breakfast": "0", "policy": "Non-cancellation", "Guaranteetype": "T", "BK": "T", "Isgift": "F", "Isgroup": "F"},
{"rm": "24125016", "Shadowid": "0", "RPFQ": "0.0", "RPFH": "313", "PT": "FG", "MT": "0.0", "PN": "0.0", "Promotiontype": "0", "iscomfirm": "F", "Bedtype": "Twin Bed", "breakfast": "0", "policy": "Free Cancellation", "Guaranteetype": "F", "BK": "T", "Isgift": "F", "Isgroup": "F"},
{"rm": "2525661", "Shadowid": "0", "RPFQ": "0.0", "RPFH": "313", "PT": "FG", "MT": "0.0", "PN": "0.0", "Promotiontype": "0", "iscomfirm": "F", "Bedtype": "Twin Bed", "breakfast": "0", "policy": "Non-cancellation", "Guaranteetype": "T", "BK": "F", "Isgift": "F", "Isgroup": "F"},
{"rm": "30265079", "Shadowid": "0", "RPFQ": "0.0", "RPFH": "305", "PT": "FG", "MT": "0.0", "PN": "0.0", "Promotiontype": "0", "iscomfirm": "F", "Bedtype": "Big Bed", "breakfast": "0", "policy": "Non-cancellation", "Guaranteetype": "T", "BK": "T", "Isgift": "F", "Isgroup": "F"},
{"rm": "2525665", "Shadowid": "0", "RPFQ": "0.0", "RPFH": "341", "PT": "FG", "MT": "0.0", "PN": "0.0", "Promotiontype": "0", "iscomfirm": "F", "Bedtype": "Queen bed", "breakfast": "0", "policy": "Non-cancellation", "Guaranteetype": "T", "BK": "F", "Isgift": "F", "Isgroup": "F"},
{"rm": "30265077", "Shadowid": "0", "RPFQ": "0.0", "RPFH": "305", "PT": "FG", "MT": "0.0", "PN": "0.0", "Promotiontype": "0", "iscomfirm": "F", "Bedtype": "Twin", "breakfast": "0", "policy": "Non-cancellation", "Guaranteetype": "T", "BK": "T", "Isgift": "F", "Isgroup": "F"},
{"rm": "24125021", "Shadowid": "0", "RPFQ": "0.0", "RPFH": "341", "PT": "FG", "MT": "0.0", "PN": "0.0", "Promotiontype": "0", "iscomfirm": "F", "Bedtype": "Twin", "breakfast": "0", "policy": "Free Cancellation", "Guaranteetype": "F", "BK": "T", "Isgift": "F", "Isgroup": "F"},
{"rm": "8684720", "Shadowid": "0", "RPFQ": "0.0", "RPFH": "341", "PT": "FG", "MT": "0.0", "PN": "0.0", "Promotiontype": "0", "iscomfirm": "F", "Bedtype": "Twin", "breakfast": "0", "policy": "Non-cancellation", "Guaranteetype": "T", "BK": "T", "Isgift": "F", "Isgroup": "F"}]
RPFH is the price, and Bedtype is the room type.
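To actually use those two fields, the returned string still has to be split apart. The following is only a minimal sketch of one way to do that, not part of the original script: it assumes the string keeps the shape shown in the results (key=value fields separated by semicolons, with a JSON array after "rmlist="), and the function name parse_room_list and the shortened sample string are made up for illustration. Key casing is inconsistent in the dump, so the sketch lowercases the keys before looking them up. Unlike the scraper, it needs no Selenium and is written for Python 3.

```python
import json


def parse_room_list(raw):
    """Return (bed type, price) pairs from the attribute string.

    Assumption: the room data is a JSON array that starts right after
    the text "rmlist=", as in the results shown above.
    """
    marker = "rmlist="
    start = raw.find(marker)
    if start == -1:
        return []  # no room list in this string
    # raw_decode parses one JSON value starting at the given index and
    # ignores whatever text may follow it.
    rooms, _ = json.JSONDecoder().raw_decode(raw, start + len(marker))
    pairs = []
    for room in rooms:
        # Lowercase the keys so the lookup does not depend on key casing.
        room = {key.lower(): value for key, value in room.items()}
        pairs.append((room.get("bedtype"), float(room["rpfh"])))
    return pairs


# Hypothetical, shortened sample in the same shape as the real output.
sample = (
    'pageid=102003;ht=848702;checkin=2016-05-09;checkout=2016-05-10;'
    'rmlist=[{"rm":"30665921","RPFH":"219","Bedtype":"Big Bed"},'
    '{"rm":"30265075","RPFH":"237","Bedtype":"Single Bed"}]'
)

for bed_type, price in parse_room_list(sample):
    print("{}|{:g}".format(bed_type, price))
# prints:
# Big Bed|219
# Single Bed|237
```

Converting the price with float() is a guess that the field always holds a plain number; if Ctrip ever sends something else there, that line would need its own error handling.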
Python crawler: Case two: Ctrip Hotel Price Information