How to get JSON data from a Web page

Source: Internet
Author: User

When crawling images with Python, you sometimes cannot find the image URLs in the page source; in that case the page may be loading them as JSON. So how do you parse JSON data with Python? As a beginner, I read through a few forum threads and wrote up this summary to reinforce my own understanding.
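As a minimal sketch of what "parsing JSON" means here, the standard `json` module turns JSON text into ordinary Python dictionaries and lists. The sample text below is hypothetical (the example.com addresses are made up); only the `data` and `thumbURL` keys mirror the ones the scripts in this article look up.

```python
import json

# Hypothetical sample; a real response carries many more fields per entry.
sample_text = (
    '{"data": ['
    '{"thumbURL": "http://example.com/a.jpg"}, '
    '{"thumbURL": "http://example.com/b.jpg"}, '
    '{}]}'
)

parsed = json.loads(sample_text)   # JSON object -> Python dict
entries = parsed.get('data', [])   # JSON array  -> Python list of dicts

# Keep the address from every non-empty entry.
thumb_urls = [entry.get('thumbURL') for entry in entries if entry]

print(thumb_urls)
```

The same lookups apply once the text comes from a real HTTP response instead of a string literal.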

1. requests.get(url, params): requesting the data with GET

    import requests


    def get_many_pages(keyword, page):
        params = []  # query parameters, one dictionary per page
        # The results load dynamically, 30 per page; pn is the result offset (0, 30, 60, ...)
        for i in range(0, 30 * page, 30):
            params.append({
                'tn': 'resultjson_com',
                'ipn': 'rj',
                'ct': 201326592,
                'is': '',
                'fp': 'result',
                'queryWord': keyword,
                'cl': 2,
                'lm': -1,
                'ie': 'utf-8',
                'oe': 'utf-8',
                'adpicid': '',
                'st': -1,
                'z': '',
                'ic': '',
                'word': keyword,
                's': '',
                'se': '',
                'tab': '',
                'width': '',
                'height': '',
                'face': 0,
                'istype': 2,
                'qc': '',
                'nc': '',
                'fr': '',
                'pn': i,
                'rn': 30,
                'gsm': '1e',
                '1517048369666': '',
            })  # the JSON request's query string parameters are dynamic
        json_url = 'https://image.baidu.com/search/acjson'  # base address of the JSON request
        json_datas = []  # collects the JSON data of all pages
        for param in params:  # take each page's parameter dictionary in turn
            res = requests.get(json_url, params=param)  # request the JSON address
            res.encoding = 'utf-8'  # decode the response as UTF-8
            # Parse the JSON into a dictionary; get() looks up the value stored under 'data'
            json_data = res.json().get('data')
            json_datas.append(json_data)  # keep the data of every page
        return json_datas


    def get_url(json_datas):
        for each_data in json_datas:  # unpack the outer list (one item per page)
            for each_dict in each_data:  # unpack the inner list until we reach the dictionaries
                each_url = each_dict.get('thumbURL')  # get the address stored in the dictionary
                print(each_url)


    datalist = get_many_pages('暴漫表情包', 3)  # keyword: "rage comic meme pack"
    get_url(datalist)
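The nested loops above assume that every page came back as a list of dictionaries that all hold an address. A slightly more defensive sketch is shown below; `extract_thumb_urls` is a hypothetical helper name, and the sample data is made up to imitate the shape of what `get_many_pages` returns, including a page with no data and an empty dictionary.

```python
def extract_thumb_urls(json_datas, key='thumbURL'):
    """Flatten a list of pages (each a list of dicts) into a flat list of URLs."""
    urls = []
    for page in json_datas:
        if not page:  # page is None or empty when a response had no 'data'
            continue
        for entry in page:
            url = entry.get(key)
            if url:  # skip empty dicts and entries without the key
                urls.append(url)
    return urls


# Hypothetical data shaped like the return value of get_many_pages
pages = [
    [{'thumbURL': 'http://example.com/1.jpg'}, {}],
    None,
    [{'thumbURL': 'http://example.com/2.jpg'}],
]
print(extract_thumb_urls(pages))
```

Returning a list instead of printing inside the loop also makes it easier to pass the addresses on to a download step.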


2. urllib.request + json: requesting the data with GET

    # -*- coding: utf-8 -*-
    """
    Created on Sat Jan 22:39:15 2018

    @author: Zhuxueming
    """
    import urllib.request
    import json


    def get_many_pages(page):
        json_datas = []
        # pn is the result offset (0, 30, 60, ...), 30 results per page
        for i in range(0, 30 * page, 30):
            # The URL contains many '%' characters, so it cannot be filled in with
            # %-formatting; str.format is used instead. Comparing the JSON addresses
            # of different pages, only one number changes from page to page, so
            # only that value (pn here) is substituted.
            json_url = (
                'http://image.baidu.com/search/acjson?tn=resultjson_com&ipn=rj'
                '&ct=201326592&is=&fp=result'
                '&queryWord=%E6%9A%B4%E6%BC%AB%E8%A1%A8%E6%83%85%E5%8C%85'
                '&cl=2&lm=-1&ie=utf-8&oe=utf-8&adpicid=&st=&z=&ic='
                '&word=%e6%9a%b4%e6%bc%ab%e8%a1%a8%e6%83%85%e5%8c%85'
                '&s=&se=&tab=&width=&height=&face=&istype=&qc=&nc=&fr='
                '&pn={0}&rn=30&gsm=3c&1517056200441='
            ).format(i)
            res = urllib.request.urlopen(json_url)  # request the URL
            html = res.read().decode('utf-8')  # read the data and decode it as UTF-8
            json_data = json.loads(html).get('data')  # convert the JSON text into a dictionary
            json_datas.append(json_data)  # collect the data of the different pages
        return json_datas


    def get_url(json_datas):
        for each_data in json_datas:  # unpack the outer list (one item per page)
            for each_dict in each_data:  # unpack the inner list until we reach the dictionaries
                each_url = each_dict.get('thumbURL')  # get the address stored in the dictionary
                print(each_url)


    datalist = get_many_pages(3)
    get_url(datalist)
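The remark about `%` in the URL can be checked on a small scale. The sketch below uses a shortened, hypothetical query string rather than the full address: with %-formatting, Python tries to interpret sequences such as `%E6` and `%9A` as format specifiers and raises an error, while `str.format` only looks at `{0}`.

```python
# Shortened, hypothetical query strings containing percent-encoded bytes
template_percent = 'queryWord=%E6%9A%B4&pn=%d'
template_format = 'queryWord=%E6%9A%B4&pn={0}'

try:
    template_percent % 30
    percent_ok = True
except (TypeError, ValueError):
    # The percent-encoded bytes are mistaken for format specifiers
    percent_ok = False

formatted = template_format.format(30)  # only {0} is replaced

print(percent_ok)
print(formatted)
```

Doubling every `%` as `%%` would also make %-formatting work, but on a URL this long that is easy to get wrong.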

In short, both approaches work. The second one cannot take a search keyword directly: for a different keyword you have to modify the parameters in the JSON address by hand, though that is no great harm. The hard part is finding the dynamic JSON request in the first place; it is generally listed under the JS or XHR requests in the browser's developer tools.
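One possible way to lift the keyword restriction of the second approach is to build the query string with `urllib.parse.urlencode`, which percent-encodes the keyword itself. This is only a sketch: `build_json_url` is a hypothetical helper, the parameter set is copied from the address used in this article, and whether the server still accepts these parameters is not verified here.

```python
from urllib.parse import urlencode

BASE_URL = 'http://image.baidu.com/search/acjson'


def build_json_url(keyword, pn):
    """Hypothetical helper: assemble the JSON address for a keyword and offset."""
    params = {
        'tn': 'resultjson_com', 'ipn': 'rj', 'ct': 201326592, 'is': '',
        'fp': 'result', 'queryWord': keyword, 'cl': 2, 'lm': -1,
        'ie': 'utf-8', 'oe': 'utf-8', 'adpicid': '', 'st': '', 'z': '',
        'ic': '', 'word': keyword, 's': '', 'se': '', 'tab': '',
        'width': '', 'height': '', 'face': '', 'istype': '', 'qc': '',
        'nc': '', 'fr': '', 'pn': pn, 'rn': 30, 'gsm': '3c',
        '1517056200441': '',
    }
    # urlencode percent-encodes the keyword as UTF-8 and joins the pairs with '&'
    return BASE_URL + '?' + urlencode(params)


# '暴漫表情包' is the keyword that the percent-encoded queryWord in the article's URL decodes to
url = build_json_url('暴漫表情包', 60)
print(url)
```

The resulting string can be passed to `urllib.request.urlopen` in place of the hand-written address.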
