How to access Sina Weibo data

Source: Internet
Author: User

Whether you are doing research on microblogs or developing related applications, you may need to obtain historical or real-time data. How can you get it? Besides the APIs that Sina Weibo provides to developers, you can also use the search function (see this article) to collect data.


For historical data, Sina Weibo's search interface is weaker than Twitter's, but the site does provide a search function.


For real-time data, Sina is still relatively conservative. Three interfaces are relevant: Public_timeline, topics, and Nearby_timeline, which collect, respectively, real-time public posts, real-time posts on a topic, and real-time posts around a location. What is missing is a real-time search interface for a keyword at a given location. These interfaces have a number of limitations, but there is an alternative: use the search function to collect posts from one hour ago, restricted by keyword, location, and so on. The rest of this article describes how to obtain microblog data from two angles: historical and real-time.

Data Collection Ideas

Collect data with keywords

Historical data is important in scientific research, especially in social media studies. But historical data comes with requirements of its own: first, it must be relevant to the topic, and second, it must fall within a specific time period. With these two requirements in mind, you can use the advanced search interface for Weibo (see http://s.weibo.com/).


The main steps are: crawl the web pages, parse them, and store the useful information.


It is worth noting that we could parse all the information on the web page, including some information about the post and the user, but the page carries relatively little of it. Consider instead extracting only the ID of each microblog post from the page and then using the API to return all the information for that post, including the user information.
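
The ID-only approach can be sketched roughly as follows. This is a minimal illustration, assuming each post in a search result page carries a `mid="..."` attribute holding a numeric ID; that attribute name and the sample HTML are assumptions, so inspect the real page before relying on them. Downloading the page and calling the API are left out.

```python
import re

# Assumption: each post container exposes its ID as mid="<digits>".
MID_PATTERN = re.compile(r'\bmid="(\d+)"')


def extract_post_ids(html):
    """Return the post IDs found in html, in page order, without duplicates."""
    seen = set()
    ids = []
    for mid in MID_PATTERN.findall(html):
        if mid not in seen:
            seen.add(mid)
            ids.append(mid)
    return ids


# Hypothetical fragment standing in for a downloaded search result page.
SAMPLE_HTML = '''
<div class="post" mid="3500000000000001">first post</div>
<div class="post" mid="3500000000000002">second post</div>
<div class="repost" mid="3500000000000001">repeat of the first</div>
'''

post_ids = extract_post_ids(SAMPLE_HTML)
print(post_ids)
```

Storing only the deduplicated IDs keeps the crawler simple; the full post and user records can then be requested from the API one ID at a time, within the rate limits discussed later.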


Collect real-time/historical GPS data

Real-time data is mainly useful in applications, such as monitoring the public response to emergencies in real time.


There are two main approaches. One path to real-time data is to use the APIs that Weibo provides; this is not covered in detail here, since calling the API is enough. The other is to use the search function mentioned above to collect near real-time data from one hour ago.
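
For the search-based approach, one way to read "one hour ago" is the last complete clock hour before the current time. The sketch below computes that window and packs it into a parameter dictionary. The parameter names (`keyword`, `start`, `end`) and the hour-granularity time format are hypothetical placeholders, not the actual query format of the Weibo search page.

```python
from datetime import datetime, timedelta


def previous_hour_window(now):
    """Return (start, end) of the last complete clock hour before `now`."""
    end = now.replace(minute=0, second=0, microsecond=0)
    start = end - timedelta(hours=1)
    return start, end


def build_search_params(keyword, now):
    """Build search parameters for the previous hour.

    The keys and the time format are hypothetical; map them to whatever
    the real search page expects.
    """
    start, end = previous_hour_window(now)
    return {
        "keyword": keyword,
        "start": start.strftime("%Y-%m-%d-%H"),
        "end": end.strftime("%Y-%m-%d-%H"),
    }


params = build_search_params("earthquake", datetime(2013, 4, 20, 9, 37))
print(params)
```

Running such a job once per hour yields windows that do not overlap, so consecutive runs do not collect the same post twice.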


Problems encountered

Weibo API access rights and rate limits

That is, the number of API calls per hour is limited. The Sina Weibo API uses OAuth2 for authorization and enforces strict access rights and corresponding rate limits. The formal process is to apply for an Appkey and obtain an Appsecret, which come with the corresponding permissions and rate limits. Of course, there are other ways around this limitation. There is more than one way to get OAuth2 authorization (refer to http://netment.iteye.com/blog/945402). That article mentions that authorization can be done in username-and-password mode, so we can construct something similar to:


grant_type=password&client_id=s6bhdrkqt3&client_secret=47hdu8s&username=johndoe&password=a3ddj3w

This request is sent to obtain an access token. At this point, you will need an Appkey and Appsecret with advanced permissions. Fortunately, the Appkey and Appsecret of many official Weibo apps and clients can be found on the web (see http://fengmk2.github.io/blog/appkey.html).
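
The parameter string shown earlier can be assembled with the standard library instead of by hand, which also takes care of URL-encoding special characters in passwords. This is a minimal sketch that only builds the request body; the function name is made up for illustration, the values are the placeholder ones from the example, and whether the token endpoint accepts this grant type depends on the permissions of the app.

```python
from urllib.parse import urlencode


def build_password_grant_body(client_id, client_secret, username, password):
    """Return the form-encoded body for an OAuth2 password-mode token request."""
    return urlencode({
        "grant_type": "password",
        "client_id": client_id,          # the Appkey
        "client_secret": client_secret,  # the Appsecret
        "username": username,
        "password": password,
    })


# Placeholder values taken from the example in the text.
body = build_password_grant_body("s6bhdrkqt3", "47hdu8s", "johndoe", "a3ddj3w")
print(body)
```

The resulting string is what would be sent as the body of a POST request to the token endpoint; sending it is left out of this sketch.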

