A very concise Python web crawler that automatically fetches stock data from Yahoo Finance

Source: Internet
Author: User
Tags: python web crawler

This program is written in Python 2.7.6 and extends the HTMLParser class that ships with Python. Given a preset list of stock codes, it automatically crawls the following data from Yahoo Finance: date, stock name, real-time quote, daily change rate, daily low, and daily high.

This works because each value on a Yahoo Finance stock page has a corresponding ID.

For example, for the Nasdaq 100 ETF (QQQ), http://finance.yahoo.com/q?s=QQQ, the HTML markup for the real-time quote is

<span id="yfs_l84_qqq">87.49</span>

and for the S&P 500 index ETF (SPY), http://finance.yahoo.com/q?s=SPY, the HTML markup for the real-time quote is

<span id="yfs_l84_spy">187.25</span>
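The two examples above suggest that both the page URL and the element ID can be derived from the stock code. A minimal sketch of that derivation follows; the helper names quote_url and quote_element_id are hypothetical, and the choice of an upper-case ticker in the URL and a lower-case ticker in the ID is an assumption based only on the two examples shown here.

```python
# Historical URL pattern and ID prefix taken from the examples in this article.
QUOTE_URL_TEMPLATE = "http://finance.yahoo.com/q?s={ticker}"
QUOTE_ID_PREFIX = "yfs_l84_"


def quote_url(ticker):
    """Build the quote page URL for a stock code (hypothetical helper)."""
    return QUOTE_URL_TEMPLATE.format(ticker=ticker.upper())


def quote_element_id(ticker):
    """Build the id of the element holding the real-time quote (assumed lower-case)."""
    return QUOTE_ID_PREFIX + ticker.lower()


for code in ["QQQ", "SPY"]:
    print(quote_url(code), quote_element_id(code))
```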

The crawler therefore looks up data by the corresponding ID string. Specifically, it first subclasses HTMLParser and then overrides the handle_data(self, data) method in that subclass to find the HTML tag containing the corresponding ID string (for example, the ID string for the real-time quote is "yfs_l84_" + the stock code) and to output the data inside that tag (for QQQ's <span id="yfs_l84_qqq">87.49</span>, the data 87.49 is the real-time quote).
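The subclassing approach described above can be sketched roughly as follows. This is a minimal illustration, not the author's actual source: the class name QuoteParser and the sample HTML string are hypothetical, the ID match is done in handle_starttag (the article itself only mentions handle_data), and the import falls back from Python 3's html.parser to the Python 2 HTMLParser module that the article uses. It parses a hard-coded snippet rather than fetching a live page.

```python
try:
    from html.parser import HTMLParser  # Python 3
except ImportError:
    from HTMLParser import HTMLParser   # Python 2, as used in the article


class QuoteParser(HTMLParser):
    """Collect the text inside the element whose id equals target_id."""

    def __init__(self, target_id):
        HTMLParser.__init__(self)
        self.target_id = target_id
        self.in_target = False   # True while inside the matching element
        self.value = None        # first text found inside that element

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the opening tag.
        if dict(attrs).get("id") == self.target_id:
            self.in_target = True

    def handle_endtag(self, tag):
        # Simplification: any closing tag ends the capture.
        self.in_target = False

    def handle_data(self, data):
        # Keep only the first piece of text seen inside the target element.
        if self.in_target and self.value is None:
            self.value = data.strip()


# Hypothetical page fragment modelled on the markup quoted above.
sample_html = '<div><span id="yfs_l84_qqq">87.49</span></div>'

parser = QuoteParser("yfs_l84_" + "qqq")
parser.feed(sample_html)
print(parser.value)  # prints 87.49
```

The same class could presumably be reused for the other fields by passing a different ID string, provided each value on the page has its own ID as the article states.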


Sample output:

The columns are, in order: data date, stock ticker, stock name, real-time quote, daily change rate, daily low, daily high.

Date        Ticker  Stock name                                 Quote   Change  Low     High
05/05/2014  IBB     iShares Nasdaq Biotechnology (IBB)         233.28  1.85%   225.34  233.28
05/05/2014  SOCL    Global X Social Media Index ETF (SOCL)     17.48   0.17%   17.12   17.53
05/05/2014  PNQI    PowerShares NASDAQ Internet (PNQI)         62.61   0.35%   61.46   62.74
05/05/2014  XSD     SPDR S&P Semiconductor ETF (XSD)           67.15   0.12%   66.20   67.41
05/05/2014  ITA     iShares US Aerospace & Defense (ITA)       110.34  1.15%   108.62  110.56
05/05/2014  IAI     iShares US Broker-Dealers (IAI)            37.42   -0.21%  36.86   37.42
05/05/2014  VBK     Vanguard Small Cap Growth ETF (VBK)        119.97  -0.03%  118.37  120.09
05/05/2014  QQQ     PowerShares QQQ (QQQ)                      87.95   0.53%   86.76   87.97
05/05/2014  EWI     iShares MSCI Italy Capped (EWI)            17.86   -0.56%  17.65   17.89
05/05/2014  DFE     WisdomTree Europe SmallCap Dividend (DFE)  62.33   -0.11%  61.94   62.39
05/05/2014  PBD     PowerShares Global Clean Energy (PBD)      13.03   0.00%   12.97   13.05
05/05/2014  EIRL    iShares MSCI Ireland Capped (EIRL)         38.52   -0.16%  38.39   38.60


Source code for this program:
https://bitbucket.org/lsz/html-parser

Official documentation for HTMLParser:
https://docs.python.org/2/library/htmlparser.html

HTMLParser (parsing HTML document elements):
http://blog.csdn.net/hxsstar/article/details/17241709
