Write a shell script to grab stock history data using wget

Source: Internet
Author: User

Today my big data boss gave me a task: grab historical stock data. I looked up wget on the Internet and found that it is a really powerful Linux download tool; I was deeply impressed. What follows is a record of today's process, which was rather bumpy.

First, I used Hive to query all the stock codes from the company's existing stock data and export them to local files:

hive -e "use stock; select distinct secucode from T_stock_tick_shsz where type='sz';" >> sz_secucode.txt
hive -e "use stock; select distinct secucode from T_stock_tick_shsz where type='sh';" >> sh_secucode.txt

PS: I hit a small problem in the step above. At first I left out the keyword distinct, so the later crawl fetched data for a lot of duplicate stock codes.
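If a code list has already been exported without distinct, the duplicates can also be removed on the shell side. The following is only a sketch under that assumption; the function name dedupe_codes is hypothetical and not part of the original workflow:

```shell
# Hypothetical helper: print the unique, non-empty stock codes from a list file.
dedupe_codes() {
    # $1: file with one stock code per line
    # Strip stray spaces, tabs and carriage returns, drop empty lines,
    # then sort and keep one copy of each code.
    tr -d ' \t\r' < "$1" | grep -v '^$' | sort -u
}
```

It could be used as, for example, dedupe_codes sh_secucode.txt > sh_secucode.uniq.txt, where the output file name is again just an illustration.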

At first I was lazy and wanted to just paste a one-line wget command, but there are too many stock codes, so I wrote a script. The shell script is as follows:

#!/bin/bash
# Download stock history for the Shanghai Stock Exchange
for i in $(cat sh_secucode.txt)
do
    # -O writes the response to the given file; the URL is quoted so the shell does not interpret "&"
    wget --user-agent="Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US) AppleWebKit/534.16 (KHTML, like Gecko) Chrome/10.0.648.204 Safari/534.16" -nv --tries=5 --timeout=5 -O /home/bigdata/script/zj/sh_history/history_data/$i.csv "http://quotes.money.163.com/service/chddata.html?code=0$i&end=20130430"
    sleep 1s
done

#!/bin/bash
# Download stock history for the Shenzhen Stock Exchange
for i in $(cat sz_secucode.txt)
do
    wget --user-agent="Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.2.3) Gecko/20100401 Firefox/3.6.3 (.NET CLR 3.5.30729)" -nv --tries=5 --timeout=5 -O /home/bigdata/script/zj/sz_history/history_data/$i.csv "http://quotes.money.163.com/service/chddata.html?code=1$i&end=20130430"
    sleep 1s
done

PS: About the code above: why does wget need the user-agent parameter? Anyone who has played with crawlers knows that when you download from a website too frequently, the site may recognize you as a crawler and refuse to serve its resources. Setting a user agent disguises the requests as coming from a browser, which lowers the probability of being detected. And why add a sleep? My concern was that a larger file might not finish downloading if the next request followed only a few milliseconds later; the pause also spaces the requests out. The files on my side are only a few hundred KB each, so 1s is enough.
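The two loops differ only in the code list, the output directory, and the digit placed in front of the stock code (0 for Shanghai, 1 for Shenzhen, as in the article's URLs), so they could be folded into one function. This is a minimal sketch, not the script that was actually run: the names build_url, fetch_history and DRY_RUN are hypothetical, while the user agent, URL and end date are copied from the article.

```shell
#!/bin/bash
# Sketch only: build_url, fetch_history and DRY_RUN are hypothetical names.
UA="Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US) AppleWebKit/534.16 (KHTML, like Gecko) Chrome/10.0.648.204 Safari/534.16"
BASE_URL="http://quotes.money.163.com/service/chddata.html"
END_DATE="20130430"

# build_url PREFIX CODE
# PREFIX is the digit prepended to the stock code (0 or 1 in the article).
build_url() {
    printf '%s?code=%s%s&end=%s\n' "$BASE_URL" "$1" "$2" "$END_DATE"
}

# fetch_history PREFIX CODE_FILE OUT_DIR
# Reads one stock code per line and saves each response as OUT_DIR/CODE.csv.
# With DRY_RUN=1 it only prints the URLs and downloads nothing.
fetch_history() {
    local prefix=$1 code_file=$2 out_dir=$3 code
    while IFS= read -r code; do
        [ -n "$code" ] || continue            # skip blank lines
        if [ "${DRY_RUN:-0}" = "1" ]; then
            build_url "$prefix" "$code"
            continue
        fi
        wget --user-agent="$UA" -nv --tries=5 --timeout=5 \
             -O "$out_dir/$code.csv" "$(build_url "$prefix" "$code")"
        sleep 1                               # pause between requests
    done < "$code_file"
}
```

A call such as fetch_history 0 sh_secucode.txt /home/bigdata/script/zj/sh_history/history_data would then cover the Shanghai list; running it first with DRY_RUN=1 shows which URLs would be requested.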

Finally, I ran the script. As I write this article it is still running; I hope it goes smoothly! O(∩_∩)O
