How do I write a Python script that captures a company's annual reports from Sina Finance?

Source: Internet
Author: User
Tags excel power
I'm an accounting major and my graduation thesis deadline is imminent. The thesis is an empirical study of food companies' accounting information and stock prices, and I currently need to collect nearly five years of financial results for 100 food companies from Sina Finance. Done manually, that means taking each listed company's stock code from the SFC's "2014 Q4 Listed Companies Industry Classification Results", entering it into the search box on the "Stock Home _ Sina Finance" page, and then, on the selected company's page (e.g. "Condal (000048) stock price, quotes, news, earnings data"), clicking "Company annual report" to download the annual report data for the last five years.
The selected companies are all those in categories 13, 14, and 15 of the 2014 Q4 classification results, more than 100 in total. Collecting everything manually is a fairly large workload, so I'd like to ask whether there is a way to write a Python script to do the work above. (At university I took a computational thinking course taught in Python, so I have a little Python background.)
Thank you ~

Reply content:

Hey, I'm here to answer the question ~
Although the asker's problem has already been solved ...
The problem was taken care of a week after the question was posted, using Excel Power Query + the Yahoo Finance API and so on; the question will be updated once the graduation project is finished this week ... Thank you very much!
There are many ways to solve this problem, so I treated it as practice. Using an existing API is very convenient, but I still tried writing it the clumsy way, following the asker's own idea.
As usual, I wrote it while debugging as I went ~
# I'm a beginner and a bit clumsy; experts, please go easy on me. Fellow beginners, let's exchange ideas.
#start coding
The first step is to collect the stock codes ... Use an online Pdf2doc website to convert the PDF, and then copy and paste the stock codes of the three categories 13, 14, and 15 into a text document.
Then we need to have Python read the contents of the text document into a list, line by line. Very simple.
# Read the stock codes from the text document, one list entry per line
stock = []
with open('Stock_num.txt') as f:
    for line in f:
        # print(line, end='')
        line = line.replace('\n', '')  # drop the trailing newline
        stock.append(line)
print(stock)
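Text pasted from a converted PDF may carry stray spaces or blank lines, so a slightly more defensive version of the reading step can help. This is only a minimal sketch, assuming each stock code is a 6-digit number on its own line; the function name `load_stock_codes` is hypothetical and not part of the original answer.

```python
import re

def load_stock_codes(path):
    """Return the 6-digit stock codes found in a text file, one per line."""
    codes = []
    with open(path) as f:
        for line in f:
            code = line.strip()  # drop the newline and any stray spaces
            if re.fullmatch(r"\d{6}", code):
                codes.append(code)  # keep only well-formed 6-digit codes
    return codes
```

Lines that are blank or do not look like a stock code are skipped silently; whether to skip them or raise an error is a design choice that depends on how clean the pasted text is.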
Use the Selenium module to write a program that simulates the entire process of manually clicking a button.
It feels like a button wizard.
That's about it.
With Scrapy and Chrome or Firefox, this takes minutes.
I recommend using East NET to capture the data, because it can be saved directly as an Excel document, which makes post-processing relatively convenient. The idea is as follows:
1. First get the stock codes and names of the listed companies you need. For this step, see @Chaixiao's answer!
2. Analyze the download link address. Taking Condal as an example, the annual report address is / http oft/gp14.php?code=00004802 (the download link page). Of the 8 digits at the end of the link, the first 6 are the stock code and the last two indicate the exchange: 01 stands for SSE-listed companies (stock codes beginning with 60) and 02 stands for SZSE-listed companies. After that, you can use a loop to download all the data!
3. Convert the downloaded XML file into an XLS file with the following code:
1) Handling possible Chinese encoding errors in the XML
# Assumption: the 6-character entity was lost when this page was rendered; '&nbsp;' is a guess
ENTITY = b'&nbsp;'

def xml_error_c(filename):
    # Fix garbled Chinese text by stripping the entity from the file in place
    with open(filename, 'rb') as fp_xml:
        fp_x = fp_xml.read()
    fp_x = fp_x.replace(ENTITY, b'')
    with open(filename, 'wb') as fp_xml:
        fp_xml.write(fp_x)
Use the Arrow Hand cloud crawler, which runs entirely in the cloud. It is fast to write, and it comes with data export, publishing, and chart generation for data analysis. A sharp weapon for the Big Data era ( ̄▽ ̄)
Use Tushare, / http
Write a crawler with Scrapy; it scrapes resources really fast!
If what you want to download is the "annual report data" rather than the "annual report" itself, use the functions in Wind's Excel plugin and you can pull anything you want ... The asker is studying accounting, so the school certainly has a business school, and a business school must have a Wind terminal ... Go to the college's computer room and it will be done in half an hour ...