I'm an accounting student (second major) and my graduation thesis deadline is imminent. The thesis is an empirical study of food companies' accounting information and their stock prices, so I currently need to collect nearly five years of financial reports for about 100 food companies from Sina Finance. Collecting them by hand means taking each listed company's stock code from the CSRC's 2014 Q4 industry classification results for listed companies,
entering it in the search box on the Sina Finance stock home page,
then, on the selected company's page (e.g. Condal (000048): stock price, quotes, news, earnings data),
clicking "Company Annual Report" to download the annual report data for the last five years or so.
The selected companies are all those in categories 13, 14 and 15 of the 2014 Q4 industry classification results for listed companies.
There are more than 100 of them, so collecting everything by hand is a fair amount of work. Is there a way to write a Python script to do the steps above? (At university I took a computational thinking course taught in Python, so I have a little Python background.)
Thank you ~
Hey, I'm here to answer the question ~
Although the OP has already solved the problem ...
"A week after asking, the question was taken care of with Excel Power Query plus the Yahoo Finance API and so on. I'll finish the graduation thesis this week and come back to update the question ... Thank you very much!"
There are many ways to solve this, so I'm treating it as practice. Using an existing API is very convenient, but I'll still follow the OP's original idea and try writing it the clumsy way.
As usual, I'm writing and debugging as I go ~
# I'm a beginner and not very smart. Experts, please go easy on me; fellow beginners, let's compare notes.
The first step is to collect the stock codes ... Use an online pdf2doc site to convert the classification PDF, then copy and paste the stock codes of the three categories 13, 14 and 15 into a text document, one code per line.
Then we need to have Python read the contents of the text document into a list, line by line. Very simple.
    f = open('stock_num.txt')
    stock = []  # list that will hold one stock code per entry
    for line in f.readlines():
        # print(line, end='')
        line = line.replace('\n', '')  # drop the trailing newline
        stock.append(line)
    f.close()
    print(stock)
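A slightly more defensive variant of the same step, as a sketch only: it uses a `with` block so the file is closed automatically, and it skips blank lines and stray whitespace left over from copy and paste. The helper name `read_stock_codes` is made up for illustration; the file name `stock_num.txt` is the one used above.

```python
def read_stock_codes(path):
    """Return the non-empty lines of a text file as a list of stock codes.

    Assumes one stock code per line, as in the text document described above.
    """
    codes = []
    with open(path, encoding="utf-8") as f:
        for line in f:
            code = line.strip()  # removes the newline and surrounding spaces
            if code:             # ignore blank lines
                codes.append(code)
    return codes

# Usage (assuming stock_num.txt exists in the working directory):
# stock = read_stock_codes("stock_num.txt")
```

The codes are kept as strings on purpose: Shenzhen codes such as 000048 would lose their leading zeros if converted to integers.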
Then use the Selenium module to write a program that simulates the whole process of clicking through the pages by hand.
It feels a lot like a keyboard-and-mouse macro tool.
That's it.
Scrapy plus Chrome or Firefox will get this done in minutes.
I recommend scraping the data from East Money (eastmoney.com), because it can be saved directly as an Excel document, which makes post-processing relatively convenient. The idea is as follows:
1. First get the stock codes and names of the listed companies you need. For this step, see @Chaixiao's answer!
2. Analyze the download link. Taking Condal as an example, the annual report address (the download link on the eastmoney.com page) is http://soft-f9.eastmoney.com/soft/gp14.php?code=00004802. Of the 8 digits at the end of the link, the first 6 are the stock code; the last two are 01 for companies listed on the SSE (stock codes beginning with 60) and 02 for companies listed on the SZSE. After that, you can use a loop to download all the data!
3. Convert the downloaded XML file into an XLS file with the following code:
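1) Handling possible Chinese encoding errors in the XML

    import os

    def xml_error_c(filename):
        # Open in binary mode: Python 3 only allows relative seek() on binary files.
        fp_xml = open(filename, 'rb')
        fp_x = b''  # cleaned content (fixes garbled Chinese)
        for i in range(os.path.getsize(filename)):
            a = fp_xml.read(1)
            if a == b'&':
                pos = fp_xml.tell()  # position just after the '&'
                fp_xml.seek(-1, 1)
                # The 6-character entity compared here did not survive in the post
                # (only ' ' remains); substitute the entity that breaks your files.
                if fp_xml.read(6) == b' ':
                    continue  # drop the whole entity
                fp_xml.seek(pos)  # not the entity: resume right after the '&'
            fp_x += a
        fp_xml.close()
        fp_xml = open(filename, 'wb')
        fp_xml.write(fp_x)
        fp_xml.flush()
        fp_xml.close()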
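Step 2 above (building the link from the stock code, then looping over the list) can be sketched roughly as follows. This is only a sketch under assumptions: the function names `build_report_url` and `download_reports` are invented here, every code that does not start with 60 is treated as SZSE-listed, the downloads are assumed to be the XML files mentioned in step 3, and whether the eastmoney.com endpoint still responds the same way is not verified.

```python
import os
import urllib.request

# Link pattern taken from the Condal example in step 2.
BASE_URL = "http://soft-f9.eastmoney.com/soft/gp14.php?code="

def build_report_url(stock_code):
    """Append the exchange suffix: 01 for SSE (codes starting with 60), else 02."""
    suffix = "01" if stock_code.startswith("60") else "02"  # assumption: non-60 means SZSE
    return BASE_URL + stock_code + suffix

def download_reports(stock_codes, out_dir="."):
    """Download one file per stock code into out_dir (hypothetical helper)."""
    for code in stock_codes:
        target = os.path.join(out_dir, code + ".xml")
        urllib.request.urlretrieve(build_report_url(code), target)

# Usage, with the list of codes read in earlier:
# download_reports(["000048", "600000"], out_dir="reports")
```

Keeping the URL construction in its own function makes it easy to check the suffix rule against a few known companies before starting a loop over 100 or more codes.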
Use the "Arrow Hand" cloud crawler, which runs entirely in the cloud. It is quick to write for, and it comes with data export, publishing and chart generation for data analysis. A sharp weapon for the big data era (￣▽￣)
Use Tushare: http://tushare.waditu.com
Write a crawler with Scrapy and it will pull the resources down in no time!
If what you want to download is the "annual report data" rather than the "annual reports" themselves, use the functions in Wind's Excel plugin to pull it and you can get anything you want ... The OP is studying accounting, so the school certainly has a business school, and a business school is bound to have Wind terminals ... Go to the college computer room and it's done in half an hour ...