A regression analysis of stocks and indices

Source: Internet
Author: User
Tags: statsmodels, stock prices

## 1.1 Data loading

Load the Python libraries required for the analysis.
import statsmodels.api as sm
import statsmodels.formula.api as smf
import statsmodels.graphics.api as smg
import patsy
%matplotlib inline
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from pandas import Series, DataFrame
from scipy import stats
import seaborn as sns
Set the analysis period to January 1, 2015 through December 31, 2015.
import datetime
start = datetime.datetime(2015, 1, 1)
end = datetime.datetime(2015, 12, 31)
Fetch the 2015 price data for the Shanghai Composite Index, stored as datass, and the 2015 price data for the "Robot" company, stored as datajqr.
from pandas.io.data import DataReader
datass = DataReader("000001.SS", "yahoo", start, end)
datajqr = DataReader("300024.SZ", "yahoo", start, end)
D:\software\new Folder (4)\lib\site-packages\pandas\io\data.py:33: FutureWarning: The pandas.io.data module is moved to a separate package (pandas-datareader) and will be removed from pandas in a future version. After installing the pandas-datareader package (https://github.com/pydata/pandas-datareader), you can change the import ``from pandas.io import data, wb`` to ``from pandas_datareader import data, wb``. FutureWarning)
datass.head()
               Open     High      Low    Close  Volume  Adj Close
Date
2015-01-05  3350.52  3350.52  3350.52  3350.52       0    3350.52
2015-01-06  3351.45  3351.45  3351.45  3351.45       0    3351.45
2015-01-07  3373.95  3373.95  3373.95  3373.95       0    3373.95
2015-01-08  3293.46  3293.46  3293.46  3293.46       0    3293.46
2015-01-09  3285.41  3285.41  3285.41  3285.41       0    3285.41
datajqr.head()
             Open   High    Low  Close    Volume  Adj Close
Date
2015-01-01  39.39  39.39  39.39  39.39         0   39.37083
2015-01-02  39.39  39.39  39.39  39.39         0   39.37083
2015-01-05  38.83  39.33  37.30  39.01  20750100   38.99101
2015-01-06  38.76  41.29  38.50  41.22  24357600   41.19994
2015-01-07  41.21  41.60  40.05  40.18  16364700   40.16044
## 1.2 Exploratory analysis of the stock and Shanghai index data
close_ss = datass["Close"]
close_jqr = datajqr["Close"]
Summary statistics for the Shanghai Composite's 2015 daily closing values are shown below. There are 233 observations, with a mean of 3739.79, a minimum of 2927.29, and a maximum of 5166.35.
close_ss.describe()

count     233.000000
mean     3739.794893
std       538.105387
min      2927.290000
25%      3320.680000
50%      3617.060000
75%      4034.310000
max      5166.350000
Name: Close, dtype: float64

Summary statistics for the robot company's 2015 daily closing prices are shown below. There are 261 observations, with a mean of 67.32, a minimum of 39.01, and a maximum of 128.00.

close_jqr.describe()

count    261.000000
mean      67.317433
std       20.643055
min       39.010000
25%       51.800000
50%       68.500000
75%       82.550000
max      128.000000
Name: Close, dtype: float64

The price movements of the Shanghai Composite and the robot company are plotted below. The two series follow a broadly similar trend, and the robot company's share price is the more volatile of the two.

fig, ax = plt.subplots(nrows=1, ncols=2, figsize=(15, 6))
close_ss.plot(ax=ax[0])
ax[0].set_title("SZZZ")
close_jqr.plot(ax=ax[1])
ax[1].set_title("JQR")
<matplotlib.text.Text at 0x76712e47f0>

Using the trading dates as the key, take the intersection of the Shanghai Composite and the robot company's 2015 price data, as shown below.

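# Inner join on the date index; the two "Close" columns become Close_x and Close_y
stock = pd.merge(datass, datajqr, left_index=True, right_index=True)
stock = stock[["Close_x", "Close_y"]]
stock.columns = ["SZZZ", "JQR"]
stock.head()
               SZZZ    JQR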
Date
2015-01-05 3350.52 39.01
2015-01-06 3351.45 41.22
2015-01-07 3373.95 40.18
2015-01-08 3293.46 40.15
2015-01-09 3285.41 39.36
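
The index-based intersection can be illustrated with a minimal sketch. It uses two tiny frames built from closing prices in the `head()` outputs above; the frame names `index_df` and `stock_df` are illustrative and do not appear in the original notebook.

```python
import pandas as pd

# Closing prices copied from the head() outputs above; frame names are illustrative.
index_df = pd.DataFrame(
    {"Close": [3350.52, 3351.45]},
    index=pd.to_datetime(["2015-01-05", "2015-01-06"]),
)
stock_df = pd.DataFrame(
    {"Close": [39.39, 39.39, 39.01, 41.22]},
    index=pd.to_datetime(["2015-01-01", "2015-01-02", "2015-01-05", "2015-01-06"]),
)

# pd.merge defaults to an inner join, so only dates present in both frames survive.
# The overlapping column name "Close" receives the default suffixes "_x" and "_y".
merged = pd.merge(index_df, stock_df, left_index=True, right_index=True)
merged.columns = ["SZZZ", "JQR"]
print(merged)
```

The two January rows that exist only in the stock frame are dropped, which is why the merged table starts on 2015-01-05.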

Compute the daily return series of the Shanghai Composite and the robot company from the closing prices, as shown below.

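# Simple daily return: (P_t - P_{t-1}) / P_{t-1}
daily_return = ((stock - stock.shift(1)) / stock.shift(1)).dropna()
daily_return.head()
                SZZZ       JQR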
Date
2015-01-06 0.000278 0.056652
2015-01-07 0.006714 -0.025230
2015-01-08 -0.023856 -0.000747
2015-01-09 -0.002444 -0.019676
2015-01-12 -0.017072 0.004827
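
As a check on the return calculation, the sketch below recomputes the first two rows from the closing prices listed earlier and compares the explicit formula with pandas' built-in `pct_change`. It assumes simple (not logarithmic) returns, which is what the printed values are consistent with.

```python
import pandas as pd

# First three closing prices from the merged table above.
stock = pd.DataFrame(
    {"SZZZ": [3350.52, 3351.45, 3373.95], "JQR": [39.01, 41.22, 40.18]},
    index=pd.to_datetime(["2015-01-05", "2015-01-06", "2015-01-07"]),
)

# Simple return: r_t = (P_t - P_{t-1}) / P_{t-1}
manual = ((stock - stock.shift(1)) / stock.shift(1)).dropna()

# pandas' built-in equivalent of the same formula
builtin = stock.pct_change().dropna()

print(manual.round(6))
```

Both approaches give the same series; `pct_change` is simply the shorter spelling.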

Summary statistics for the daily return series are shown below. The Shanghai Composite's daily return has a mean of 0.000556, a minimum of -0.0849, and a maximum of 0.0769. The robot company's daily return has a mean of 0.003665 and a minimum of -0.1000 (-10.00%); its maximum of 0.2095 is an outlier.

daily_return.describe()
             SZZZ         JQR
count  232.000000  232.000000
mean     0.000556    0.003665
std      0.025194    0.050061
min     -0.084907   -0.100017
25%     -0.011398   -0.021297
50%      0.002583   -0.000724
75%      0.016720    0.026968
max      0.076940    0.209524

Inspect the outlier.

daily_return[daily_return["JQR"] > 0.105]
               SZZZ       JQR
Date
2015-10-12 0.07694 0.209524

On inspection, the abnormal value is caused by missing price data for two trading days, October 8 and October 9. As a result, the daily return for October 12 is computed against the price on September 30, 2015.
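
A gap of this kind can be found by comparing the observed dates against a reference trading calendar. The sketch below is only an illustration: the calendar, the prices, and the names `trading_days` and `observed` are made up, with dates chosen to mirror the gap described above.

```python
import pandas as pd

# Hypothetical trading calendar and made-up prices, for illustration only.
trading_days = pd.to_datetime(
    ["2015-09-30", "2015-10-08", "2015-10-09", "2015-10-12"]
)
observed = pd.Series(
    [100.0, 121.0],
    index=pd.to_datetime(["2015-09-30", "2015-10-12"]),
)

# Trading days with no observation in the price series.
missing = trading_days.difference(observed.index)

# The "daily" return on 2015-10-12 actually spans three trading sessions.
ret = observed.pct_change().dropna()

print(missing)
print(ret)
```

When `missing` is non-empty, a return computed with `shift(1)` or `pct_change` covers more than one session and should not be read as a single-day move.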

Plot the daily returns of the Shanghai Composite and the robot company.

fig, ax = plt.subplots(nrows=1, ncols=2, figsize=(15, 6))
daily_return["SZZZ"].plot(ax=ax[0])
ax[0].set_title("SZZZ")
daily_return["JQR"].plot(ax=ax[1])
ax[1].set_title("JQR")
<matplotlib.text.Text at 0x7671a40dd8>

Histograms and density plots of the daily returns of the Shanghai Composite and the robot company are drawn below. Overall, both daily return series look approximately normally distributed. By comparison, the robot company's daily returns are centered slightly lower than the Shanghai Composite's (median -0.0007 versus 0.0026), although its mean is higher.

fig, ax = plt.subplots(nrows=1, ncols=2, figsize=(15, 6))
sns.distplot(daily_return["SZZZ"], ax=ax[0])
ax[0].set_title("SZZZ")
sns.distplot(daily_return["JQR"], ax=ax[1])
ax[1].set_title("JQR")
<matplotlib.text.Text at 0x76725906a0>

Draw a scatter plot of the daily returns of the Shanghai Composite and the robot company, as shown below.

fig, ax = plt.subplots(nrows=1, ncols=1, figsize=(12, 6))
plt.scatter(daily_return["JQR"], daily_return["SZZZ"])
plt.title("Scatter Plot of daily return between JQR and SZZZ")
<matplotlib.text.Text at 0x76726657b8>

The scatter plot suggests a positive linear correlation between the daily returns of the Shanghai Composite and the robot company.
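
The visual impression of a linear relationship can be quantified with the Pearson correlation coefficient. The sketch below uses synthetic returns, not the article's data; the slope of 1.2 and the noise levels are arbitrary assumptions. It also shows that in a one-regressor model the squared correlation equals the regression R-squared.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the sketch is reproducible

# Synthetic daily returns, NOT the article's data.
index_ret = rng.normal(0.0, 0.025, size=232)
stock_ret = 1.2 * index_ret + rng.normal(0.0, 0.04, size=232)

# Pearson correlation coefficient between the two series
r = np.corrcoef(index_ret, stock_ret)[0, 1]

# R-squared of the simple regression stock_ret ~ index_ret
slope, intercept = np.polyfit(index_ret, stock_ret, 1)
residuals = stock_ret - (slope * index_ret + intercept)
r_squared = 1.0 - residuals.var() / stock_ret.var()

print(f"r = {r:.3f}, r^2 = {r**2:.3f}, R^2 = {r_squared:.3f}")
```

By the same identity, an R-squared of 0.382 in a one-regressor model corresponds to a correlation of about 0.62.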

## 1.3 Regression analysis of the stock and the Shanghai Composite Index

import statsmodels.api as sm

Add an intercept term.

daily_return["intercept"]=1.0
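
The column of ones is what allows least squares to estimate an intercept. The sketch below shows the effect with NumPy's `lstsq` on synthetic, noise-free data; the numbers are chosen for illustration and are not taken from the article.

```python
import numpy as np

# Synthetic, noise-free data: y = 0.003 + 1.2 * x (values chosen for illustration).
x = np.array([-0.02, -0.01, 0.0, 0.01, 0.03])
y = 0.003 + 1.2 * x

# Design matrix with an explicit column of ones, mirroring the "intercept" column.
X_with = np.column_stack([x, np.ones_like(x)])
beta_with, *_ = np.linalg.lstsq(X_with, y, rcond=None)

# Without the constant column the fitted line is forced through the origin,
# which distorts the slope estimate.
X_without = x.reshape(-1, 1)
beta_without, *_ = np.linalg.lstsq(X_without, y, rcond=None)

print("with intercept:   ", beta_with)
print("without intercept:", beta_without)
```

With the constant column the fit recovers both the slope and the intercept; without it, the slope absorbs part of the intercept.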

Regress the stock's daily return (the dependent variable) on the Shanghai Composite's daily return (the independent variable). The regression results are shown below.

model = sm.OLS(daily_return["JQR"], daily_return[["SZZZ", "intercept"]])
results = model.fit()
results.summary()
OLS Regression Results

Dep. Variable:                 JQR   R-squared:                0.382
Model:                         OLS   Adj. R-squared:           0.379
Method:              Least Squares   F-statistic:              142.0
Date:                   Fri, April   Prob (F-statistic):    8.29e-26
Time:                     22:16:56   Log-Likelihood:          421.79
No. Observations:              232   AIC:                     -839.6
Df Residuals:                        BIC:                     -832.7
Df Model:                        1
Covariance Type:         nonrobust

               coef   std err        t    P>|t|   [95.0% Conf. Int.]
SZZZ         1.2275     0.103   11.915    0.000      1.025     1.431
intercept    0.0030     0.003    1.151    0.251     -0.002     0.008

Omnibus:                     8.703   Durbin-Watson:            1.824
Prob(Omnibus):               0.013   Jarque-Bera (JB):         9.653
Skew:                        0.350   Prob(JB):               0.00801
Kurtosis:                    3.714   Cond. No.                  39.8

The single-variable least squares results show a statistically significant positive relationship between the stock's daily return and the Shanghai Composite's daily return.

The R-squared is 0.382, so the Shanghai Composite's daily return explains about 38% of the variance in the robot company's daily return. The F-statistic is 142.0 with a p-value close to 0 (8.29e-26), so the model as a whole is significant. The p-value of the t-statistic for the SZZZ coefficient is also close to 0, so the index return is a significant explanatory variable. The intercept (p = 0.251) is not significantly different from zero.

The coefficient on the independent variable is 1.2275, which indicates that the robot company's daily return fluctuates more than the Shanghai Composite's: the stock carries more risk, and its potential gains and losses are larger. On average, a 1 percentage point move in the Shanghai Composite's daily return is associated with a 1.2275 percentage point move in the stock's daily return.

The Durbin-Watson statistic is 1.824, close to 2, which suggests little first-order serial correlation in the residuals. The Jarque-Bera p-value is 0.008, which rejects normality of the residuals at the 5% level. Together with a kurtosis of 3.714, this points to heavier tails than a normal distribution.
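
Several of the reported statistics can be cross-checked by hand. The sketch below recomputes them from the rounded values printed in the summary, so small discrepancies from rounding are expected. It relies on two standard facts: in a one-regressor model F equals the square of the slope's t-statistic, and the Jarque-Bera statistic is asymptotically chi-squared with 2 degrees of freedom, whose survival function is exp(-x/2).

```python
import math

# Values copied from the regression summary above.
coef, std_err = 1.2275, 0.103
n_obs, skew, kurtosis = 232, 0.350, 3.714
jb_reported = 9.653

# t-statistic of the slope, and F = t^2 in a one-regressor model.
t_stat = coef / std_err
f_stat = t_stat ** 2

# Jarque-Bera statistic from sample skewness and kurtosis.
jb = n_obs / 6.0 * (skew ** 2 + (kurtosis - 3.0) ** 2 / 4.0)

# p-value of the reported JB statistic under chi-squared with 2 degrees of freedom.
jb_p_value = math.exp(-jb_reported / 2.0)

print(f"t = {t_stat:.2f}, F = {f_stat:.1f}")
print(f"JB = {jb:.2f}, p = {jb_p_value:.5f}")
```

The recomputed values agree with the summary to within rounding, and the small Jarque-Bera p-value is what leads to rejecting normality of the residuals.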
