Part I: Finance and quantitative investment
Stock:
- A stock is a certificate of ownership that a joint-stock company issues to its investors; the holder is a shareholder of the company.
Par value and market value of shares
- Par value: the value stated on the face of the share
- Market value: the price at which the share trades on the market
Listing/IPO:
- A company offers its shares to the public through a stock exchange in order to raise funds
The role of stocks:
- Proof of capital contribution, proof of shareholder status, and a say in how the company is run
- Returns from company dividends and trading profits
Classification of stocks
Stocks classified by performance:
- Blue-chip stocks: stocks of companies with strong capital and a good reputation
- High-performing stocks: stocks of companies with good results
- ST shares: stocks under special treatment, applied after two consecutive years of losses or when net assets per share fall below the par value of the stock
Stocks classified by listing venue:
- A shares: listed in mainland China, subscribed and traded in RMB (T+1, 10% daily price limit)
- B shares: listed in mainland China, subscribed and traded in foreign currency (T+1, T+3)
- H shares: listed in Hong Kong, China (T+0, no daily price limit)
- N shares: listed in New York, USA
- S shares: listed in Singapore
The composition of the stock market
- Listed Companies
- Investors (including institutional investors)
- Securities regulatory commission (CSRC), Securities Industry Association, exchanges
- Securities intermediary agencies
Exchanges
- Shanghai Stock Exchange: main board only (SSE Composite Index)
- Shenzhen Stock Exchange:
  - Main board: large, mature enterprises (SZSE Component Index)
  - SME board: companies with a smaller scale of operation
  - ChiNext (GEM): start-ups that are still in their growth period
Factors that affect stock prices
- Company factors: the intrinsic value of the stock is the most basic determinant of its price. It depends mainly on the issuing company's operating performance, creditworthiness, dividend distribution, development prospects, and expected return.
- Industry factors: changes in the industry's position in the national economy, its prospects and potential, the impact of emerging industries, and the listed company's position within the industry, operating performance, financial structure, and management changes all affect the prices of related stocks.
- Market factors: investor sentiment, the intentions and manipulation of large investors, cooperation or cross-shareholding between companies, margin and futures trading, speculative arbitrage, and company capital increases can all have a significant effect on stock prices.
- Psychological factors: mood swings, errors of judgment, blind herd behavior, panic selling and frenzied buying
- Economic factors: the economic cycle, the state of public finances, the financial environment, the balance of payments, changes in an industry's economic position, exchange-rate adjustments, etc.
- Political factors
Stock trading (A-shares)
- Trading through a broker: individuals cannot buy and sell on the exchange directly; they must open an account with a brokerage and place orders through it
- Trading days: Monday to Friday (excluding statutory holidays and exchange closures)
- Trading hours:
  - 9:15-9:25: opening call auction
  - 9:30-11:30: morning session, continuous auction
  - 13:00-15:00: afternoon session, continuous auction
  - 14:57-15:00: closing call auction on the Shenzhen Stock Exchange (SZSE)
- T+1 trading system: a stock cannot be sold on the day it is bought; it can be sold from the next trading day onward
- Price limits: limit-up and limit-down
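The 10% daily limit mentioned for A shares can be turned into concrete limit-up and limit-down prices. The sketch below is only an illustration: the function name `price_limits` is hypothetical, and rounding half-up to the nearest cent is an assumption rather than something stated in these notes.

```python
from decimal import Decimal, ROUND_HALF_UP

LIMIT = Decimal("0.10")   # 10% daily limit for A shares, as noted above
CENT = Decimal("0.01")

def price_limits(prev_close):
    """Return (limit_down, limit_up) for a previous close given as a string."""
    close = Decimal(prev_close)
    # Assumption: both limits are rounded half-up to the nearest cent.
    up = (close * (1 + LIMIT)).quantize(CENT, rounding=ROUND_HALF_UP)
    down = (close * (1 - LIMIT)).quantize(CENT, rounding=ROUND_HALF_UP)
    return down, up

low, high = price_limits("12.34")
print(low, high)
```

Decimal is used instead of float so that the rounding step is not disturbed by binary floating-point error.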
Financial Analysis
Fundamental analysis
- Macroeconomic analysis: National fiscal policy, monetary policy, etc.
- Industry Analysis
- Company analysis: Financial data, performance reports, etc.
Technical Analysis: Various technical indicators
- Candlestick chart (K-line)
- MA / EMA (moving average, exponential moving average)
- KDJ (stochastic indicator)
- MACD (moving average convergence/divergence)
- ......
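The moving-average indicators in the list above can be computed with pandas. This is a minimal sketch, assuming made-up closing prices and the commonly used 12/26/9 MACD spans; neither comes from the original notes.

```python
import pandas as pd

# Hypothetical closing prices, for illustration only.
close = pd.Series([10.0, 10.2, 10.1, 10.4, 10.8, 10.6, 10.9, 11.2, 11.0, 11.3])

ma5 = close.rolling(window=5).mean()            # simple moving average
ema5 = close.ewm(span=5, adjust=False).mean()   # exponential moving average

# MACD lines with the 12/26/9 spans (an assumed convention).
dif = close.ewm(span=12, adjust=False).mean() - close.ewm(span=26, adjust=False).mean()
dea = dif.ewm(span=9, adjust=False).mean()
hist = dif - dea

print(pd.DataFrame({"close": close, "ma5": ma5, "ema5": ema5, "hist": hist}).round(3))
```

The first four values of `ma5` are NaN because a 5-day window is not yet full.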
Candlestick
Financial Quantitative Investment
- Quantitative investment: using computer technology and mathematical models to put an investment idea into practice and carry out an investment strategy.
- Advantages of quantitative investment:
  - Avoids subjective emotion, human weakness and cognitive bias, so choices are more objective
  - Can combine observations from multiple angles and models at multiple levels
  - Tracks market changes in a timely way, continually discovering new statistical models and trading opportunities
  - Once an investment strategy is decided, its results can be verified by backtesting
Quantitative Strategy
- Quantitative strategy: a fixed set of rules that analyzes, judges and decides, and trades stocks automatically.
- Core components
  - Stock selection
  - Timing
  - Position management
  - Stop-loss and take-profit
- Strategy life cycle
  - Generate ideas / acquire knowledge
  - Implement the strategy: Python
  - Test the strategy: backtesting / paper trading
  - Live trading
  - Optimize the strategy / discard the strategy
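To make the "implement, then backtest" steps concrete, here is a deliberately small sketch. Everything in it is an assumption for illustration: the prices are a seeded random walk rather than market data, the timing rule (5-day average above 20-day average) is arbitrary, and costs, slippage and price limits are ignored.

```python
import numpy as np
import pandas as pd

# Synthetic daily closes from a seeded random walk; not real market data.
rng = np.random.default_rng(0)
close = pd.Series(100 * np.exp(np.cumsum(rng.normal(0, 0.01, 250))))

# Hypothetical timing rule: hold while the 5-day average is above the 20-day average.
fast = close.rolling(5).mean()
slow = close.rolling(20).mean()
signal = (fast > slow).astype(int)

# Yesterday's signal earns today's return, which avoids look-ahead.
position = signal.shift(1).fillna(0)
daily_ret = close.pct_change().fillna(0)
strategy_ret = position * daily_ret

equity = (1 + strategy_ret).cumprod()
print("buy and hold:", round(close.iloc[-1] / close.iloc[0], 4))
print("strategy:    ", round(equity.iloc[-1], 4))
```

Shifting the signal by one bar is the important design choice: without it the backtest would trade on information that was not yet available.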
Part II: Quantitative investment and Python
Quantitative investment and Python
- Why choose Python?
- Other options: Excel, SAS/SPSS, R
- Third-party modules for quantitative investment
  - NumPy: numerical computation
  - pandas: data analysis
  - Matplotlib: plotting charts
- How to do quantitative investment with Python
  - Write it yourself: NumPy + pandas + Matplotlib + ...
  - Online platforms: JoinQuant, Uqer, RiceQuant, Quantopian, ......
  - Open-source frameworks: RQAlpha, QUANTAXIS, ......
IPython: an interactive Python command line
- Installation: pip install ipython
- Tab-key auto-completion
- ? command (introspection, namespace search)
- Run a system command (!)
- %run: execute the code in a file
- %paste, %cpaste: execute code from the clipboard
- Interacting with editors and IDEs
- Magic commands: %timeit, %pdb ...
- Using the command history
- Input and output variables (_, __, _2, _i2)
- Directory bookmark system: %bookmark
- IPython Notebook
IPython common magic commands
Python debugger commands
IPython shortcut keys
NumPy: Array computation
- NumPy is the foundation package for high-performance scientific computing and data analysis, and the basis of many other tools such as pandas.
- Main features of NumPy:
  - ndarray, a multidimensional array structure that is efficient and space-saving
  - Mathematical functions that operate on whole arrays quickly without explicit loops
  - * Tools for reading and writing data on disk and for working with memory-mapped files
  - * Linear algebra, random number generation and Fourier transform functions
  - * Tools for integrating code written in C, C++ and similar languages
- Installation: pip install numpy
- Import: import numpy as np
NumPy: ndarray, the multidimensional array object
- Create an ndarray: np.array()
- Why use ndarray:
  - Example 1: given the market values of several multinational companies in USD, convert them to RMB
  - Example 2: given the price and quantity of each item in a shopping cart, compute the total amount
- An ndarray can be multidimensional, but all elements must have the same type
- Common attributes:
  - T: transpose of the array (for multidimensional arrays)
  - dtype: data type of the array elements
  - size: number of elements in the array
  - ndim: number of dimensions of the array
  - shape: size of each dimension (as a tuple)
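The two motivating examples and the common attributes can be tried directly. The figures and the exchange rate below are made up for illustration.

```python
import numpy as np

# Example 1: hypothetical market values in USD and an assumed exchange rate.
usd = np.array([120.0, 85.5, 42.0])
rate = 6.8                      # assumed USD -> RMB rate, for illustration only
rmb = usd * rate                # one multiplication, no explicit loop

# Example 2: hypothetical cart prices and quantities.
price = np.array([9.9, 25.0, 3.5])
count = np.array([2, 1, 4])
total = (price * count).sum()   # element-wise product, then sum

grid = np.array([[1, 2, 3], [4, 5, 6]])
print(rmb, total)
print(grid.T.shape, grid.dtype, grid.size, grid.ndim, grid.shape)
```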
NumPy: ndarray, the multidimensional array object
- dtype
  - bool_, int(8,16,32,64), uint(8,16,32,64), float(16,32,64)
  - Type conversion: astype()
- Create an ndarray:
  - array(): converts a list to an array, optionally with an explicit dtype
  - arange(): NumPy version of range(), with floating-point support
  - linspace(): similar to arange(), but the third argument is the number of elements
  - zeros(): creates an all-zeros array with the given shape and dtype
  - ones(): creates an all-ones array with the given shape and dtype
  - empty(): creates an uninitialized array (arbitrary values) with the given shape and dtype
  - eye(): creates an identity matrix with the given side length and dtype
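A short sketch of the creation functions and of dtype conversion; the shapes and values are arbitrary choices for illustration.

```python
import numpy as np

a = np.array([1, 2, 3], dtype=np.float32)   # explicit dtype
b = a.astype(np.int64)                      # conversion returns a new array

steps = np.arange(0, 1, 0.25)      # stop value is excluded
points = np.linspace(0, 1, 5)      # 5 evenly spaced values, stop is included

zeros = np.zeros((2, 3), dtype=np.int32)
ones = np.ones(4)
scratch = np.empty((2, 2))         # contents are arbitrary until assigned
identity = np.eye(3)

print(b.dtype, steps, points)
print(zeros.shape, ones.sum(), identity.trace())
```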
NumPy: Indexing and slicing
- Operations between arrays and scalars
- Operations between arrays of the same size
- Array indexing
- Array slicing
  - a[5:8]
  - a[:3] = 1
  - a2[1:2, :4]
  - a2[:, :1]
  - a2[:, 1]
- Unlike lists, array slices are not copied automatically; modifying a slice changes the original array.
  - b = a[:4]
  - b[-1] = 250
- Workaround: copy()
  - b = a[:4].copy()
  - b[-1] = 250
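The difference between a slice (a view) and an explicit copy is easy to check. The array contents below are arbitrary.

```python
import numpy as np

a = np.arange(10)
a2 = np.arange(20).reshape(4, 5)

print(a[5:8])                            # [5 6 7]
print(a2[1:2, :4])                       # row 1, first four columns, still 2-D
print(a2[:, :1].shape, a2[:, 1].shape)   # (4, 1) versus (4,)

view = a[:4]
view[-1] = 250       # writes through to a[3]

safe = a[:4].copy()
safe[0] = -1         # a[0] is unchanged

print(a[:4])
```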
NumPy: Boolean indexing
- Question 1: given an array, select all the numbers greater than 5.
  - Answer: a[a>5]
  - Principle: a>5 tests every element of a and returns a Boolean array. Boolean indexing: when a Boolean array of the same size is passed as the index, the result is an array of the elements at the positions where the Boolean array is True.
- Question 2: given an array, select all the even numbers greater than 5.
- Question 3: given an array, select all the numbers that are greater than 5 or even.
  - Answers: a[(a>5) & (a%2==0)] for question 2, a[(a>5) | (a%2==0)] for question 3
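The three questions can be checked on a small array; the values are arbitrary. The parentheses around each comparison matter, because `&` and `|` bind more tightly than `>` and `==`.

```python
import numpy as np

a = np.array([3, 8, 6, 1, 9, 4, 7, 10])

mask = a > 5                                  # Boolean array, same shape as a
greater = a[mask]
even_and_greater = a[(a > 5) & (a % 2 == 0)]
even_or_greater = a[(a > 5) | (a % 2 == 0)]

print(mask)
print(greater, even_and_greater, even_or_greater)
```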
NumPy: Fancy indexing *
- Question 1: given an array, select its elements 1, 3, 4, 6 and 7 to form a new array.
- Question 2: given a two-dimensional array, select its first and third columns to form a new two-dimensional array.
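One possible answer to both questions, assuming the element numbers in question 1 are zero-based positions (the notes do not say which convention is meant).

```python
import numpy as np

a = np.arange(10, 20)
picked = a[[1, 3, 4, 6, 7]]        # index with a list of positions

a2 = np.arange(12).reshape(3, 4)
cols = a2[:, [0, 2]]               # first and third columns (positions 0 and 2)

print(picked)
print(cols)
```

Unlike a slice, fancy indexing returns a copy, so modifying `picked` or `cols` leaves the source arrays untouched.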
NumPy: Universal functions
- Universal function (ufunc): a function that operates on every element of an array at once
- Common universal functions:
  - Unary functions: abs, sqrt, exp, log, ceil, floor, rint, trunc, modf, isnan, isinf, cos, sin, tan
  - Binary functions: add, subtract, multiply, divide, power, mod, maximum, minimum
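A few of the listed functions applied to arbitrary sample arrays:

```python
import numpy as np

x = np.array([-1.5, 0.0, 2.25, 4.0])
y = np.array([1.0, 2.0, 0.5, 4.0])

print(np.abs(x), np.sqrt(np.abs(x)))
frac, whole = np.modf(x)             # fractional and integral parts
print(frac, whole)
print(np.isnan(np.array([1.0, np.nan])))

total = np.add(x, y)                 # same result as x + y
larger = np.maximum(x, y)            # element-wise maximum of two arrays
print(total, larger, np.power(y, 2))
```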
NumPy: Mathematical and statistical methods
- Common functions:
  - sum: sum
  - mean: mean
  - std: standard deviation
  - var: variance
  - min: minimum value
  - max: maximum value
  - argmin: index of the minimum value
  - argmax: index of the maximum value
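These methods work on the whole array by default and accept an `axis` argument for per-row or per-column results. The numbers below are made-up daily returns.

```python
import numpy as np

r = np.array([[0.01, -0.02, 0.03],
              [0.00,  0.04, -0.01]])   # hypothetical daily returns

print(r.sum(), r.mean(), r.std(), r.var())
print(r.min(), r.max(), r.argmin(), r.argmax())   # arg* index the flattened array
print(r.mean(axis=0), r.max(axis=1))              # per column, per row
```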
NumPy: Random number generation
- Common functions (in np.random)
  - rand: random array of a given shape (values between 0 and 1)
  - randint: random integers of a given shape
  - choice: random selection of a given shape
  - shuffle: same as random.shuffle
  - uniform: random array of a given shape drawn from a uniform distribution
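A sketch of the listed functions. The seed is set only so that repeated runs give the same numbers; the ranges and shapes are arbitrary.

```python
import numpy as np

np.random.seed(42)                        # seeded only to make runs repeatable

u = np.random.rand(2, 3)                  # floats in [0, 1)
k = np.random.randint(0, 10, size=5)      # integers in [0, 10)
c = np.random.choice([1, 2, 3, 4], size=(2, 2))
f = np.random.uniform(-1.0, 1.0, size=4)  # floats in [-1, 1)

deck = np.arange(5)
np.random.shuffle(deck)                   # shuffles in place, returns None

print(u.shape, k, c.shape, f, deck)
```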
Pandas: Data analysis
- pandas is a powerful toolkit for data analysis in Python.
- pandas is built on top of NumPy.
- Main features of pandas
  - Data structures with their own functionality: DataFrame and Series
  - Integrated time series functionality
  - A rich set of mathematical operations
  - Flexible handling of missing data
- Installation: pip install pandas
- Import: import pandas as pd
Pandas: Series
- A Series is a one-dimensional array-like object, consisting of a set of data and an associated set of data labels (the index).
- A Series behaves like a combination of a list (array) and a dictionary
- How to create:
  - pd.Series([4, 7, -5, 3])
  - pd.Series([4, 7, -5, 3], index=['a', 'b', 'c', 'd'])
  - pd.Series({'a': 1, 'b': 2})
  - pd.Series(0, index=['a', 'b', 'c', 'd'])
- Get the array of values and the array of index labels:
  - values attribute
  - index attribute
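The four construction forms side by side, followed by the `values` and `index` attributes:

```python
import pandas as pd

s1 = pd.Series([4, 7, -5, 3])                              # default index 0..3
s2 = pd.Series([4, 7, -5, 3], index=["a", "b", "c", "d"])
s3 = pd.Series({"a": 1, "b": 2})                           # dict keys become labels
s4 = pd.Series(0, index=["a", "b", "c", "d"])              # scalar is repeated per label

print(s2.values, list(s2.index))
print(s2["c"], s3["b"], s4.sum())
```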
Pandas: Series features
- Series supports the features of NumPy arrays (positional subscripts):
  - Create a Series from an ndarray: Series(arr)
  - Operations with a scalar: sr*2
  - Operations between two Series: sr1+sr2
  - Indexing: sr[0], sr[[1,2,4]]
  - Slicing: sr[0:2] (a slice is still a view)
  - Universal functions: np.abs(sr)
  - Boolean filtering: sr[sr>0]
  - Statistical functions: mean(), sum(), cumsum()
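The array-like behaviours above, shown on a small labelled Series with arbitrary values. As a deliberate choice, the sketch uses `iloc` for positional access, because how plain `sr[0]` is interpreted on a labelled Series depends on the pandas version.

```python
import numpy as np
import pandas as pd

sr = pd.Series(np.array([3.0, -1.0, 4.0, -1.5, 5.0]), index=list("abcde"))

print((sr * 2).tolist())
print((sr + sr).tolist())
print(sr.iloc[0], sr.iloc[[1, 2, 4]].tolist())   # explicit positional access
print(sr.iloc[0:2].tolist())
print(np.abs(sr).tolist())
print(sr[sr > 0].tolist())
print(sr.mean(), sr.sum(), sr.cumsum().tolist())
```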
Pandas: Integer index
- pandas objects with an integer index often trip up beginners.
- Example:
  - sr = pd.Series(np.arange(4.))
  - sr[-1]
- If the index is of integer type, selecting data by integer is always label-based.
- The loc attribute interprets the key as a label
- The iloc attribute interprets the key as a position
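The example above can be run to see the difference between label-based and position-based access. The second Series, with labels 10 to 13, is added here only for illustration.

```python
import numpy as np
import pandas as pd

sr = pd.Series(np.arange(4.))        # integer index 0, 1, 2, 3

try:
    sr[-1]                           # looked up as the label -1, which does not exist
except KeyError:
    print("sr[-1] raised KeyError")

print(sr.iloc[-1])                   # last element by position
print(sr.loc[3])                     # element whose label is 3

shifted = pd.Series(np.arange(4.), index=[10, 11, 12, 13])
print(shifted.loc[10], shifted.iloc[0])
```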
Pandas: Series data alignment
- When pandas performs an operation, it first aligns the operands by index and then computes. If the indexes differ, the index of the result is the union of the two operands' indexes.
- Example:
  - sr1 = pd.Series([12, 23, 34], index=['c', 'a', 'd'])
  - sr2 = pd.Series([11, 20, 10], index=['d', 'c', 'a'])
  - sr1 + sr2
  - sr3 = pd.Series([11, 20, 10, 14], index=['d', 'c', 'a', 'b'])
  - sr1 + sr3
- How can missing values be treated as 0 when adding two Series objects?
  - sr1.add(sr2, fill_value=0)
- Flexible arithmetic methods: add, sub, div, mul
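Running the example shows the alignment. Here `fill_value=0` is applied to `sr1` and `sr3`, the pair whose labels actually differ.

```python
import pandas as pd

sr1 = pd.Series([12, 23, 34], index=["c", "a", "d"])
sr2 = pd.Series([11, 20, 10], index=["d", "c", "a"])
sr3 = pd.Series([11, 20, 10, 14], index=["d", "c", "a", "b"])

same_labels = sr1 + sr2          # aligned by label, not by position
extra_label = sr1 + sr3          # 'b' exists only in sr3, so the sum is NaN there
filled = sr1.add(sr3, fill_value=0)

print(same_labels)
print(extra_label)
print(filled)
```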
Pandas: Series missing data
- Missing data: NaN (Not a Number) represents missing data. Its value equals np.nan. The built-in None is also treated as NaN.
- Methods for handling missing data:
  - dropna(): filters out entries whose value is NaN
  - fillna(): fills in missing data
  - isnull(): returns a Boolean array that is True where the value is missing
  - notnull(): returns a Boolean array that is False where the value is missing
- Filtering out missing data:
  - sr.dropna()
  - sr[sr.notnull()]
- Filling in missing data: fillna(0)
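The four methods on a small Series with two missing entries (values chosen arbitrarily):

```python
import numpy as np
import pandas as pd

sr = pd.Series([1.0, np.nan, 3.0, None], index=list("abcd"))   # None becomes NaN

print(sr.isnull().tolist())
print(sr.notnull().tolist())
print(sr.dropna().tolist())
print(sr[sr.notnull()].tolist())     # same result as dropna()
print(sr.fillna(0).tolist())
```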
Pandas: DataFrame
- A DataFrame is a tabular data structure that contains an ordered set of columns.
- A DataFrame can be viewed as a dictionary of Series that share one index.
- How to create:
  - pd.DataFrame({'one': [1, 2, 3, 4], 'two': [4, 3, 2, 1]})
  - pd.DataFrame({'one': pd.Series([1, 2, 3], index=['a', 'b', 'c']), 'two': pd.Series([1, 2, 3, 4], index=['b', 'a', 'c', 'd'])})
  - ......
- Reading and writing CSV files:
  - pd.read_csv('filename.csv')
  - df.to_csv()
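Both construction forms, plus a CSV round trip. An in-memory text buffer stands in for a file on disk, which is a substitution made here for convenience.

```python
import io
import pandas as pd

df1 = pd.DataFrame({"one": [1, 2, 3, 4], "two": [4, 3, 2, 1]})
df2 = pd.DataFrame({
    "one": pd.Series([1, 2, 3], index=["a", "b", "c"]),
    "two": pd.Series([1, 2, 3, 4], index=["b", "a", "c", "d"]),
})   # rows are aligned by label; 'one' has no value for 'd'

# Round trip through CSV text instead of a real file.
csv_text = df1.to_csv(index=False)
restored = pd.read_csv(io.StringIO(csv_text))

print(df2)
print(restored.equals(df1))
```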
Pandas: DataFrame viewing data
- Common attributes and methods for viewing data:
  - index: get the row index
  - T: transpose
  - columns: get the column index
  - values: get the array of values
  - describe(): get quick summary statistics
- name attribute of each DataFrame column: the column name
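The viewing attributes on a tiny table of made-up prices:

```python
import pandas as pd

df = pd.DataFrame({"open": [10.0, 10.5, 10.2], "close": [10.4, 10.1, 10.6]},
                  index=["d1", "d2", "d3"])   # hypothetical prices

print(list(df.index), list(df.columns))
print(df.values.shape, df.T.shape)
print(df["close"].name)          # a column is a Series named after the column
print(df.describe())
```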
Pandas: DataFrame indexing and slicing
- A DataFrame has a row index and a column index.
- Select by label:
  - df['A']
  - df[['A', 'B']]
  - df['A'][0]
  - df[0:10][['A', 'C']]
  - df.loc[:, ['A', 'B']]
  - df.loc[:, 'A':'C']
  - df.loc[0, 'A']
  - df.loc[0:10, ['A', 'C']]
- Select by position:
  - df.iloc[3]
  - df.iloc[3, 3]
  - df.iloc[0:3, 4:6]
  - df.iloc[1:5, :]
  - df.iloc[[1, 2, 4], [0, 3]]
- Filter by Boolean:
  - df[df['A'] > 0]
  - df[df['A'].isin([1, 3, 5])]
  - df[df < 0] = 0
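A runnable sketch of label, position and Boolean selection on an arbitrary 6x4 table. Note that label slices with `loc` include the end label, while position slices with `iloc` exclude the end position.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(-6, 18).reshape(6, 4), columns=list("ABCD"))

print(df["A"].tolist())
by_label = df.loc[0:2, ["A", "C"]]     # row label 2 is included
by_position = df.iloc[0:2, 1:3]        # row position 2 is excluded
print(by_label.shape, by_position.shape)
print(df.loc[0, "A"], df.iloc[3, 3])

positive = df[df["A"] > 0]
picked = df[df["A"].isin([-2, 2, 6])]

clipped = df.copy()
clipped[clipped < 0] = 0               # Boolean mask assignment, done on a copy
print(positive.shape, picked.index.tolist(), clipped.min().min())
```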
Pandas: DataFrame data alignment and missing data
- When DataFrame objects are combined in an operation, the data is aligned on both axes: the row index and column index of the result are the unions of the row and column indexes of the two operands.
- DataFrame methods for handling missing data:
  - dropna(axis=0, how='any', ...)
  - fillna()
  - isnull()
  - notnull()
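Alignment on both axes and the `dropna` options, sketched on two small made-up frames:

```python
import numpy as np
import pandas as pd

left = pd.DataFrame({"A": [1, 2], "B": [3, 4]}, index=["x", "y"])
right = pd.DataFrame({"B": [10, 20], "C": [30, 40]}, index=["y", "z"])

total = left + right               # only ('y', 'B') exists in both operands
print(total)

df = pd.DataFrame({"A": [1.0, np.nan, 3.0], "B": [np.nan, np.nan, 6.0]})
print(df.dropna())                 # axis=0, how='any': drop rows with any NaN
print(df.dropna(how="all"))        # drop rows only when every value is NaN
print(df.dropna(axis=1))           # drop columns that contain NaN
print(df.fillna(0))
print(df.isnull().sum().tolist())  # missing values per column
```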
Pandas: Other common methods
- Common pandas methods (for Series and DataFrame):
  - mean(axis=0, skipna=False)
  - sum(axis=1)
  - sort_index(axis, ..., ascending): sort by row or column index
  - sort_values(by, axis, ascending): sort by value
- NumPy universal functions also work on pandas objects
- apply(func, axis=0): applies a custom function to each row or column; func returns a scalar or a Series
- applymap(func): applies a function to each element of a DataFrame
- map(func): applies a function to each element of a Series
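A sketch of the aggregation, sorting, `apply` and `map` calls on an arbitrary frame. `applymap` is left out of the code on purpose, since its availability varies across pandas versions.

```python
import pandas as pd

df = pd.DataFrame({"b": [3.0, 1.0, 2.0], "a": [9.0, 7.0, 8.0]}, index=[2, 0, 1])

print(df.mean(axis=0).tolist())                 # one mean per column
print(df.sum(axis=1).tolist())                  # one sum per row
print(df.sort_index().index.tolist())           # rows ordered by label
print(df.sort_index(axis=1).columns.tolist())   # columns ordered by name
print(df.sort_values(by="b", ascending=False)["b"].tolist())

spread = df.apply(lambda col: col.max() - col.min(), axis=0)   # func returns a scalar
doubled = df["a"].map(lambda v: v * 2)                         # element-wise on a Series
print(spread.tolist(), doubled.tolist())
```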
*Pandas: Hierarchical index
- Hierarchical indexing is an important pandas feature that allows multiple index levels on a single axis.
- Example: data = pd.Series(np.random.rand(9), index=[['a', 'a', 'a', 'b', 'b', 'b', 'c', 'c', 'c'], [1, 2, 3, 1, 2, 3, 1, 2, 3]])
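The same two-level index as in the example, with fixed values in place of random ones so that the selections are easy to follow:

```python
import numpy as np
import pandas as pd

data = pd.Series(
    np.arange(9.0),   # fixed values instead of random ones
    index=[["a", "a", "a", "b", "b", "b", "c", "c", "c"],
           [1, 2, 3, 1, 2, 3, 1, 2, 3]],
)

print(data.index.nlevels)
print(data["b"].tolist())        # select by the outer level
print(data.loc["b", 2])          # select one element by both levels
print(data.unstack())            # inner level becomes the columns
```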
Pandas: Handling time objects
- Time series types:
  - Timestamp: a specific moment
  - Fixed period: e.g. July 2017
  - Time interval: start time to end time
- Python standard library: datetime
  - date, time, datetime, timedelta
  - dt.strftime()
  - strptime()
- Third-party package: dateutil
- Handling dates in bulk: pandas
  - pd.to_datetime(['2001-01-01', '2002-02-02'])
- Generating an array of time objects: date_range
  - start: start time
  - end: end time
  - periods: number of periods
  - freq: time frequency, defaults to 'D'; options include H(our), W(eek), B(usiness), S(emi-)M(onth), (min)T(es), S(econd), A(year), ...
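A sketch that moves from the standard library to pandas. The dates are arbitrary, and only the 'D', 'B' and 'W' frequencies are used.

```python
from datetime import datetime, timedelta
import pandas as pd

d = datetime.strptime("2017-07-03", "%Y-%m-%d")       # string -> datetime
print((d + timedelta(days=1)).strftime("%Y/%m/%d"))   # datetime -> string

stamps = pd.to_datetime(["2001-01-01", "2002-02-02"])  # a DatetimeIndex
print(stamps.year.tolist())

days = pd.date_range(start="2017-07-01", periods=5)              # freq defaults to 'D'
weekdays = pd.date_range(start="2017-07-01", end="2017-07-10", freq="B")
weeks = pd.date_range(start="2017-07-01", periods=3, freq="W")   # weekly, anchored on Sundays
print(days[-1].date(), len(weekdays), weeks[0].date())
```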
Pandas: Time series
- A time series is a Series or DataFrame indexed by time objects.
- datetime objects used as an index are stored in a DatetimeIndex object.
- Special features of time series:
  - Slice by passing a year or a year and month
  - Slice by passing a date range
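Both slicing styles on a ten-day series with arbitrary values. `loc` is used throughout, which is a choice made here for clarity.

```python
import numpy as np
import pandas as pd

idx = pd.date_range("2017-06-28", periods=10)   # 2017-06-28 .. 2017-07-07
ts = pd.Series(np.arange(10), index=idx)

print(type(ts.index).__name__)
print(len(ts.loc["2017"]))                          # every row in the year
print(ts.loc["2017-07"].tolist())                   # every row in July 2017
print(ts.loc["2017-06-30":"2017-07-02"].tolist())   # both endpoints are included
```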
Pandas: Reading from files
- Reading files: load data from a file name, a URL, or a file object
  - read_csv: default delimiter is a comma
  - read_table: default delimiter is \t
  - read_excel: reads Excel files
- Main parameters of the file-reading functions:
  - sep: specifies the delimiter; regular expressions such as '\s+' are allowed
  - header=None: specifies that the file has no column-name row
  - names: specifies the column names
  - index_col: specifies a column to use as the index
  - skiprows: specifies rows to skip
  - na_values: specifies strings that represent missing values
  - parse_dates: specifies whether columns are parsed as dates; a Boolean or a list
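The parameters above can be combined in one call. In this sketch the file contents, the column names and the "--" missing-value marker are all made up, and an in-memory buffer replaces a real file.

```python
import io
import pandas as pd

# Hypothetical file contents: one comment line, no header row, '--' marks a missing value.
raw = """# exported quotes
2017-07-03 10.5 1200
2017-07-04 -- 900
2017-07-05 10.9 1500
"""

df = pd.read_csv(
    io.StringIO(raw),          # a file name or URL would go here instead
    sep=r"\s+",                # regular-expression delimiter
    header=None,               # the data has no column-name row
    names=["date", "close", "volume"],
    index_col="date",
    skiprows=1,                # skip the comment line
    na_values=["--"],
    parse_dates=True,          # parse the index as dates
)
print(df)
print(df.index.dtype, df["close"].isnull().sum())
```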
Pandas: Writing to files
- Writing to a file: to_csv
- Main parameters of the file-writing function:
  - sep
  - na_rep: specifies the string written for missing values, defaults to an empty string
  - header=False: do not write the column-name row
  - index=False: do not write the row-index column
  - columns: specifies the columns to write, passed as a list
- Other file types: JSON, XML, HTML, database
- Saving pandas objects in a binary file format (pickle):
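A sketch of the writing parameters and of a pickle round trip. The data, the "|" separator and the file name `quotes.pkl` are illustrative choices; `to_pickle` and `read_pickle` are the pandas calls assumed for the pickle step, which the notes do not spell out.

```python
import os
import tempfile
import numpy as np
import pandas as pd

df = pd.DataFrame({"close": [10.5, np.nan], "volume": [1200, 900]}, index=["d1", "d2"])

# With no path given, to_csv returns the CSV text as a string.
text = df.to_csv(sep="|", na_rep="NA", header=False, index=False,
                 columns=["volume", "close"])
print(text)

# Pickle round trip in a temporary directory (hypothetical file name).
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "quotes.pkl")
    df.to_pickle(path)
    restored = pd.read_pickle(path)

print(restored.equals(df))
```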
Matplotlib: Plotting and visualization
- Matplotlib is a powerful toolkit for plotting and data visualization in Python.
- Installation: pip install matplotlib
- Import: import matplotlib.pyplot as plt
- Plotting function: plt.plot()
- Show the figure: plt.show()
Matplotlib: The plot function
- plot function:
  - Line style: linestyle (-, -., --, ..)
  - Marker: marker (v, ^, s, *, h, +, x, d, o, ...)
  - Color: color (b, g, r, y, k, w, ...)
- Plotting multiple curves with the plot function
- Title: title
- X-axis label: xlabel
- Y-axis label: ylabel
- Other types of charts:
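Two curves on one figure, using the style, title and axis-label options above. The sketch assumes no display is available, so it selects the non-interactive Agg backend and saves to an in-memory buffer; in an interactive session, drop the `matplotlib.use` line and call `plt.show()` instead.

```python
import io
import matplotlib
matplotlib.use("Agg")            # assumption: headless environment
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 2 * np.pi, 50)

fig = plt.figure()
plt.plot(x, np.sin(x), linestyle="--", marker="o", color="r", label="sin")
plt.plot(x, np.cos(x), linestyle="-", marker="^", color="b", label="cos")   # second curve
plt.title("Two curves")
plt.xlabel("x")
plt.ylabel("y")
plt.legend()

buf = io.BytesIO()
fig.savefig(buf, format="png")   # render to memory instead of a file
ax = plt.gca()
print(len(ax.lines), ax.get_title())
```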
*Matplotlib: Canvas and subplots
- Canvas: figure
- Subplot: subplot
  - ax1 = fig.add_subplot(2, 2, 1)
- Adjust the spacing between subplots:
  - subplots_adjust(left, bottom, right, top, wspace, hspace)
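A sketch of one canvas with three subplots placed on a 2x2 grid. The plotted data and spacing values are arbitrary, and the Agg backend is selected on the assumption that no display is available.

```python
import matplotlib
matplotlib.use("Agg")            # assumption: headless environment
import matplotlib.pyplot as plt

fig = plt.figure()
ax1 = fig.add_subplot(2, 2, 1)   # 2 rows, 2 columns, first cell
ax2 = fig.add_subplot(2, 2, 2)
ax4 = fig.add_subplot(2, 2, 4)   # third cell is left empty

ax1.plot([1, 2, 3], [1, 4, 9])
ax2.plot([1, 2, 3], [3, 2, 1])
ax4.plot([1, 2, 3], [2, 2, 2])

fig.subplots_adjust(wspace=0.4, hspace=0.4)   # widen the gaps between subplots
print(len(fig.axes))
```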
Day32 Python and financial Quantitative Analysis (II.)