How pooling works, with a Python implementation
This article first describes the operations involved in pooling, then analyzes some of the principles behind it, and finally provides a Python implementation of pooling.
I. Operations related to pooling
First, an intuitive overview of pooling (covering its input, output, and purpose, but leaving out implementation details): the input of pooling is a matrix, and the output is also a matrix. Pooling operates on local regions of the input matrix so that the output for each region best represents the characteristics of that region. As shown in Figure 1, the yellow matrix on the left is the input matrix and the blue matrix on the right is the output matrix. The moving orange matrix marks the currently selected local region of the input. For each region, pooling finds the best representative, and all the representatives are then arranged in the output matrix according to the spatial positions of their regions in the original input.
This process is analogous to an election. Suppose you want to select the mayor of Beijing. One feasible way is for each district of Beijing to choose the representative who best reflects the interests of that district, and for the elected representatives then to decide how the mayor is selected. Naturally, we hope each district's representative reflects that district's interests as well as possible. The analogy with pooling is: Beijing <-> input matrix; Chaoyang District, Haidian District, and the other districts <-> local regions; district representatives <-> output matrix (if the representatives are seated at the meeting according to the positions of their districts, this matches the way pooling preserves spatial layout).
Figure 1. Features of pooling
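The input/output relationship described above can be sketched on a tiny example. This is only an illustration: the 4x4 values, the 2x2 region size, and the use of non-overlapping regions are assumptions chosen for readability, and the maximum is used as the "representative".

```python
import numpy as np

# A 4x4 input matrix (values chosen only for illustration).
x = np.array([[1, 3, 2, 0],
              [4, 2, 1, 1],
              [0, 1, 5, 6],
              [2, 2, 7, 8]])

# Split the input into non-overlapping 2x2 regions.
# Axes are (block_row, row_in_block, block_col, col_in_block).
blocks = x.reshape(2, 2, 2, 2)

# Pick one representative per region (here: the maximum) and keep the
# representatives in the same spatial arrangement as their regions.
pooled = blocks.max(axis=(1, 3))
print(pooled)
# [[4 2]
#  [2 8]]
```

Each entry of the 2x2 output stands for the 2x2 region at the same relative position in the input.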
II. Principles behind pooling
When choosing a representative for a local region, we usually either pick the most prestigious person in the region (corresponding to max pooling) or pick the person who best reflects the general characteristics of everyone in the region (corresponding to mean pooling). Pooling has the same two common practices: either the largest value in the local region becomes its representative, or the average of all values in the region becomes its representative.
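As a minimal sketch of the two choices, the snippet below computes both representatives for a single local region. The 2x2 values are made up for illustration, with one value much larger than the rest.

```python
import numpy as np

# One local region with a single dominant value (illustrative numbers).
region = np.array([[1.0, 2.0],
                   [3.0, 10.0]])

max_rep = region.max()    # max pooling: the largest value wins
mean_rep = region.mean()  # mean pooling: the average of all values

print(max_rep)   # 10.0
print(mean_rep)  # 4.0
```

The maximum reflects only the dominant value, while the mean is influenced by every value in the region.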
Choosing the most prestigious person and choosing the person who best reflects everyone in the region each have advantages and drawbacks:
1) The most prestigious person in a region is unlikely to go wrong when choosing a mayor, but he may rely on his seniority and fail to represent the views of ordinary people in the region (the largest value in a local region tends to ignore the region's general features).
2) The person who best reflects the general characteristics of everyone in the region can represent the overall interests of its residents, but his judgment is limited (the local mean is small, which corresponds to limited judgment), so he is prone to deviations when choosing a mayor.
3) If people in the region have some freedom to move around (corresponding to translation and rotation invariance), neither method is affected much.
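Point 3 can be checked numerically on a single region. The sketch below assumes a 3x3 region containing one strong activation (values are illustrative) and shows that moving or rotating that activation inside the region leaves both representatives unchanged. The invariance is only local: it holds as long as the activation stays inside the same pooling region.

```python
import numpy as np

# A 3x3 pooling region containing one strong activation.
region = np.zeros((3, 3))
region[0, 1] = 5.0

# Move the activation one step down and one step right (it stays inside
# the region), and separately rotate the region by 90 degrees.
translated = np.roll(region, shift=(1, 1), axis=(0, 1))
rotated = np.rot90(region)

for name, r in [("original", region), ("translated", translated), ("rotated", rotated)]:
    print(name, r.max(), r.mean())
```

All three lines print the same maximum and the same mean.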
Formal explanation of pooling
According to the relevant theory, there are two kinds of error: (1) increased variance of the estimate, caused by the restricted neighborhood size; (2) a shift in the estimated mean, caused by error. In general, mean pooling reduces the first kind of error and retains more background information, while max pooling reduces the second kind and retains more texture information.
In general, the input of pooling is high-dimensional and the output is lower-dimensional, so pooling can be understood to some extent as dimensionality reduction. From the explanation above, we can infer that this dimensionality reduction largely preserves the most important information in the input. When applying pooling in practice, we need to analyze the characteristics of the actual problem. Once you understand the operation and principles of pooling, combining it well with a specific problem can itself be a worthwhile innovation.
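The first kind of error can be illustrated with a small simulation. This is a sketch of point (1) only, under assumptions that are not from the article: a constant true value of 1.0, independent Gaussian noise with unit variance, and neighborhoods of 2x2 versus 4x4. For independent noise, the variance of a mean over n values is the noise variance divided by n, so the smaller neighborhood should give the noisier estimate.

```python
import numpy as np

rng = np.random.default_rng(0)

# 10,000 noisy 4x4 patches around an assumed true value of 1.0.
patches = 1.0 + rng.normal(0.0, 1.0, size=(10_000, 4, 4))

# Mean-pooled estimate from a restricted 2x2 neighborhood of each patch ...
mean_small = patches[:, :2, :2].mean(axis=(1, 2))
# ... and from the full 4x4 neighborhood.
mean_large = patches.mean(axis=(1, 2))

var_small = mean_small.var()  # expected to be near 1/4
var_large = mean_large.var()  # expected to be near 1/16

print(var_small, var_large)
```

The estimate from the smaller neighborhood has the larger variance, which is the effect described in (1).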
III. Python implementation of pooling
Here are some of my thoughts on writing the code. The key is to break a complicated problem into pieces that can be implemented directly in code:
1) The input matrix can be m x n or m x n x p. Trying to handle both forms at once makes it hard to get started (there are many cases to consider, and multi-dimensional arrays are easy to get confused by). On closer analysis, once pooling for an m x n matrix is implemented, the m x n x p case follows easily by applying the m x n version.
2) For an m x n input, the orange box in Figure 1 may not tile the input matrix exactly, so the input matrix needs to be padded. Padding is simple: as long as the poolSize window at the last poolStride position is fully covered by the padded matrix, all earlier windows are covered as well.
3) Finally, a for loop performs the same operation on each window.
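The arithmetic behind point 2 can be sketched for a single axis. The helper name `output_size_and_padding` is hypothetical, and it computes the minimal padding needed for the last window to fit, which may be less than what a particular implementation chooses to pad.

```python
import math

def output_size_and_padding(n, pool_size, stride):
    """Hypothetical helper: output length and minimal padding for one axis."""
    # One output per stride step, rounding up so leftover elements
    # at the end still get a window.
    out = math.ceil(n / stride)
    # The last window starts at (out - 1) * stride and spans pool_size elements.
    needed = (out - 1) * stride + pool_size
    # Pad only if the last window reaches past the end of the input.
    pad = max(0, needed - n)
    return out, pad

print(output_size_and_padding(3, 2, 2))  # (2, 1)
print(output_size_and_padding(4, 2, 2))  # (2, 0)
print(output_size_and_padding(5, 3, 2))  # (3, 2)
```

If the last window is covered, every earlier window is covered too, because earlier windows end at smaller indices.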
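import numpy as np


def pooling(inputMap, poolSize=3, poolStride=2, mode='max'):
    """INPUTS:
          inputMap - input array of the pooling layer
          poolSize - X-size (equivalent to Y-size) of receptive field
          poolStride - the stride size between successive pooling squares
          mode - 'max' or 'mean'

       OUTPUTS:
          outputMap - output array of the pooling layer

       Padding mode - 'edge'
    """
    # choose the function that reduces each window to one value
    if mode == 'max':
        reduce_fn = np.max
    elif mode == 'mean':
        # note: windows that reach the padding include the replicated edge values
        reduce_fn = np.mean
    else:
        raise ValueError("mode must be 'max' or 'mean'")

    # inputMap sizes
    in_row, in_col = np.shape(inputMap)

    # outputMap sizes (rounded up when the stride does not divide the input size)
    out_row, out_col = int(np.floor(in_row / poolStride)), int(np.floor(in_col / poolStride))
    row_remainder, col_remainder = np.mod(in_row, poolStride), np.mod(in_col, poolStride)
    if row_remainder != 0:
        out_row += 1
    if col_remainder != 0:
        out_col += 1
    outputMap = np.zeros((out_row, out_col))

    # padding: replicate edge values so that the last window fits inside temp_map
    # (max(0, ...) avoids a negative pad width when poolSize < remainder)
    pad_row = max(0, poolSize - row_remainder)
    pad_col = max(0, poolSize - col_remainder)
    temp_map = np.pad(inputMap, ((0, pad_row), (0, pad_col)), 'edge')

    # pooling: reduce each poolSize x poolSize window to a single value
    for r_idx in range(0, out_row):
        for c_idx in range(0, out_col):
            startX = c_idx * poolStride
            startY = r_idx * poolStride
            poolField = temp_map[startY:startY + poolSize, startX:startX + poolSize]
            outputMap[r_idx, c_idx] = reduce_fn(poolField)

    # return outputMap
    return outputMap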
# Test an instance
test = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])
test_result = pooling(test, 2, 2, 'max')
print(test_result)
Test results:
[[ 6.  8.]
 [10. 12.]]
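Point 1 above says that the m x n x p case can be built on the m x n case. One way this might look is sketched below. Both names are hypothetical: `pool2d` is a compact stand-in for a 2-D max-pooling function with edge padding (included only to keep the snippet self-contained), and `pool3d` applies it to each of the p slices independently.

```python
import numpy as np

def pool2d(x, size=2, stride=2):
    """Hypothetical stand-in for 2-D max pooling with edge padding."""
    rows = -(-x.shape[0] // stride)  # ceiling division
    cols = -(-x.shape[1] // stride)
    # Minimal padding so that the last window in each direction fits.
    pad_r = max(0, (rows - 1) * stride + size - x.shape[0])
    pad_c = max(0, (cols - 1) * stride + size - x.shape[1])
    padded = np.pad(x, ((0, pad_r), (0, pad_c)), mode='edge')
    out = np.zeros((rows, cols))
    for r in range(rows):
        for c in range(cols):
            r0, c0 = r * stride, c * stride
            out[r, c] = padded[r0:r0 + size, c0:c0 + size].max()
    return out

def pool3d(x, size=2, stride=2):
    """Pool each of the p slices of an m x n x p array independently."""
    slices = [pool2d(x[:, :, k], size, stride) for k in range(x.shape[2])]
    return np.stack(slices, axis=2)

volume = np.arange(24).reshape(3, 4, 2)  # m=3, n=4, p=2
result = pool3d(volume)
print(result.shape)     # (2, 2, 2)
print(result[:, :, 0])  # pooled first slice
```

Only the first two dimensions shrink; the number of slices p is unchanged.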
Summary: first understand the input, output, and purpose of a technique, then look for a similar example from everyday life, and finally break the technique down into steps that can be implemented.