Summary of Probability Theory
How the chapters of probability theory relate to one another
First, the development of mathematics made our description of deterministic phenomena quite accurate, but some phenomena remained "unclear"; this unclearness is randomness. To describe this property better, the subject that studies randomness, probability theory, came into being. Early probability theory was very simple to state. For example, the probability of a coin toss or of a lottery rests on two facts: 1. the basic events are equally likely; 2. the basic events that make up the whole are finitely many. Then, as our understanding of random phenomena deepened, we found that the basic events of many experiments cannot be enumerated, and the intuitive picture led to a probability built on geometric properties: geometric probability. In this way it becomes easy to describe the infinitely many, uncountable events that correspond to different regions of a figure, for example the probability of an arrow hitting the center of a target. However, this probability still rests on the premise that the area is uniformly covered, that is, every basic event has the same probability, or regions of equal area carry the same probability. Of course, this uniformity is a hypothesis of ours; when it fails we need something more general, and that need produced the third stage, the prototype of modern probability theory: the axiomatic definition of probability. In measure theory, probability is defined as a real-valued set function on the measurable subsets of a measurable space, and we then study the properties of sets and the operations on them in this space.
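As a minimal numerical sketch of geometric probability (the unit square, the inscribed quarter circle, and the count of 1,000,000 throws are illustrative choices, not anything fixed by the theory): a point thrown uniformly into the square lands inside the quarter circle with probability equal to the ratio of areas, π/4.

```python
import random

# Geometric probability sketch: a point is thrown uniformly at random into
# the unit square. The probability that it lands inside the inscribed
# quarter circle equals the ratio of areas, pi/4.

def estimate_quarter_circle_probability(n_throws: int = 1_000_000) -> float:
    hits = 0
    for _ in range(n_throws):
        x, y = random.random(), random.random()   # uniform point in [0,1)^2
        if x * x + y * y <= 1.0:                  # inside the quarter circle
            hits += 1
    return hits / n_throws

if __name__ == "__main__":
    p = estimate_quarter_circle_probability()
    print(f"estimated probability: {p:.4f} (theoretical pi/4 = 0.7854)")
```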
In order to study probability more conveniently, we define random variables on the probability space and study how probability is distributed over the different values a random variable takes. For a discrete random variable this is captured by its distribution series, the probabilities of its single points; for a continuous random variable the counterparts are the distribution function and the distribution density function. The properties of some specific distributions are then studied in depth, and a number of conclusions are drawn.
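A small sketch using scipy.stats may make the discrete/continuous contrast concrete (the choice of Binomial(4, 0.5) and the standard normal is purely illustrative):

```python
from scipy import stats

# Discrete case: the distribution series (pmf) of a Binomial(4, 0.5) variable.
X = stats.binom(n=4, p=0.5)
for k in range(5):
    print(f"P(X = {k}) = {X.pmf(k):.4f}")    # single-point probabilities

# Continuous case: single points carry zero probability, so the variable is
# described by its density f(x) and distribution function F(x) = P(X <= x).
Y = stats.norm(loc=0, scale=1)
print(f"f(0)    = {Y.pdf(0):.4f}")           # density at 0
print(f"F(1.96) = {Y.cdf(1.96):.4f}")        # P(Y <= 1.96), about 0.975
```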
Then we find that real situations are often not so simple: there are frequently associations between two events. We therefore extend the single variable to higher dimensions and begin to study the joint probability of multidimensional random variables. For continuous variables the joint distribution corresponds to a joint distribution function and a joint distribution density function; for discrete random variables it corresponds to a joint distribution series and two-dimensional single-point probabilities. The properties of these joint distribution functions are studied in depth, and further conclusions are obtained.
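A minimal sketch, assuming a small hypothetical 2×3 joint distribution series, shows how the marginal series fall out of the joint table by summation:

```python
import numpy as np

# A hypothetical 2-D joint distribution series P(X=i, Y=j) as a table.
# Rows index X in {0,1}, columns index Y in {0,1,2}; entries sum to 1.
joint = np.array([[0.10, 0.20, 0.10],
                  [0.25, 0.15, 0.20]])

p_x = joint.sum(axis=1)   # marginal series of X: sum over the values of Y
p_y = joint.sum(axis=0)   # marginal series of Y: sum over the values of X

print("P(X=i):", p_x)     # [0.40 0.60]
print("P(Y=j):", p_y)     # [0.35 0.35 0.30]
# A two-dimensional single-point probability, e.g. P(X=1, Y=2):
print("P(X=1, Y=2) =", joint[1, 2])
```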
At the same time, people noticed that the relationship between some events is not symmetric: the occurrence of one event often leads to the occurrence of another. That is to say, among the associations between events mentioned above there is the special case of causal association, in which the occurrence of the earlier event is a condition for the occurrence of the later one. Such associations deserve to be singled out for study, and this gives rise to the conditional probability we discuss later. For conditional probability we then carry out the same kind of study as in the two settings above.
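As a tiny worked example in the classical, equally likely setting (two fair dice; the events A and B here are illustrative), conditional probability is just P(A | B) = P(A and B) / P(B):

```python
from itertools import product

# Conditional probability by counting outcomes of two fair dice.
outcomes = list(product(range(1, 7), repeat=2))   # 36 equally likely pairs
B = [o for o in outcomes if o[0] + o[1] == 8]     # B: the sum is 8
A_and_B = [o for o in B if o[0] == 3]             # A: the first die shows 3

p_B = len(B) / len(outcomes)
p_A_and_B = len(A_and_B) / len(outcomes)
print("P(A | B) =", p_A_and_B / p_B)              # 1/5 = 0.2
```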
At this point the probabilistic description of events is basically complete, but in practice we find that in many cases we do not know, or can hardly know, the probability of an event or the distribution function it obeys. We also find that solving a practical problem often does not require the full distribution function at all; we only need to know a few of its key features. At this level the so-called numerical characteristics of random variables arise. These characteristics proved very useful, and within them the concept of "moment" was put forward, namely raw moments and central moments. In general we use the first raw moment, the expectation, and the second central moment, the variance; moments beyond the fourth order are rarely used. After defining the concept of moment, we made further study of the moments of random variables obeying the special distributions, obtained some conclusions, and extended them to multidimensional moments and conditional moments. Along the way we defined measures of the linear relationship between random variables, the so-called covariance and correlation coefficient, and found that for random variables obeying a joint normal distribution, being uncorrelated is equivalent to being independent.
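A brief numerical sketch (assuming NumPy and the hypothetical linear model Y = 2X + noise) estimates these numerical characteristics from samples:

```python
import numpy as np

rng = np.random.default_rng(0)

# Two correlated samples (hypothetical data: Y = 2X + standard normal noise).
x = rng.normal(size=10_000)
y = 2.0 * x + rng.normal(size=10_000)

print("E[X]  ~", x.mean())                 # first raw moment (expectation)
print("Var X ~", x.var())                  # second central moment (variance)
print("Cov   ~", np.cov(x, y)[0, 1])       # covariance, about 2
print("Corr  ~", np.corrcoef(x, y)[0, 1])  # correlation coefficient, ~0.89
```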
In fact, however, our study of moments mainly rests on statistics gathered from events in real life, that is, on sampling to approximate these population features. Why a large number of samples approximates such population numerical characteristics so well had yet to be proven scientifically; after all, unlike classical probability, the outcomes are not equally likely. This is where the law of large numbers and the central limit theorem come in. The law of large numbers states the following fact: when a large sample is drawn, the mean of the samples is, with very high probability, close to the expectation of the population. The theorem is powerful because, in suitably general forms, it does not even require the samples to be mutually independent or to be drawn from the same population; under appropriate conditions, as n tends to infinity the sample mean is close to the expectation with high probability. In the special case where the samples are mutually independent and their variances exist and are uniformly bounded, the sequence obeys the law of large numbers (Chebyshev's law of large numbers). If the samples are moreover identically distributed and both the expectation and the variance exist, the sequence obeys the law of large numbers (the law of large numbers for independent, identically distributed sequences). This restriction can be relaxed further: for independent, identically distributed samples the existence of the expectation alone suffices (Khinchin's law of large numbers), which supplies the theoretical basis for estimating an exact value by the arithmetic mean of a large number of measurements. If the samples come from Bernoulli trials, the frequency with which an event occurs approaches its probability (Bernoulli's law of large numbers). Viewed from another angle, this theory leads to the principle of small-probability inference: if the frequency of an event is very small, its probability is also very small, so in a single trial it essentially does not occur.

The central limit theorem addresses a different question: the limiting distribution of the standardized sum of a sequence of mutually independent random variables approaches, under suitable conditions, the normal distribution. When the variables are also identically distributed, this is the central limit theorem for independent, identically distributed sequences (the Lindeberg-Lévy theorem); when the sequence arises from Bernoulli trials, we obtain the de Moivre-Laplace central limit theorem.
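As a minimal simulation sketch (assuming NumPy; the success probability p = 0.3 and the sample sizes are illustrative), the snippet below checks Bernoulli's law of large numbers and the de Moivre-Laplace normal approximation:

```python
import numpy as np

rng = np.random.default_rng(1)
p = 0.3

# Law of large numbers: the mean of n Bernoulli(p) samples approaches p.
for n in (10, 1_000, 100_000):
    sample = rng.random(n) < p
    print(f"n={n:>6}: sample mean = {sample.mean():.4f} (p = {p})")

# Central limit theorem (de Moivre-Laplace flavor): the standardized sum
# (S_n - n p) / sqrt(n p (1 - p)) of Bernoulli trials is close to N(0, 1).
n, trials = 1_000, 50_000
sums = rng.binomial(n, p, size=trials)          # 50,000 realizations of S_n
z = (sums - n * p) / np.sqrt(n * p * (1 - p))
print("standardized sums: mean ~", round(z.mean(), 3),
      " std ~", round(z.std(), 3))              # close to 0 and 1
```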
At this point the content of probability proper is basically complete. In the early 19th century, however, Fourier's work gave rise to the characteristic function: the distribution of a random variable and its characteristic function are related as a Fourier-transform pair, one function in each domain, linked by the transform and its inverse. The discovery of characteristic functions is of special significance for finding the moments of a random variable: it reduces computing a moment to ordinary differentiation. Accordingly, the characteristic functions and conditional characteristic functions of multidimensional random variables have been studied as well. For discrete random variables the corresponding tool is the generating function, whose properties we also study.
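As a small symbolic sketch (assuming SymPy; the normal characteristic function exp(iμt − σ²t²/2) is the standard textbook formula), the snippet below recovers the first two moments by differentiation at t = 0, using E[X^k] = φ^(k)(0) / i^k:

```python
import sympy as sp

t, mu, sigma = sp.symbols('t mu sigma', real=True)

# Characteristic function of N(mu, sigma^2): phi(t) = exp(i mu t - sigma^2 t^2 / 2).
phi = sp.exp(sp.I * mu * t - sigma**2 * t**2 / 2)

# k-th raw moment by pure differentiation: E[X^k] = phi^(k)(0) / i^k.
m1 = sp.simplify(sp.diff(phi, t, 1).subs(t, 0) / sp.I**1)
m2 = sp.simplify(sp.diff(phi, t, 2).subs(t, 0) / sp.I**2)

print("E[X]   =", m1)   # mu
print("E[X^2] =", m2)   # mu**2 + sigma**2, so Var X = sigma**2
```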
Later, people gradually realized that although we can know the characteristics of a random variable, our so-called random variables correspond one-to-one to basic events, and the occurrence of a basic event captures only an instant at a single point in time. What if we want to study the values a random quantity takes over a period of time? This leads to the concept of a random process: a random variable with a time parameter added.
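As a minimal illustration (assuming NumPy; the symmetric ±1 steps and the horizon T = 20 are illustrative choices), the snippet below samples one path of a simple random process, a family of random variables indexed by the time parameter:

```python
import numpy as np

rng = np.random.default_rng(2)

# A symmetric random walk X_t, t = 0..T: at each unit of time the process
# moves up or down by 1 with equal probability.
T = 20
steps = rng.choice([-1, 1], size=T)             # one +-1 step per time unit
path = np.concatenate(([0], steps.cumsum()))    # X_0 = 0, X_t = sum of steps

print("one sample path:", path)
```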