While studying Pattern Recognition and Machine Learning recently, I kept needing probability theory, so I did a quick review of it. The most important part of learning probability theory is not memorizing formulas but understanding the meaning behind them. (This is true of learning anything, but probability can feel more "grounded" than the abstractions of advanced mathematics.)
Probability theory was one of my worst classes in college. In my recent study, after several struggles, I have gradually found the pleasure in this subject; the thin textbook I used as an undergraduate completely hid what makes probability theory interesting.
To learn probability theory, you must first understand what the subject is about. As the name implies, probability theory studies how likely an event is to occur. Here I will not use the strict mathematical language of the textbooks, but try to describe probability theory in a plain, intuitive way.
The whole course revolves around calculating probabilities: the probability distribution of a discrete random variable, the probability distribution of a continuous random variable, conditional probability, the joint distribution of random variables, and the marginal distribution. The discrete and continuous distributions here concern a single random variable, while joint and marginal distributions concern several random variables. Once several variables are involved, independence and conditional distributions come up as well. The course also introduces the distribution of a function of a random variable (involving only one random variable) and the distribution of functions of two random variables, such as their quotient. In short, everything presented is about computing probabilities for random variables.

Here we have to introduce the concept of a probability density function. For a discrete random variable, each possible value has its own probability. In the continuous case, the probability of any single point is 0, and only the probability that the variable falls within an interval is meaningful. So we take the derivative of the distribution function of the continuous random variable, and that derivative is the probability density function.

After this first look at probability distributions, probability theory goes on to discuss some of their important properties, such as expectation and variance, which matter a great deal in practical applications. All of the above assumes that we know the probability distribution of the random variable (in particular, the probability density function of a continuous random variable). In practice, however, we may not know the specific distribution.
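To make the relationship between the distribution function and the density concrete, here is a minimal sketch. It assumes a standard normal distribution purely for illustration, and the helper names `numerical_density` and `interval_probability` are my own, not from any textbook.

```python
from statistics import NormalDist

# Standard normal distribution, chosen only as an illustration.
dist = NormalDist(mu=0.0, sigma=1.0)

def numerical_density(x, h=1e-5):
    """Approximate the density as the central-difference derivative of the CDF."""
    return (dist.cdf(x + h) - dist.cdf(x - h)) / (2 * h)

def interval_probability(a, b):
    """P(a <= X <= b), computed from the distribution function (CDF)."""
    return dist.cdf(b) - dist.cdf(a)

# The numerical derivative of the CDF should match the built-in density.
for x in (-1.0, 0.0, 1.5):
    print(x, numerical_density(x), dist.pdf(x))

# An interval has positive probability; a single point has probability 0.
print(interval_probability(-1.0, 1.0))
print(interval_probability(0.3, 0.3))

# Expectation and variance of this distribution.
print(dist.mean, dist.variance)
```

The two columns printed in the loop should agree to several decimal places, which is just the statement that the density is the derivative of the distribution function.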
Sometimes we do know the form of the distribution (multinomial, Gaussian, and so on). A distribution of known form is described by its probability density function, which is usually fixed by only a few parameters (such as the mean and variance of a Gaussian). This is why parameter estimation is introduced: given the form of the distribution, it recovers the specific density function from data. So how do we know the form of a random variable's distribution in the first place? Sometimes it is determined by the model itself (for example, the number of heads in n coin tosses follows a binomial distribution, with each toss being a Bernoulli trial), and sometimes it comes from theoretical results (for example, the central limit theorem tells us that the sum of many independent random quantities approximately follows a Gaussian distribution).
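As a rough sketch of parameter estimation, the snippet below assumes the data really are Gaussian and uses the maximum-likelihood estimates of the mean and variance. The function name `fit_gaussian` and the "true" parameter values are hypothetical, chosen only so there is something to recover.

```python
import random

def fit_gaussian(samples):
    """Maximum-likelihood estimates of the mean and variance of a Gaussian."""
    n = len(samples)
    mean = sum(samples) / n
    # The ML variance divides by n (not n - 1), so it is slightly biased.
    variance = sum((x - mean) ** 2 for x in samples) / n
    return mean, variance

rng = random.Random(0)            # fixed seed so the run is repeatable
true_mu, true_sigma = 2.0, 3.0    # hypothetical "unknown" parameters
samples = [rng.gauss(true_mu, true_sigma) for _ in range(10_000)]

mu_hat, var_hat = fit_gaussian(samples)
print(mu_hat, var_hat)  # should land near 2.0 and 9.0
```

Once the two parameters are estimated, the density function of the Gaussian is fully specified, which is the point made above.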
There is still no single agreed definition of probability, but the common interpretations are empirical probability (e.g. an 80% chance of rain tomorrow) and frequency probability (e.g. a tossed coin is equally likely to land heads or tails). There is also the axiomatic approach built on three axioms, which is the basis of probability theory as we use it today. The three axioms are as follows:
- 0 ≤ P(A) ≤ 1 for any event A
- P(Ω) = 1
- Axiom of addition: if events A and B are mutually exclusive, then P(A ∪ B) = P(A) + P(B)
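The three axioms can be checked on a toy example. This sketch assumes a fair six-sided die, so every outcome is equally likely; the `prob` helper is a hypothetical name for illustration.

```python
from fractions import Fraction

# Sample space of a fair six-sided die (an illustrative assumption).
omega = {1, 2, 3, 4, 5, 6}

def prob(event):
    """Classical probability: favourable outcomes divided by all outcomes."""
    return Fraction(len(event & omega), len(omega))

even = {2, 4, 6}
one = {1}

print(0 <= prob(even) <= 1)                         # first axiom
print(prob(omega) == 1)                             # second axiom
print(prob(even | one) == prob(even) + prob(one))   # addition, disjoint events
```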
In probability theory, one of the parts I hated most was the classical model, where permutations and combinations are used to compute the probability of an event. When I first studied probability, I was beaten right at the start by the tedium of permutations and combinations. The many variations of these problems do follow rules: you can sort them into a few types, and once you understand each type you can solve them. The real issue, though, is what permutations and combinations are for: in probability theory they are simply the classical model's way of computing how likely an event is. If they keep you stuck outside the gate of probability theory, that would be a real pity. This was also what made me dislike probability so deeply. This time, having understood the role permutations and combinations play and the basic theorems, I decided to set them aside for now and move on to the more interesting material that follows. (They are not much use to me, so I do not want to spend too much time on them; time should be spent where it counts, hehe.)
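For what it is worth, counting problems of this kind reduce to a ratio of two counts. The sketch below shows two standard textbook examples (not taken from the original course): a 5-card hand with exactly two aces, and the chance that at least two of n people share a birthday, assuming 365 equally likely days.

```python
from fractions import Fraction
from math import comb, perm

def exactly_two_aces():
    """P(a 5-card hand from a 52-card deck holds exactly two aces)."""
    favourable = comb(4, 2) * comb(48, 3)  # choose 2 aces, then 3 non-aces
    total = comb(52, 5)                    # all possible 5-card hands
    return Fraction(favourable, total)

def shared_birthday(n, days=365):
    """P(at least two of n people share a birthday), equally likely days."""
    all_distinct = Fraction(perm(days, n), days ** n)
    return 1 - all_distinct

print(float(exactly_two_aces()))
print(float(shared_birthday(23)))
```

Combinations are used when order does not matter (the card hand), and permutations when it does (assigning distinct birthdays to people in order).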
No study of probability theory can skip the great Bayes' formula. After reading "Pattern Recognition and Machine Learning", I came to feel how important this formula really is, whereas when I first studied probability theory it was only mentioned in passing. Probability textbooks usually illustrate Bayes' formula with a classic example: a person tests positive for a certain disease, so what is the probability that this person actually has it? Almost all of us did this exercise when studying probability: match each condition given in the problem to a term in Bayes' formula, and the problem is "successfully" solved. Note also that the denominator of Bayes' formula has to be computed with the law of total probability. Applying the formula mechanically like this solves the exercise, but it neglects what the formula really means. Bayes' formula gives us a way to update the probability of an event as new knowledge arrives. I plan to write a separate post on Bayes' formula and its applications.
In short, probability theory is a very interesting subject. This is only an overview of its content; for more depth, I hope to have the chance to write up the topics that interest me one by one.
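The disease-test example can be sketched in a few lines. All numbers here are hypothetical (1% prevalence, 99% sensitivity, 5% false-positive rate), and the function name `posterior_sick` is mine; the denominator is the law of total probability.

```python
from fractions import Fraction

def posterior_sick(prior, sensitivity, false_positive_rate):
    """P(sick | positive test) via Bayes' formula."""
    # Law of total probability: P(positive) over both the sick and healthy cases.
    p_positive = sensitivity * prior + false_positive_rate * (1 - prior)
    return sensitivity * prior / p_positive

# Hypothetical numbers, chosen only for illustration.
prior = Fraction(1, 100)                # P(sick) before the test
sensitivity = Fraction(99, 100)         # P(positive | sick)
false_positive_rate = Fraction(5, 100)  # P(positive | healthy)

print(posterior_sick(prior, sensitivity, false_positive_rate))  # 1/6
```

Under these assumed numbers, a positive result raises the probability of illness from 1% to about 17%: the new evidence updates the prior, but the disease is still unlikely because it is rare to begin with.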
Summary of probability theory learning (road map)