These notes summarize some points about Markov Chain Monte Carlo (MCMC).
This series of notes is mainly translated from the material linked under the article A Gentle Introduction to Markov Chain Monte Carlo (MCMC).
Monte Carlo Approximations
Monte Carlo approximation for integration
Theory
The Monte Carlo method is used to approximate integrals. An integral can be computed by numerical methods; the simplest is to sum the areas of many small trapezoids under the curve. However, when the dimension of the function is too high and there are too many variables, such numerical methods become impractical, while the Monte Carlo method approximates the integral from a probabilistic point of view.
We often need to calculate integrals of the following form:
\(I = \int_a^b f(x)\,g(x)\,dx\)
For a function \(f(x)\) of a random variable \(x\), the expected value can be calculated by the following formula:
\(E[f(x)] = \int f(x)\,p(x)\,dx\)
where \(p(x)\) is the density function of the variable \(x\); it must integrate to 1 over its domain.
In fact, if \(g(x)\) satisfies the following conditions:
1. Non-negative on the interval \((a, b)\):
\(g(x) \geqslant 0,\ x \in (a, b)\)
2. Finite integral on the interval:
\(\int_a^b g(x)\,dx = c < \infty\)
then \(g(x)\) can be converted into a density function:
\(p(x) = \frac{g(x)}{c}\)
The original integral is kept unchanged by multiplying by the corresponding constant \(c\) in front of the integral sign: \(I = \int_a^b f(x)\,g(x)\,dx = c\int_a^b f(x)\,p(x)\,dx = c\,E[f(x)]\). So what is the benefit of converting to a density function? The benefit is that an approximate value of the original integral can be computed by randomly simulating a variable \(x\) that follows the density \(p(x)\):
According to the law of large numbers:
\(E[f(x)] = \int f(x)\,p(x)\,dx \approx \frac{1}{S}\sum_{s=1}^{S} f(x_s)\)
where \(x_s \sim p(x)\). This is critical: the generated random numbers must follow the distribution with density \(p(x)\); only then does their sample average approximate the integral.
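As a minimal sketch of this identity (an illustrative setup, assuming \(f(x) = x^2\) and \(p(x) = \mathcal{N}(0, 1)\), for which the true expectation \(E[x^2] = 1\)):

```matlab
% Minimal sketch: estimate E[f(x)] with f(x) = x^2 and x ~ N(0,1);
% the true value is 1 (the variance of a standard normal).
S = 100000;             % number of samples (arbitrary choice)
xs = randn(1, S);       % draw x_s from p(x) = N(0,1)
estimate = mean(xs.^2); % (1/S) * sum of f(x_s), should be close to 1
```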
To make a theoretical analysis of this method, two theorems are given below:
Law of large numbers: for a random sequence drawn from a known distribution, as the number of samples tends to infinity, the sample mean converges to the expectation.
In fact, we often do this in daily life. For example, to estimate the expected exam score of an entire grade, we can randomly sample some students for a test and use their average score to approximate the grade's expected score; the more students we sample, the closer the average gets to the true expectation.
In statistical terms, \(A(n)\) is a consistent estimator of \(b\) if \(A(n)\) converges to \(b\) as \(n\) tends to infinity; that is, for any probability \(p\ (0 < p < 1)\) and any tolerance \(\delta > 0\), we can find \(k\) such that for any \(n > k\), the probability that \(A(n)\) lies within \(\delta\) of \(b\) is greater than \(p\).
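A minimal sketch of this convergence (assuming uniform(0, 1) samples, whose expectation is 0.5):

```matlab
% Sketch: the sample mean of uniform(0,1) draws settles toward 0.5 as n grows
for n = [10 100 1000 10000 100000]
    fprintf('n = %6d, sample mean = %.4f\n', n, mean(rand(1, n)));
end
```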
Central limit theorem: the sum of a large number of independent random variables approximately follows a normal distribution. For independent and identically distributed (i.i.d.) random variables \(z_i\) with mean \(\mu\) and variance \(\sigma^2\):
\(\frac{\sum\limits_{i=1}^n z_i - n\mu}{\sigma\sqrt{n}} \sim \mathcal{N}(0, 1)\ \text{as}\ n \rightarrow \infty\)
Therefore, applying the central limit theorem to the sample average, the error of the Monte Carlo estimate satisfies:
\(\frac{1}{n}\sum\limits_{i=1}^n f(x_i) - E[f(x)] \sim \mathcal{N}\left(0, \frac{\mathrm{var}[f(x)]}{n}\right)\ \text{as}\ n \rightarrow \infty\)
It can be seen that:
1. If the variance of \(f(x)\) is finite, then the MC estimate is consistent
2. The MC estimate is asymptotically unbiased
3. The error of the MC estimate is approximately normally distributed
4. The standard deviation of the MC estimation error is \(\sigma = \frac{\sqrt{\mathrm{var}[f(x)]}}{\sqrt{n}}\)
At the same time, note that MC estimation is naturally immune to the dimensionality of the data: no matter how high the dimension, we only need to generate random variables according to the distribution, evaluate the function at these random points, and take the mean of the function values to obtain an estimate of the integral. Since a smaller standard deviation of the error means a more accurate estimate, we can increase the accuracy of MC estimation by increasing the number of samples \(n\). Another way to increase accuracy is to reduce the variance of \(f(x)\), which is known as variance reduction (not covered here).
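As a sketch of this \(\frac{1}{\sqrt{n}}\) error scaling (an illustrative setup, estimating \(E[x] = 0.5\) for \(x \sim \mathrm{uniform}(0, 1)\), whose standard deviation is \(\sqrt{1/12}\)):

```matlab
% Sketch: empirical std of the MC error vs. the sqrt(var/n) prediction
trials = 2000;                        % independent MC estimates per sample size
for n = [100 400 1600 6400]
    est = mean(rand(n, trials));      % one estimate of E[x] per column
    fprintf('n = %4d: std of error = %.4f (theory %.4f)\n', ...
        n, std(est - 0.5), sqrt(1/12)/sqrt(n));
end
```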
Example 1. Calculating \(\pi\) by the Monte Carlo method
In typical tutorials, this example is presented as: random points generated uniformly in a square fall inside the inscribed circle with probability equal to the ratio of the circle's area to the square's area, from which \(\pi\) follows. This is easy to understand, but how can the process be expressed rigorously with the formulas above?
The area of the circle can be obtained from the following integral:
\(A = \int_{-r}^{r}\int_{-r}^{r} 1(x^2 + y^2 \leq r^2)\,dx\,dy\)
where the indicator \(1(x^2 + y^2 \leq r^2)\) is 1 when the condition is met and 0 otherwise; that is, the function value is 1 when the point is inside the circle and 0 when it is outside. \(p(x)\) and \(p(y)\) are uniform distributions on \([-r, r]\), so \(p(x) = p(y) = \frac{1}{2r}\).
So, in the Monte Carlo integral \(I = \int_a^b f(x)\,g(x)\,dx\), \(f\) becomes the two-dimensional indicator \(1(x^2 + y^2 \leq r^2)\) and \(g\) is identically 1; converting \(g\) into a two-dimensional density gives \(p(x, y) = p(x)\cdot p(y) = \frac{1}{4r^2} = \frac{g(x, y)}{4r^2}\).
So:
\(A = 4r^2 \int_{-r}^{r}\int_{-r}^{r} 1(x^2 + y^2 \leq r^2)\,p(x, y)\,dx\,dy \approx \frac{4r^2}{S}\sum_{s=1}^{S} 1(x_s^2 + y_s^2 \leq r^2)\)
That is, we only need to generate uniformly distributed \(x\) and \(y\) coordinates between \(-r\) and \(r\), plug each point into \(1(x^2 + y^2 \leq r^2)\) to determine whether it falls inside the circle (recording 1 if it does and 0 otherwise), sum these values and divide by the total number of points to estimate the probability of landing inside the circle, and finally multiply by \(4r^2\) to obtain the circle's area. Knowing the area and the radius, we can then recover an approximate value of \(\pi\).
```matlab
% radius of circle
r = 1;
s = 10000;
% generate s uniformly distributed points
x = unifrnd(-r, r, s, 1);
y = unifrnd(-r, r, s, 1);
% the points that fell into the circle
fxy = (x.^2 + y.^2 - r^2 <= 0);
area = 4*r^2/s * sum(fxy);
pi_estimate = area / r^2;
% visualization
figure;
inside = fxy;
outside = ~fxy;
scatter(x(inside), y(inside), [], 'r', 'filled'); hold on;
scatter(x(outside), y(outside), [], 'b', 'filled');
axis equal; xlim([-1, 1]); ylim([-1, 1]);
```
Example 2. Approximating the integral of \(xe^x\) by the Monte Carlo method
We want to calculate \(I = \int_0^1 xe^x\,dx\). By integration by parts, it is not difficult to show that its value is exactly 1. We now compute an approximation by the Monte Carlo method and see how close it comes to 1.
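For reference, the integration-by-parts computation:
\(\int_0^1 xe^x\,dx = \left[(x-1)e^x\right]_0^1 = (1-1)e - (0-1)\cdot 1 = 1\)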
We can take \(f(x) = xe^x\) and \(g(x) = 1\). Since the length of the integration interval is also 1, the density function is \(p(x) = \frac{g(x)}{\int_0^1 g(x)\,dx} = 1\).
Therefore, following the discussion above, we only need to generate uniform random numbers on \([0, 1]\), plug them into \(f(x) = xe^x\) to compute the function values, and take their mean to approximate the integral.
The code is as follows:
```matlab
rng('default');
% 100 samples
s1 = 100;
x1 = unifrnd(0, 1, s1, 1);
y1 = x1 .* exp(x1);
I1 = sum(y1) / s1;
% 10000 samples
s2 = 10000;
x2 = unifrnd(0, 1, s2, 1);
y2 = x2 .* exp(x2);
I2 = sum(y2) / s2;
```
Example 3. Calculating the expectation of the Beta distribution
For details on the Beta distribution, see a reference such as Baidu Library; here we need only one fact: the Beta distribution has two parameters \(a\) and \(b\), and for a random variable \(X \sim \mathrm{Beta}(a, b)\), \(E(X) = \frac{a}{a+b}\).
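For completeness, this fact follows from the definition of the Beta density and the Beta function identity \(B(a+1, b) = \frac{a}{a+b}B(a, b)\):
\(E(X) = \int_0^1 x\,\frac{x^{a-1}(1-x)^{b-1}}{B(a, b)}\,dx = \frac{B(a+1, b)}{B(a, b)} = \frac{a}{a+b}\)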
To calculate the expectation of the Beta distribution, we use the formula:
\(E(x) = \int x\,p(x)\,dx\), where \(x \sim \mathrm{Beta}(\alpha_1, \alpha_2)\)
1. First, note that \(f(x) = x\), and \(p(x)\) is itself a probability density function, so no conversion is necessary.
2. We generate a large number of random points according to the distribution \(p(x) = \mathrm{Beta}(\alpha_1, \alpha_2)\), plug them into \(f(x) = x\), and take their mean, which approximates the expectation of the Beta distribution.
```matlab
rand('seed', 12345);
alpha1 = 2; alpha2 = 10; N = 10000;
% generate 10,000 points from the Beta distribution
x = betarnd(alpha1, alpha2, 1, N);
% Monte Carlo expectation
expectMC = mean(x);                          % approximate expectation by MC
% analytic expression for the Beta mean
expectAnalytic = alpha1 / (alpha1 + alpha2); % theoretical expectation
% display
figure;
bins = linspace(0, 1, 50);                   % divide [0,1] into 50 bins
counts = histc(x, bins);                     % count the random points falling in each bin
probSampled = counts / sum(counts);          % empirical probability of each bin
probTheory = betapdf(bins, alpha1, alpha2);  % theoretical density at the bin edges
% histogram of the sampled distribution; each bar is the probability of
% its bin, not the density (dividing by the bin width gives the density)
b = bar(bins, probSampled); colormap hot; hold on;
% probTheory/sum(probTheory) is equivalent to probTheory*dx,
% since sum(probTheory*dx) = 1 implies dx = 1/sum(probTheory)
t = plot(bins, probTheory/sum(probTheory), 'r', 'LineWidth', 2);
m = plot([expectMC, expectMC], [0, .1], 'g');             % MC estimate of the mean
e = plot([expectAnalytic, expectAnalytic], [0, .1], 'b'); % theoretical mean
xlim([expectAnalytic - .05, expectAnalytic + .05]);
legend([b, t, m, e], {'samples', 'theory', '$\hat{E}$', '$E_{theory}$'}, 'interpreter', 'latex');
title(['E_{MC} = ', num2str(expectMC), '; E_{theory} = ', num2str(expectAnalytic)]);
hold off;
```
We can see that the two vertical lines are very close, which means the mean estimated from the samples is very close to the true value.
Monte Carlo approximation for optimization
The Monte Carlo method can also be used to solve optimization problems:
\(x_{opt} = \arg\max\limits_x g(x)\)
If \(g(x)\) satisfies the two properties mentioned earlier: 1. non-negative, 2. finite integral,
then it can be normalized into a density function:
\(p(x) = \frac{g(x)}{c}\)
where \(c = \int_x g(x)\,dx\)
The original optimization problem then becomes:
\(x_{opt} = \arg\max\limits_x c\,p(x)\)
Generate random numbers with \(p(x)\) as the probability density: the density function is necessarily largest where the generated random numbers are densest, and since the density function differs from the objective function only by a constant factor, that position is also where the objective function is largest, i.e., an approximate solution of the optimization problem.
Example
Solve the optimization problem \(x_{opt} = \arg\max\limits_x e^{-\frac{(x-4)^2}{2}}\)
By taking the derivative and setting it to zero, the extremum is found at \(x = 4\).
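Explicitly, setting the derivative to zero:
\(\frac{d}{dx}e^{-\frac{(x-4)^2}{2}} = -(x-4)\,e^{-\frac{(x-4)^2}{2}} = 0 \Rightarrow x = 4\)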
It can also be observed that the function to optimize, \(g(x)\), differs from the density of the normal distribution \(\mathcal{N}(4, 1)\) by only one constant factor: \(\sqrt{2\pi}\).
The original problem can therefore be rewritten as:
\(x_{opt} = \arg\max\limits_x c\,p(x)\), where \(p(x) = \frac{1}{\sqrt{2\pi}}e^{-\frac{(x-4)^2}{2}}\) and \(c = \sqrt{2\pi}\).
Therefore, we can use MATLAB to generate normally distributed random variables with mean 4 and variance 1, and then find the horizontal coordinate of the point where the density is largest; that coordinate is the solution of the optimization problem.
The code is as follows:
```matlab
% Monte Carlo optimization of exp(-(x-4)^2/2)
randn('seed', 12345);
% initialize
n = 100000;
x = 0:.1:6;
c = sqrt(2*pi);
g = inline('exp(-.5*(x-4).^2)', 'x');
ph = plot(x, g(x)/c, 'r', 'LineWidth', 3); hold on;  % plot the density function p(x) = g(x)/c
gh = plot(x, g(x), 'b', 'LineWidth', 2); hold on;    % plot the function to optimize
y = normpdf(x, 4, 1);
% calculate the Monte Carlo approximation
x = normrnd(4, 1, 1, n);                 % draw n samples from N(4,1)
bins = linspace(min(x), max(x), 100);    % 100 bins over the sampled range
counts = histc(x, bins);                 % count the samples falling in each bin
[~, optIdx] = max(counts);               % find the bin with the most points
xhat = bins(optIdx);                     % approximate solution of the optimization problem
% optimum and estimated optimum
oh = plot([4 4], [0, 1], 'k');
hh = plot([xhat, xhat], [0, 1], 'g');
set(gca, 'fontsize', 14);
legend([gh, ph, oh, hh], {'g(x)', '$p(x)=\frac{g(x)}{c}$', '$x_{opt}$', '$\hat{x}$'}, ...
    'interpreter', 'latex', 'location', 'northwest');
```
It can be seen that the solution of the optimization problem obtained by the Monte Carlo method is essentially consistent with the true solution.