To understand the motivation for introducing the concept of "mixed strategy" into game theory, consider the result of solving the fairly simple "guessing game" with the underline method, shown in Figure 8.3.1.
The answer is that the "guessing game" has no solution in pure strategies: it has no pure strategy Nash equilibrium, so no pure-strategy outcome of the game is balanced and stable. But experience tells us that when two children play this kind of guessing game, a single round settles little. They play again and again, each randomly showing one finger or two, and after many rounds each has won roughly half the time, which is a balanced result. The implication is that if a game has no balanced outcome in pure strategies, each of the two players may randomize over all of their own strategies and thereby reach a balanced outcome. In other words, a Nash equilibrium may exist in the sense of probabilistic strategies. This line of thought leads to the concept of the "mixed strategy".
Mixed strategy
1. Definition of a mixed strategy
Let Ⅰ and Ⅱ be the two players of a game. Their pure strategy sets (see 8.2) are written, respectively, as:
$S = \{s_1, s_2, \ldots, s_n\}$ and $T = \{t_1, t_2, \ldots, t_m\}$ (8.3.1)
Let $x$ and $y$ be two probability vectors, namely:
$x = (x_1, x_2, \ldots, x_n)^T$; $x_i \ge 0$ $(i = 1, 2, \ldots, n)$; $\sum_{i=1}^{n} x_i = 1$
$y = (y_1, y_2, \ldots, y_m)^T$; $y_j \ge 0$ $(j = 1, 2, \ldots, m)$; $\sum_{j=1}^{m} y_j = 1$
Suppose $x$ represents a probabilistic choice over all the strategies in player Ⅰ's pure strategy set $S$, and $y$ represents a probabilistic choice over all the strategies in player Ⅱ's pure strategy set $T$, namely:
Ⅰ chooses strategy $s_1$ with probability $x_1$, strategy $s_2$ with probability $x_2$, ..., and strategy $s_n$ with probability $x_n$.
Ⅱ chooses strategy $t_1$ with probability $y_1$, strategy $t_2$ with probability $y_2$, ..., and strategy $t_m$ with probability $y_m$.
Then $x$ is called a mixed strategy of player Ⅰ, and $y$ a mixed strategy of player Ⅱ.
In practical terms, a mixed strategy expresses a player's degree of preference for each pure strategy, or an estimate of the probability with which each pure strategy is chosen when repeated play reaches an equilibrium outcome. It therefore carries the meaning of a subjective probability.
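As a minimal sketch of how a player could act on a mixed strategy, the snippet below draws pure strategies at random according to a probability vector. The function name `sample_pure_strategy`, the strategy labels and the fixed seed are illustrative assumptions, not part of the text.

```python
import random

def sample_pure_strategy(strategies, probabilities, rng):
    """Draw one pure strategy; strategies[i] is chosen with probability probabilities[i]."""
    if len(strategies) != len(probabilities):
        raise ValueError("one probability per pure strategy is required")
    return rng.choices(strategies, weights=probabilities, k=1)[0]

# Hypothetical example: player I mixes over s1 and s2 with x = (1/2, 1/2).
rng = random.Random(0)  # fixed seed only so that the run is repeatable
draws = [sample_pure_strategy(["s1", "s2"], [0.5, 0.5], rng) for _ in range(10_000)]
share_s1 = draws.count("s1") / len(draws)
print(f"share of s1 over {len(draws)} rounds: {share_s1:.3f}")
```

Over many rounds the observed share of each pure strategy should settle near its probability, which is the sense in which the vector describes long-run behaviour.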
2. Mixed strategy set
From the definition of a mixed strategy, it is easy to see that a pure strategy is a special mixed strategy. For example, player Ⅰ's pure strategy $s_i \in S$ is the special mixed strategy $x'$ whose components are:
$x'_i = 1$, $x'_j = 0$ $(j \ne i)$
That is, player Ⅰ chooses strategy $s_1$ with probability 0 (assuming $i \ne 1$), ..., chooses strategy $s_i$ with probability 1, ..., and chooses strategy $s_n$ with probability 0 (assuming $i \ne n$). With this in mind, in what follows we write:
$X = \{ x \in \mathbb{R}^n \mid x = (x_1, x_2, \ldots, x_n)^T;\ x_i \ge 0\ (i = 1, 2, \ldots, n);\ \sum_{i=1}^{n} x_i = 1 \}$; (8.3.2)
$Y = \{ y \in \mathbb{R}^m \mid y = (y_1, y_2, \ldots, y_m)^T;\ y_j \ge 0\ (j = 1, 2, \ldots, m);\ \sum_{j=1}^{m} y_j = 1 \}$. (8.3.3)
We call $X$ the strategy set, or mixed strategy set, of player Ⅰ, and $Y$ the strategy set, or mixed strategy set, of player Ⅱ. A pair $(x, y) \in X \times Y$ is called a mixed-strategy outcome of the game.
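The membership conditions in (8.3.2) and (8.3.3), and the way a pure strategy sits inside the mixed strategy set, can be sketched in a few lines. The helper names `is_mixed_strategy` and `pure_as_mixed` are hypothetical, and exact rational arithmetic is used only so that the "sums to 1" test is not disturbed by floating-point rounding.

```python
from fractions import Fraction

def is_mixed_strategy(x):
    """True if x is a probability vector: all components >= 0 and summing to 1."""
    return all(xi >= 0 for xi in x) and sum(x) == 1

def pure_as_mixed(i, n):
    """Mixed strategy that plays the i-th of n pure strategies (0-based) with certainty."""
    return [Fraction(1) if k == i else Fraction(0) for k in range(n)]

# Hypothetical mixed strategy over n = 3 pure strategies.
x = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)]
print(is_mixed_strategy(x))                     # x satisfies the conditions of (8.3.2)
print(pure_as_mixed(1, 3))                      # second pure strategy as a vector
print(is_mixed_strategy(pure_as_mixed(1, 3)))   # a pure strategy is also a mixed one
```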
Note that the pure strategy set $S$ is a finite set, and the convex set it generates, namely a simplex (see Chapter II), can be expressed as:
$\mathrm{conv}(S) = \{ \sum_{i=1}^{n} x_i s_i \mid x_i \ge 0\ (i = 1, 2, \ldots, n);\ \sum_{i=1}^{n} x_i = 1 \}$
It can be seen that the mixed strategy set $X$ is in one-to-one correspondence with (mathematically, isomorphic to) the convex set (simplex) generated by the pure strategy set $S$. The mixed strategy set $X$ can therefore be regarded as the convex set (simplex) spanned by the pure strategy set $S$, with $S$ as the set of extreme points of $X$. In the same way, the mixed strategy set $Y$ can be regarded as the convex set (simplex) spanned by the pure strategy set $T$, with $T$ as the set of extreme points of $Y$. With this understanding the concept of a mixed strategy is easy to grasp: each mixed strategy $x$ represents a strategy produced by a convex combination of all the pure strategies $s_i \in S$.
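3. Payoff function of a mixed-strategy outcome
The pure strategy sets $S$ and $T$ of players Ⅰ and Ⅱ, and their mixed strategy sets $X$ and $Y$, are defined by (8.3.1), (8.3.2) and (8.3.3) respectively. The payoff matrix model of the game is the bimatrix whose $(i, j)$ entry is the payoff pair $(a_{ij}, b_{ij})$ at the pure outcome $(s_i, t_j)$:
$\big( (a_{ij}, b_{ij}) \big)_{n \times m}$
The payoff matrix of player Ⅰ is defined as:
$A = (a_{ij})_{n \times m}$
The payoff matrix of player Ⅱ is defined as:
$B = (b_{ij})_{n \times m}$
The payoff functions of mixed-strategy outcomes are defined as follows:
(1) For any $s_i \in S$ and $y \in Y$, the payoff function of the outcome $(s_i, y)$ is defined as:
$U_1(s_i, y) = \sum_{j=1}^{m} a_{ij} y_j$ (8.3.4)
(2) For any $t_j \in T$ and $x \in X$, the payoff function of the outcome $(x, t_j)$ is defined as:
$U_2(x, t_j) = \sum_{i=1}^{n} b_{ij} x_i$ (8.3.5)
(3) For any $x \in X$ and $y \in Y$, the payoff functions of the outcome $(x, y)$ are defined as:
$U_1(x, y) = \sum_{i=1}^{n} \sum_{j=1}^{m} a_{ij} x_i y_j = x^T A y$ (8.3.6)
$U_2(x, y) = \sum_{i=1}^{n} \sum_{j=1}^{m} b_{ij} x_i y_j = x^T B y$ (8.3.7)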
Comparing the definition of $U_1(x, y)$ in (8.3.6) with that of $U_1(s_i, y)$ in (8.3.4), and the definition of $U_2(x, y)$ in (8.3.7) with that of $U_2(x, t_j)$ in (8.3.5), it is easy to see that $U_1(x, y)$ and $U_2(x, y)$ have the following equivalent expressions:
$U_1(x, y) = \sum_{i=1}^{n} x_i U_1(s_i, y)$ (8.3.9)
$U_2(x, y) = \sum_{j=1}^{m} y_j U_2(x, t_j)$ (8.3.10)
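The payoff functions and their equivalent expressions can be checked numerically. The sketch below assumes that `A[i][j]` and `B[i][j]` hold the payoffs of players Ⅰ and Ⅱ at the pure outcome $(s_i, t_j)$. The function names and the 2x3 payoff numbers are invented purely for illustration.

```python
from fractions import Fraction as F

def u1_pure(A, i, y):
    """U1(s_i, y): expected payoff of player I's i-th pure strategy against y."""
    return sum(A[i][j] * y[j] for j in range(len(y)))

def u2_pure(B, x, j):
    """U2(x, t_j): expected payoff of player II's j-th pure strategy against x."""
    return sum(B[i][j] * x[i] for i in range(len(x)))

def u1(A, x, y):
    """U1(x, y) as the double sum over all pure outcomes."""
    return sum(A[i][j] * x[i] * y[j] for i in range(len(x)) for j in range(len(y)))

def u2(B, x, y):
    """U2(x, y) as the double sum over all pure outcomes."""
    return sum(B[i][j] * x[i] * y[j] for i in range(len(x)) for j in range(len(y)))

# Hypothetical 2x3 game (payoffs made up for this example).
A = [[3, 0, 1], [1, 2, 0]]
B = [[1, 2, 0], [0, 1, 3]]
x = [F(1, 4), F(3, 4)]
y = [F(1, 2), F(1, 3), F(1, 6)]

lhs1, rhs1 = u1(A, x, y), sum(x[i] * u1_pure(A, i, y) for i in range(len(x)))
lhs2, rhs2 = u2(B, x, y), sum(y[j] * u2_pure(B, x, j) for j in range(len(y)))
print(lhs1 == rhs1, lhs2 == rhs2)  # both weighted-average forms agree with the double sums
```

Writing the expected payoff as a weighted average of pure-strategy payoffs is what later makes it enough to compare a mixed outcome against pure deviations only.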
Nash equilibrium of mixed strategy
(i) The concept of mixed strategy Nash equilibrium
1. Definition of mixed strategy Nash equilibrium
The pure strategy sets $S$ and $T$ of players Ⅰ and Ⅱ, and their mixed strategy sets $X$ and $Y$, are defined by (8.3.1), (8.3.2) and (8.3.3) respectively.
If a mixed-strategy outcome $(x, y) \in X \times Y$ satisfies the following conditions:
(1) $U_1(x, y) \ge U_1(s_i, y)$ for all $s_i \in S$ (8.3.11)
(2) $U_2(x, y) \ge U_2(x, t_j)$ for all $t_j \in T$ (8.3.12)
then the mixed-strategy outcome $(x, y)$ is called a Nash equilibrium.
2. Meaning of mixed strategy Nash equilibrium
The mixed strategy set $X$ is regarded as the convex set (simplex) whose extreme points are the pure strategies in $S$, and by (8.3.9) the payoff of any mixed strategy is a convex combination of the payoffs of the pure strategies. From this property of a linear function on a convex set, it can be proved that if (8.3.11) holds, then the following also holds:
$U_1(x, y) \ge U_1(\tilde{x}, y)$ for all $\tilde{x} \in X$ (8.3.13)
Similarly, if (8.3.12) holds, then the following also holds:
$U_2(x, y) \ge U_2(x, \tilde{y})$ for all $\tilde{y} \in Y$ (8.3.14)
Formulas (8.3.13) and (8.3.14) say that $x$ is player Ⅰ's optimal (most profitable) strategy given that player Ⅱ chooses $y$, and that $y$ is player Ⅱ's optimal (most profitable) strategy given that player Ⅰ chooses $x$.
Since players Ⅰ and Ⅱ both act "rationally" in the game, play between the two sides reaches a balanced state at the outcome $(x, y)$.
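The step from (8.3.11) to (8.3.13) can be written out as a short chain. This is a sketch under the assumption that (8.3.9) expresses $U_1$ as a probability-weighted average of the pure-strategy payoffs. The symbol $\tilde{x}$ is used here only to denote an arbitrary mixed strategy of player Ⅰ, while $(x, y)$ is the outcome satisfying (8.3.11).

```latex
U_1(\tilde{x}, y)
  = \sum_{i=1}^{n} \tilde{x}_i \, U_1(s_i, y)
  \le \sum_{i=1}^{n} \tilde{x}_i \, U_1(x, y)
  = U_1(x, y), \qquad \forall\, \tilde{x} \in X .
```

The inequality applies (8.3.11) term by term, using $\tilde{x}_i \ge 0$, and the last equality uses $\sum_{i=1}^{n} \tilde{x}_i = 1$. The argument for player Ⅱ, from (8.3.12) to (8.3.14), is symmetric.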
[Example 8.3.1] Verify that the mixed-strategy outcome $(x, y)$ with $x = (1/2, 1/2)^T$, $y = (1/2, 1/2)^T$ is a Nash equilibrium of the guessing game.
Solution. The model of the "guessing game" is:
The pure strategy set is $S = \{1, 2\}$ (that is, {show one finger, show two fingers}), and the pure strategy set $T = \{1, 2\}$. By (8.3.4), (8.3.5), (8.3.6) and (8.3.7):
Therefore the following inequalities hold:
By (8.3.11) and (8.3.12), the mixed-strategy outcome $(x, y)$ is a Nash equilibrium of the "guessing game".
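Figure 8.3.1 is not reproduced here, so the following sketch assumes a matching-pennies-style payoff matrix for the guessing game: player Ⅰ wins 1 when the two finger counts match, and player Ⅱ wins 1 otherwise. These payoffs and the helper name `is_nash` are assumptions made for illustration. Under them, the check against pure-strategy deviations can be run directly:

```python
from fractions import Fraction as F

def is_nash(A, B, x, y):
    """True if neither player gains by deviating from (x, y) to any pure strategy."""
    n, m = len(x), len(y)
    u1_xy = sum(A[i][j] * x[i] * y[j] for i in range(n) for j in range(m))
    u2_xy = sum(B[i][j] * x[i] * y[j] for i in range(n) for j in range(m))
    # Player I: compare U1(x, y) with U1(s_i, y) for every pure strategy s_i.
    ok_1 = all(u1_xy >= sum(A[i][j] * y[j] for j in range(m)) for i in range(n))
    # Player II: compare U2(x, y) with U2(x, t_j) for every pure strategy t_j.
    ok_2 = all(u2_xy >= sum(B[i][j] * x[i] for i in range(n)) for j in range(m))
    return ok_1 and ok_2

# ASSUMED payoffs (not taken from Figure 8.3.1).
A = [[1, -1], [-1, 1]]   # player I
B = [[-1, 1], [1, -1]]   # player II
half = [F(1, 2), F(1, 2)]

print(is_nash(A, B, half, half))       # the mixed outcome of Example 8.3.1
print(is_nash(A, B, [1, 0], [1, 0]))   # a pure outcome, for comparison
```

With these assumed payoffs the mixed outcome passes the check, while the pure outcome does not, since player Ⅱ would prefer to switch.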
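(ii) Method for finding the Nash equilibrium of a "2-strategy game"
[Theorem 8.3.1]
If the pure strategy sets $S$ and $T$ of players Ⅰ and Ⅱ each contain 2 strategies:
$S = \{s_1, s_2\}$ and $T = \{t_1, t_2\}$
then a necessary and sufficient condition for a mixed-strategy outcome $(x, y)$ in which both players genuinely mix (all components of $x$ and $y$ are positive) to be a Nash equilibrium is:
$U_1(s_1, y) = U_1(s_2, y)$ (8.3.15)
$U_2(x, t_1) = U_2(x, t_2)$ (8.3.16)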
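Proof. Let $x = (p, 1-p)^T$ and $y = (q, 1-q)^T$.
Necessity first. Let $(x, y)$ be a Nash equilibrium. By (8.3.9):
$U_1(x, y) = p\,U_1(s_1, y) + (1-p)\,U_1(s_2, y)$
By the meaning of Nash equilibrium, the mixed strategy $x$ is player Ⅰ's optimal strategy given the prediction that player Ⅱ chooses $y$. Since $0 < p < 1$ and $U_1(x, y)$ is linear in $p$, optimality requires:
$\partial U_1(x, y) / \partial p = U_1(s_1, y) - U_1(s_2, y) = 0$
That is, $U_1(s_1, y) = U_1(s_2, y)$. In the same way, $U_2(x, t_1) = U_2(x, t_2)$.
Sufficiency next. Suppose $U_1(s_1, y) = U_1(s_2, y)$; then
$U_1(x, y) = p\,U_1(s_1, y) + (1-p)\,U_1(s_2, y) = U_1(s_1, y) = U_1(s_2, y)$
Similarly:
$U_2(x, y) = U_2(x, t_1) = U_2(x, t_2)$
Thus, by the definition of Nash equilibrium, (8.3.11) and (8.3.12), $(x, y)$ is a Nash equilibrium.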
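The indifference conditions of the theorem reduce to two linear equations, one in $q$ and one in $p$. A minimal solver is sketched below, assuming `A[i][j]` and `B[i][j]` are the payoffs of players Ⅰ and Ⅱ at the pure outcome $(s_i, t_j)$ with integer or rational entries. The function name `solve_2x2_mixed` and the example payoffs are hypothetical.

```python
from fractions import Fraction as F

def solve_2x2_mixed(A, B):
    """Solve U1(s1, y) = U1(s2, y) and U2(x, t1) = U2(x, t2) for a 2x2 game.

    Returns (p, q) with x = (p, 1 - p) and y = (q, 1 - q), or None when an
    equation has no unique solution or the solution is not a probability.
    """
    den_q = A[0][0] - A[0][1] - A[1][0] + A[1][1]
    den_p = B[0][0] - B[0][1] - B[1][0] + B[1][1]
    if den_q == 0 or den_p == 0:
        return None
    q = F(A[1][1] - A[0][1], den_q)   # q that makes player I indifferent
    p = F(B[1][1] - B[1][0], den_p)   # p that makes player II indifferent
    if not (0 <= p <= 1 and 0 <= q <= 1):
        return None
    return p, q

# Hypothetical coordination-type game (payoffs made up for this example).
A = [[2, 0], [0, 1]]
B = [[1, 0], [0, 2]]
print(solve_2x2_mixed(A, B))
```

Each player's mixing probability is determined by the other player's payoffs, because it is chosen to make the opponent indifferent.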
[Example 8.3.2] Find the Nash equilibrium of the game given in Figure 8.3.2.
Solution. By (8.3.4):
Solving the condition required by (8.3.15):
$2q - 1 = 0$
$\therefore q = 1/2$
By (8.3.5):
Solving the condition required by (8.3.16):
Thus the Nash equilibrium is obtained.
Two applications of mixed strategy Nash equilibrium
We introduce the "monitoring game" and the "joint investment game" to illustrate typical applications of mixed strategy Nash equilibrium.
(i) Monitoring game
1. The model of the monitoring game
An agent working for a client (the principal) has two strategies to choose from: work (W) and shirk (S). Suppose that working costs the agent $g$ and earns the wage $w$ paid by the client ($w > g$ is a reasonable assumption; otherwise the agent has no incentive to work). The client also has two pure strategies for monitoring the agent: inspect (I) and not inspect (N). Inspecting costs the client $h$, and in exchange reveals whether the agent is shirking. If the agent is found shirking, the wage is withheld as punishment. If the agent works and does not shirk, the client's assets increase in value by $v$ (obviously $v > w$). If this information is common knowledge, the two players are in a static game of complete information. We further assume $g > h > 0$, which focuses on the main case, ignores secondary ones, and keeps the discussion simple. The payoff matrix of this game is shown in Figure 8.3.3.
2. Find the Nash equilibrium of the game, and a reference value for the wage the client should pay the agent.
(1) Find the Nash equilibrium of the monitoring game.
First we try to find a pure strategy Nash equilibrium with the underline method; the result is shown in Figure 8.3.3. Clearly the monitoring game has no pure strategy Nash equilibrium. We therefore use Theorem 8.3.1 to find the mixed strategy Nash equilibrium.
Let $y = (q, 1-q)^T$, where $q$ is the probability that the client inspects. From $U_1(S, y) = U_1(W, y)$ we get:
$(1 - q) w = w - g$ (8.3.17)
In (8.3.17), the left side is the agent's expected payoff from shirking, and the right side is the agent's expected payoff from working. Thus (8.3.17) says that, in a Nash equilibrium, the client's mixed strategy $y$ must leave the agent indifferent between working and shirking, because the two yield the same expected payoff. Solving (8.3.17) gives:
$q = g/w$ (8.3.18)
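Next, let $x = (p, 1-p)^T$, where $p$ is the probability that the agent shirks. By (8.3.5):
$U_2(x, I) = -p h + (1 - p)(v - w - h)$ (8.3.19)
$U_2(x, N) = -p w + (1 - p)(v - w)$ (8.3.20)
From $U_2(x, I) = U_2(x, N)$:
$-p h + (1 - p)(v - w - h) = -p w + (1 - p)(v - w)$ (8.3.21)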
Similarly, (8.3.21) says that, in a Nash equilibrium, the agent's mixed strategy $x$ must leave the client indifferent between inspecting and not inspecting. Solving (8.3.21) gives:
$p = h/w$ (8.3.22)
To sum up, we obtain the mixed strategy solution of the monitoring game, namely its mixed strategy Nash equilibrium:
((h/w,1-h/w), (g/w,1-g/w))
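The two indifference conditions behind this equilibrium can be checked with concrete numbers. Figure 8.3.3 is not reproduced here, so the payoff tables below are an assumption read off the verbal description of the game, with $p$ taken as the probability that the agent shirks and $q$ as the probability that the client inspects. The parameter values are hypothetical and chosen only to satisfy $v > w > g > h > 0$.

```python
from fractions import Fraction as F

# Hypothetical parameter values with v > w > g > h > 0.
v, w, g, h = F(10), F(4), F(2), F(1)

# ASSUMED payoffs, keyed by (agent action, client action):
# agent actions S (shirk) / W (work); client actions I (inspect) / N (not inspect).
agent = {("S", "I"): 0, ("S", "N"): w, ("W", "I"): w - g, ("W", "N"): w - g}
client = {("S", "I"): -h, ("S", "N"): -w, ("W", "I"): v - w - h, ("W", "N"): v - w}

p = h / w   # probability that the agent shirks
q = g / w   # probability that the client inspects

def agent_payoff(action):
    """Agent's expected payoff from a pure action against the client's mix (q, 1 - q)."""
    return q * agent[(action, "I")] + (1 - q) * agent[(action, "N")]

def client_payoff(action):
    """Client's expected payoff from a pure action against the agent's mix (p, 1 - p)."""
    return p * client[("S", action)] + (1 - p) * client[("W", action)]

print(agent_payoff("S") == agent_payoff("W"))     # agent indifferent between S and W
print(client_payoff("I") == client_payoff("N"))   # client indifferent between I and N
```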
(2) Determine a reference value for the wage the client should pay the agent.
By (8.3.10), together with (8.3.19) and (8.3.20), the client's expected payoff is:
$U_2(x, y) = q\,U_2(x, I) + (1 - q)\,U_2(x, N)$ (8.3.23)
Substituting $p = h/w$ and $q = g/w$ into (8.3.23) gives the client's expected payoff in the Nash equilibrium: