1. The final SVM optimization problem (dual form)
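The formula image from the original post did not survive extraction. As a reference, the standard dual problem that the code below works with (written in minimization form, consistent with the gradient computed in section 3) is:

```latex
\min_{\alpha}\ W(\alpha)=\frac{1}{2}\sum_{i=1}^{m}\sum_{j=1}^{m}\alpha_i\alpha_j\,y_i y_j\,K(x_i,x_j)-\sum_{i=1}^{m}\alpha_i
\quad\text{s.t.}\quad \sum_{i=1}^{m}y_i\alpha_i=0,\qquad 0\le\alpha_i\le C.
```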
2. Caching of kernel functions
Because the kernel matrix is symmetric, only its lower triangle needs to be stored, so the memory footprint can be reduced to m(m+1)/2 entries.
The mapping from (row, column) to the linear index is:
#define OFFSET(x, y) ((x) > (y) ? ((((x) + 1) * (x)) >> 1) + (y) : ((((y) + 1) * (y)) >> 1) + (x))
// ...
for (unsigned i = 0; i < count; ++i)
    for (unsigned j = 0; j <= i; ++j)
        cache[OFFSET(i, j)] = y[i] * y[j] * kernel(x[i], x[j], dimision);
// ...
3. Computing the gradient
Treating the α values as the variables, differentiate the dual objective with respect to each α_i; the resulting gradient is then used to choose which α values to optimize.
The gradient is G_i = -1 + Σ_j y_i·y_j·K(x_i, x_j)·α_j, computed as:
for (unsigned i = 0; i < count; ++i)
{
    gradient[i] = -1;
    for (unsigned j = 0; j < count; ++j)
        gradient[i] += cache[OFFSET(i, j)] * alpha[j];
}
Since the goal is to optimize W: when an α value is to be decreased, the larger its gradient g the better; conversely, when an α value is to be increased, the smaller g the better.
4. Constraints of sequential minimal optimization (SMO)
Each step selects two α values to optimize and treats all the other α values as constants. The equality constraint Σ y_i·α_i = 0 then ties the two selected values together: y[x0]·α[x0] + y[x1]·α[x1] must stay constant across the update.
After the optimization step, both values must in addition remain inside [0, C].
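The formula image is missing here; a reconstruction of the pairwise constraint, using λ for the label product as in the text below:

```latex
y_{x_0}\alpha_{x_0}+y_{x_1}\alpha_{x_1}=\zeta\ (\text{constant})
\quad\Longleftrightarrow\quad
\alpha_{x_0}+\lambda\,\alpha_{x_1}=\text{const},\qquad
\lambda=y_{x_0}y_{x_1}\in\{-1,+1\}.
```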
5. Selection rules
Because each α value lies in the interval [0, C], an α may only be increased while it is below C, and may only be decreased while it is above 0.
If the two selected samples carry different labels, i.e. λ = y[x0]·y[x1] = -1, the equality constraint forces the two α values to increase and decrease together.
If the two selected samples carry the same label, i.e. λ = 1, then one α must increase while the other decreases.
Combining these bounds with the gradient criterion from section 3 yields the selection rule implemented below:
choose x0 among the samples with (α_i < C and y_i = +1) or (α_i > 0 and y_i = -1), taking the one that maximizes -y_i·G_i;
choose x1 among the samples with (α_i < C and y_i = -1) or (α_i > 0 and y_i = +1), taking the one that minimizes -y_i·G_i.
(For simplicity, note that the two conditions differ only in how the signs of G and y are paired.)
unsigned x0 = 0, x1 = 1;
// select the pair of alpha values to optimize, based on the gradient
{
    double gmax = -DBL_MAX, gmin = DBL_MAX;
    for (unsigned i = 0; i < count; ++i)
    {
        if ((alpha[i] < C && y[i] == POS || alpha[i] > 0 && y[i] == NEG) && -y[i] * gradient[i] > gmax)
        {
            gmax = -y[i] * gradient[i];
            x0 = i;
        }
        else if ((alpha[i] < C && y[i] == NEG || alpha[i] > 0 && y[i] == POS) && -y[i] * gradient[i] < gmin)
        {
            gmin = -y[i] * gradient[i];
            x1 = i;
        }
    }
}
6. Solving the subproblem
Any α value that would leave the interval [0, C] after the unconstrained step must be clipped back into it; the adjustment rules are as follows.
There are two cases. If λ = -1, i.e. y[x0] ≠ y[x1], the constraint gives α[x0] - α[x1] = constant, so both values move by the same step δ.
Substituting into the objective yields the update implemented below.
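In terms of the cached values Q_ij = y_i·y_j·K(x_i, x_j) and the gradient G, the step for this case is (a reconstruction of the formula lost from the original, matching the code below):

```latex
\delta=\frac{-G_{x_0}-G_{x_1}}{Q_{x_0 x_0}+Q_{x_1 x_1}+2Q_{x_0 x_1}},\qquad
\alpha_{x_0}\leftarrow\alpha_{x_0}+\delta,\qquad
\alpha_{x_1}\leftarrow\alpha_{x_1}+\delta,
```

after which both values are clipped back into [0, C].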
if (y[x0] != y[x1])
{
    double coef = cache[OFFSET(x0, x0)] + cache[OFFSET(x1, x1)] + 2 * cache[OFFSET(x0, x1)];
    if (coef <= 0) coef = DBL_MIN;
    double delta = (-gradient[x0] - gradient[x1]) / coef;
    double diff = alpha[x0] - alpha[x1];
    alpha[x0] += delta;
    alpha[x1] += delta;
    unsigned max = x0, min = x1;
    if (diff < 0)
    {
        max = x1;
        min = x0;
        diff = -diff;
    }
    if (alpha[max] > C)
    {
        alpha[max] = C;
        alpha[min] = C - diff;
    }
    if (alpha[min] < 0)
    {
        alpha[min] = 0;
        alpha[max] = diff;
    }
}
If λ = 1, i.e. y[x0] = y[x1], then α[x0] + α[x1] stays constant, so the two values move by opposite steps:
else
{
    double coef = cache[OFFSET(x0, x0)] + cache[OFFSET(x1, x1)] - 2 * cache[OFFSET(x0, x1)];
    if (coef <= 0) coef = DBL_MIN;
    double delta = (-gradient[x0] + gradient[x1]) / coef;
    double sum = alpha[x0] + alpha[x1];
    alpha[x0] += delta;
    alpha[x1] -= delta;
    unsigned max = x0, min = x1;
    if (alpha[x0] < alpha[x1])
    {
        max = x1;
        min = x0;
    }
    if (alpha[max] > C)
    {
        alpha[max] = C;
        alpha[min] = sum - C;
    }
    if (alpha[min] < 0)
    {
        alpha[min] = 0;
        alpha[max] = sum;
    }
}
The gradient is then updated incrementally; delta0 and delta1 denote the changes just applied to alpha[x0] and alpha[x1]:

for (unsigned i = 0; i < count; ++i)
    gradient[i] += cache[OFFSET(i, x0)] * delta0 + cache[OFFSET(i, x1)] * delta1;
7. Computing the weights and the bias
The weight vector follows from the support vectors (for a linear kernel, w = Σ_i α_i·y_i·x_i); the bias is then placed midway between the smallest positive and the largest negative decision value:

double maxneg = -DBL_MAX, minpos = DBL_MAX;
SVM *svm = &bundle->svm;
for (unsigned i = 0; i < count; ++i)
{
    double wx = kernel(svm->weight, data[i], dimision);
    if (y[i] == POS && minpos > wx)
        minpos = wx;
    else if (y[i] == NEG && maxneg < wx)
        maxneg = wx;
}
svm->bias = -(minpos + maxneg) / 2;
Code Address: http://git.oschina.net/fanwenjie/SVM-iris/
Process analysis of SMO algorithm for support vector machine