Robust PCA Study Notes

Author: Rachel Zhang

1. RPCA Brief Introduction

1. Why use Robust PCA?
To solve problems where the data are corrupted by sparse, spike-like noise of high magnitude, instead of small Gaussian-distributed noise.

2. Main problem
Given C = A* + B*, where A* is a sparse spike-noise matrix and B* is a low-rank matrix, the aim is to recover B*. Here B* = U Σ V', with U ∈ R^{n×k}, Σ ∈ R^{k×k}, V ∈ R^{n×k}.

3. Difference from PCA
Both PCA and Robust PCA aim at matrix decomposition, but:
In PCA, M = L0 + N0, where L0 is a low-rank matrix and N0 is a small i.i.d. Gaussian noise matrix. PCA seeks the best rank-k estimate of L0 by minimizing ||M - L||_2 subject to rank(L) <= k. This problem can be solved by SVD.
In RPCA, M = L0 + S0, where L0 is a low-rank matrix and S0 is a sparse spike-noise matrix. We will derive the solution in the following sections.

2. Conditions for Correct Decomposition

4. Ill-posed problem
Without further assumptions the decomposition is ambiguous. Suppose the sparse matrix A* and B* = e_i e_j^T are a solution of the decomposition problem.
1) If B* is not only low-rank but also sparse, then A1 = A* + e_i e_j^T and B1 = 0 is another valid sparse-plus-low-rank decomposition. Therefore we need an appropriate notion of low rank that ensures B* is not too sparse. Conditions will be imposed later that require the spaces spanned by the singular vectors U and V (i.e., the row and column spaces of B*) to be "incoherent" with the standard basis.
2) Similarly, if A* is sparse as well as low-rank (e.g., the first column of A* is non-zero while all other columns are 0, so A* has rank 1 and is sparse), then A2 = 0, B2 = A* + B* is another valid decomposition (here rank(B2) <= rank(B*) + 1). Thus we need the restriction that the sparse matrix should not be low-rank, i.e., no row/column has too many non-zero entries (no dense rows/columns), to avoid such ambiguity.

5. Conditions for exact recovery/decomposition
If A* and B* are drawn from the following classes, we have exact recovery with high probability [1].
1) For the low-rank matrix B* --- random orthogonal model [Candès and Recht 2008]: a rank-k matrix B* with SVD B* = U Σ V' is constructed so that the singular vectors U, V ∈ R^{n×k} are drawn uniformly at random from the collection of rank-k partial isometries in R^{n×k}. The singular vectors in U and V need not be mutually independent, and no restriction is placed on the singular values.
2) For the sparse matrix A* --- random sparsity model: the matrix A* is such that support(A*) is chosen uniformly at random from the collection of all support sets of size m. No assumption is made about the values of A* at the locations specified by support(A*). (support(M) denotes the locations of the non-zero entries of M.)
The later work [2] improved on these conditions and yields the best known conditions.

3. Recovery Algorithms

6. Formulation
Consider the decomposition D = A + E, in which A is low-rank and the error E is sparse.
1) The intuitive proposal is

    min rank(A) + γ ||E||_0        (1)

However, (1) is non-convex and thus intractable (both terms are NP-hard to approximate).
2) Relax the L0-norm to the L1-norm and replace rank with the nuclear norm:

    min ||A||_* + λ ||E||_1,  subject to D = A + E,  where ||A||_* = Σ_i σ_i(A)        (2)

This is convex, i.e., there exists a unique minimizer. Reason: the relaxation is motivated by observing that ||A||_* + λ ||E||_1 is the convex envelope of rank(A) + γ ||E||_0 over the set of (A, E) such that max(||A||_{2,2}, ||E||_{1,∞}) <= 1. Moreover, there are circumstances under which (2) perfectly recovers the low-rank matrix A0; [3] shows this is indeed true under surprisingly broad conditions. A small worked example of solving (2) directly is sketched below.
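The following minimal sketch solves program (2) directly with a generic convex solver. It is illustrative only: the use of cvxpy and numpy, the synthetic matrices L_true and S_true, and the problem sizes are assumptions of this example, not part of the original notes; λ = 1/√n is the value suggested in [2].

```python
# A minimal sketch: solving the convex program (2) with cvxpy (assumed
# available) on a small synthetic D = low-rank + sparse matrix.
import numpy as np
import cvxpy as cp

rng = np.random.default_rng(0)
n, k = 20, 2
L_true = rng.standard_normal((n, k)) @ rng.standard_normal((k, n))  # low-rank part
S_true = np.zeros((n, n))
idx = rng.choice(n * n, size=30, replace=False)   # random sparse support
S_true.flat[idx] = 10 * rng.standard_normal(30)   # large-magnitude spikes
D = L_true + S_true

lam = 1.0 / np.sqrt(n)      # the lambda suggested in [2]
A = cp.Variable((n, n))     # low-rank component of (2)
E = cp.Variable((n, n))     # sparse component of (2)
cp.Problem(cp.Minimize(cp.normNuc(A) + lam * cp.sum(cp.abs(E))),
           [A + E == D]).solve()

rel_err = np.linalg.norm(A.value - L_true, "fro") / np.linalg.norm(L_true, "fro")
print("relative recovery error for the low-rank part:", rel_err)
```

A generic conic solver like this scales poorly with matrix size, which is why the first-order and ALM methods of the next section are the practical choice.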
7. RPCA Optimization Algorithms

We approach the optimization in two different ways.

The 1st approach uses a first-order method to solve the primal problem directly (e.g., Proximal Gradient or Accelerated Proximal Gradient (APG)); the computational bottleneck of each iteration is an SVD computation.

The 2nd approach is to formulate and solve the dual problem, and retrieve the primal solution from the dual optimal solution. The dual problem of RPCA can be written as

    max_Y trace(D^T Y),  subject to J(Y) <= 1

where J(Y) = max(||Y||_2, λ^{-1} ||Y||_∞). Here ||Y||_2 denotes the spectral norm and ||Y||_∞ the largest absolute value among the entries of the matrix. This dual problem can be solved by constrained steepest ascent.

Now let's talk about the Augmented Lagrange Multiplier (ALM) method and the Alternating Directions Method (ADM) [2, 4].

7.1 General method of ALM
For the optimization problem

    min f(X),  subject to h(X) = 0        (3)

we can define the Lagrange function

    L(X, Y, μ) = f(X) + <Y, h(X)> + (μ/2) ||h(X)||_F^2        (4)

where Y is a Lagrange multiplier matrix and μ is a positive scalar. A generic Lagrange multiplier algorithm would solve PCP (Principal Component Pursuit) by repeatedly setting

    X_k = arg min_X L(X, Y_k, μ)

and then updating the Lagrange multiplier matrix via

    Y_{k+1} = Y_k + μ h(X_k)

7.2 ALM algorithm for RPCA
In RPCA we define

    X = (A, E),  f(X) = ||A||_* + λ ||E||_1,  h(X) = D - A - E        (5)

Then the Lagrange function is

    L(A, E, Y, μ) = ||A||_* + λ ||E||_1 + <Y, D - A - E> + (μ/2) ||D - A - E||_F^2        (6)

The optimization flow is just like the general ALM method. The initialization Y = Y0* is inspired by the dual problem, as it is likely to make the objective function value <D, Y0*> reasonably large.

Theorem 1. For Algorithm 4 of [5], any accumulation point (A*, E*) of (A_k*, E_k*) is an optimal solution to the RPCA problem, and the convergence rate is at least O(μ_k^{-1}). [5]

This RPCA algorithm adopts an alternating iteration strategy: it optimizes one variable while fixing the other. As the optimization steps might otherwise be confusing, we note the two well-known facts they rely on:

    S_ε[W] = arg min_X  ε ||X||_1 + (1/2) ||X - W||_F^2        (7)
    U S_ε[Σ] V^T = arg min_X  ε ||X||_* + (1/2) ||X - W||_F^2,  where U Σ V^T is the SVD of W        (8)

Here S_ε[·] is the elementwise soft-thresholding (shrinkage) operator, S_ε[w] = sgn(w) · max(|w| - ε, 0).

Now we apply (7) and (8) to the RPCA problem. For the objective function (6), to get the optimal E we rewrite the objective, dropping the terms that do not depend on E:

    f(E) = λ ||E||_1 + <Y, D - A - E> + (μ/2) ||D - A - E||_F^2
         = λ ||E||_1 + (μ/2) ( 2 <μ^{-1} Y, D - A - E> + ||D - A - E||_F^2 + ||μ^{-1} Y||_F^2 )    // add a term irrelevant w.r.t. E to complete the square
         = λ ||E||_1 + (μ/2) ||E - (D - A + μ^{-1} Y)||_F^2

Dividing by μ puts this in the form of (7), so the optimization step for E is

    E = S_{λ/μ}[D - A + μ^{-1} Y]

as in Algorithm 4 (the + sign on μ^{-1} Y follows from completing the square with <Y, D - A - E>). Similarly, using (8), the update for A is

    A = U S_{1/μ}[Σ] V^T,  where U Σ V^T is the SVD of D - E + μ^{-1} Y

A compact implementation of this iteration is sketched below.
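Putting the two update rules together, here is a minimal numpy sketch of the ALM iteration derived above. It is an illustrative implementation, not the authors' reference code: the initialization of Y and μ, the growth factor rho, and the stopping rule follow common practice from [5] and are assumptions of this sketch.

```python
# A minimal numpy sketch of the ALM iteration for RPCA derived in 7.2.
import numpy as np

def soft_threshold(W, eps):
    """S_eps[W], the elementwise shrinkage operator from (7)."""
    return np.sign(W) * np.maximum(np.abs(W) - eps, 0.0)

def rpca_alm(D, lam=None, rho=1.5, tol=1e-7, max_iter=500):
    """Decompose D into low-rank A plus sparse E via the ALM iteration."""
    m, n = D.shape
    if lam is None:
        lam = 1.0 / np.sqrt(max(m, n))            # lambda suggested in [2]
    norm2 = np.linalg.norm(D, 2)                  # spectral norm of D
    Y = D / max(norm2, np.abs(D).max() / lam)     # dual-inspired Y0, cf. 7.2
    mu = 1.25 / norm2                             # common starting penalty
    A = np.zeros_like(D)
    E = np.zeros_like(D)
    for _ in range(max_iter):
        # A-update: singular value thresholding, fact (8)
        U, s, Vt = np.linalg.svd(D - E + Y / mu, full_matrices=False)
        A = (U * soft_threshold(s, 1.0 / mu)) @ Vt
        # E-update: elementwise shrinkage, fact (7)
        E = soft_threshold(D - A + Y / mu, lam / mu)
        # multiplier update Y <- Y + mu * h(X), then grow the penalty
        R = D - A - E
        Y = Y + mu * R
        mu *= rho
        if np.linalg.norm(R, "fro") <= tol * np.linalg.norm(D, "fro"):
            break
    return A, E
```

On inputs like the synthetic D from the earlier sketch, this typically recovers the low-rank and sparse parts to high accuracy in a few dozen iterations; each iteration's cost is dominated by one SVD, the same bottleneck noted for the first-order primal methods.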
8. References
1) E. J. Candès and B. Recht. Exact Matrix Completion via Convex Optimization. Submitted for publication, 2008.
2) E. J. Candès, X. Li, Y. Ma, and J. Wright. Robust Principal Component Analysis? Submitted for publication, 2009.
3) J. Wright, A. Ganesh, S. Rao, Y. Peng, and Y. Ma. Robust Principal Component Analysis: Exact Recovery of Corrupted Low-Rank Matrices via Convex Optimization. In NIPS, 2009.
4) X. Yuan and J. Yang. Sparse and Low-Rank Matrix Decomposition via Alternating Direction Methods. Preprint, 2009.
5) Z. Lin, M. Chen, L. Wu, and Y. Ma. The Augmented Lagrange Multiplier Method for Exact Recovery of Corrupted Low-Rank Matrices. Mathematical Programming, submitted, 2009.
6) Generalized Power Method for Sparse Principal Component Analysis.
