Compressive (or Compressed) Sensing
"Detección Comprimida" (Spanish for "Compressed Sensing")
Javier Peña, Carnegie Mellon University
UN Encuentro de Matemáticas, Universidad Nacional, July 2012
Materials available at http://andrew.cmu.edu/user/jfp/UNencuentro

Compressive Sensing, Lecture 1

Consider the following two pictures. One of them is a raw 2.7MB jpg file. The other is a compressed 300KB jpg version. Can you tell them apart?

Wavelets and images
Compressibility corresponds to sparsity in a suitable basis.
[Figure: a 1-megapixel image and its wavelet coefficients sorted by magnitude; on a log10 scale the sorted coefficients decay rapidly.]

Compression
• Take the original 1-megapixel image
• Compute all 1 million wavelet coefficients
• Keep only the 25K largest and set the others to zero
• Invert the wavelet transform
[Figure: the original 1-megapixel image next to its 25K-term wavelet approximation; the two are visually indistinguishable.]

Going against a long-established tradition?
Conventional signal acquisition paradigm:
• Acquire/sample N values (A-to-D converter, digital camera)
• Compress to M ≪ N values via a sparse wavelet transform (signal dependent, nonlinear)
• Transmit/store the M values
• Receive and decompress back to N values

Fundamental question
Can we directly acquire just the useful part of the signal? If the signal is compressible, can it be acquired efficiently?

Compressive sensing
• Term coined by Donoho
• Body of theory and algorithms for sparse signal acquisition and recovery
• Seminal papers:
  • Candès, Romberg and Tao (2006)
  • Candès and Tao (2006)
  • Donoho (2006)
• Hot area of research spanning information theory, signal processing, statistics, mathematics, etc.
• Applications where measurements are
  • slow or costly (MRI)
  • missing or wasteful
  • beyond other capabilities such as memory

Plan
• Introduction to compressive sensing, underdetermined systems of equations, ℓ1 minimization
• Main theoretical and computational techniques
• Matrix completion, underdetermined linear matrix equations, nuclear norm minimization

Underdetermined systems of equations
Problem: Recover a signal x̄ ∈ R^n from m ≪ n linear measurements
  b_k = ⟨a_k, x̄⟩, k = 1, …, m, that is, b = Ax̄.
• In general this is impossible.
• Suppose we know that x̄ is sparse. Does that help?
Example: Suppose only one component of x̄ is different from zero. Can we get by with fewer than n measurements?

Possible approach to recover sparse x̄
Take m ≪ n measurements b = Ax̄ and then solve
  min ‖x‖_0 subject to Ax = b.
Here ‖·‖_0 stands for the ℓ0 quasi-norm: ‖x‖_0 = |{i : x_i ≠ 0}|.
Relevant questions:
• Does this work (provided we take enough measurements)?
• Suppose x̄ is s-sparse, i.e., ‖x̄‖_0 = s. How many measurements suffice?
• How hard is it to solve the above ℓ0-minimization problem?

ℓ0 versus ℓ1 optimization
  min ‖x‖_0 subject to Ax = b   (computationally hard)
  min ‖x‖_1 subject to Ax = b   (computationally tractable: a linear program)
• The ℓ1 norm is the convex envelope of the ℓ0 quasi-norm.
• The ℓ1 minimization problem is a convex relaxation of the ℓ0 minimization problem.
• In many cases the ℓ1 minimization problem yields the same solution as the ℓ0 problem. (A sketch of the ℓ1 problem as a linear program follows below.)
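The ℓ1 problem can be written as a linear program by splitting each absolute value into an auxiliary variable. Below is a minimal sketch (ours, not from the slides), assuming NumPy and SciPy are available; the function name l1_min is our own choice.

import numpy as np
from scipy.optimize import linprog

def l1_min(A, b):
    # Solve min ||x||_1 subject to Ax = b as a linear program:
    # min sum(t) over z = [x; t] with -t <= x <= t and Ax = b.
    m, n = A.shape
    c = np.concatenate([np.zeros(n), np.ones(n)])   # objective: sum of t
    A_eq = np.hstack([A, np.zeros((m, n))])         # equality Ax = b; t unconstrained here
    I = np.eye(n)
    A_ub = np.vstack([np.hstack([I, -I]),           # x - t <= 0
                      np.hstack([-I, -I])])         # -x - t <= 0
    b_ub = np.zeros(2 * n)
    bounds = [(None, None)] * n + [(0, None)] * n   # x free, t >= 0
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b,
                  bounds=bounds, method="highs")
    return res.x[:n]

The reformulation is exact: at an optimum each t_i equals |x_i|, so the LP objective equals ‖x‖_1.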
A bit of history of ℓ1 optimization
ℓ1 minimization often finds the "right" answer:
• Seismology
• Lasso regression
• Bandlimited deconvolution
• Total variation (TV) denoising
• Basis pursuit

Acquiring a sparse signal
Suppose x̄ ∈ R^n is s-sparse.
• Take m random and nonadaptive measurements b_k = ⟨a_k, x̄⟩, k = 1, …, m, that is, b = Ax̄.
• Try to reconstruct x̄ via ℓ1 minimization.
First fundamental result: If m ≳ s·log n and the a_k are suitably chosen, then the recovery is exact.

Nonadaptive sensing of compressible signals
Classical approach:
• Measure the full signal x̄ (all coefficients)
• Store the s largest coefficients
• Distortion: ‖x̄ − x̄_s‖_2
Compressive sensing:
• Take m random measurements
• Reconstruct x̂ via ℓ1 minimization
Second fundamental result: If m ≳ s·log n then ‖x̂ − x̄‖_2 ≲ ‖x̄ − x̄_s‖_2.

Optimality of compressive sensing
It is not possible to do better
• with fewer measurements
• with other reconstruction algorithms
Key features of compressive sensing:
• Obtain compressible signals from few sensors
• Sensing is nonadaptive: no knowledge about the signal
• Simple acquisition followed by an ℓ1 decoder

Next: formal statements
• Probabilistic approach
  • Isotropy and incoherence
  • Incoherent sampling theorems
• Deterministic approach
  • Restricted isometry property
  • Signal recovery theorem
• Robustness to noise
• Optimality

Probabilistic approach: random sensing
Acquire x̄ ∈ R^n by measuring b_k = ⟨a_k, x̄⟩, k = 1, …, m, that is, b = Ax̄, where the a_k are iid F for some distribution F on C^n.
Two key properties:
• Isotropy: E(aa*) = I
• Coherence measure µ(F): the smallest number such that
  max_{i=1,…,n} |⟨a, e_i⟩|^2 ≤ µ(F)
  with high probability.
We want low coherence µ(F).

Coherence
Observe:
• E(aa*) = I implies µ(F) ≥ 1.
• We would like µ(F) to be as close as possible to 1 (incoherent sensing).
Examples of isotropic incoherent sensing:
• Gaussian sensing
• Binary sensing
• Partial Fourier transform
Notation: for a ∈ C^n and i = 1, …, n, write a[i] := ⟨a, e_i⟩.

Isotropic incoherent sensing
Gaussian sensing: a ∼ N(0, I), that is, a[1], …, a[n] are iid N(0, 1). In this case µ(F) = log n.
Binary sensing: a[1], …, a[n] are iid with distribution P(a[i] = ±1) = 1/2. In this case µ(F) = 1.
Partial discrete Fourier transform:
• Select k uniformly at random in {0, 1, …, n − 1}
• Set a[t] := e^{i2πkt/n}, t = 0, 1, …, n − 1.
In this case µ(F) = 1.

An example of coherent sensing
Sample random components of x:
• Select j uniformly at random in {1, …, n}
• Set a = √n·e_j.
In this case E(aa*) = I and max_{i=1,…,n} |a[i]|^2 = n.
For this type of sensing, how many samples do we need to recover a 1-sparse vector with high probability?

Incoherent sampling theorems
Compressive sampling approach:
• Measure b_k = ⟨a_k, x̄⟩, k = 1, …, m, that is, b = Ax̄
• Perform ℓ1 recovery: x̂ := argmin_x {‖x‖_1 : Ax = b}
Theorem (Candès & Plan). Assume x̄ is s-sparse. Then recovery is exact with probability at least 1 − 5/n − e^{−β} provided that
  m ≥ C_0·(1 + β)·µ(F)·s·log n
for some constant C_0.

Related predecessors
Theorem (Candès & Tao). Assume x̄ is s-sparse and sample m Fourier coefficients selected at random. Then recovery is exact with probability at least 1 − O(n^{−β}) provided that m ≥ C_β·s·log n for some constant C_β that depends only on the desired accuracy β. This result is optimal: any reliable recovery method would require at least s·log n samples.
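A quick numerical illustration of the sampling theorems above (ours, not from the slides): draw isotropic Gaussian measurements of an s-sparse vector with m on the order of s·log n, and check that ℓ1 recovery is exact. It reuses the l1_min solver sketched earlier; the constant 5 in the choice of m and the problem sizes are arbitrary illustrative choices.

import numpy as np
# assumes l1_min from the earlier linear-programming sketch is in scope

rng = np.random.default_rng(0)
n, s = 200, 4
m = int(5 * s * np.log(n))            # m on the order of s*log n

x_bar = np.zeros(n)                   # s-sparse signal with random support
support = rng.choice(n, size=s, replace=False)
x_bar[support] = rng.standard_normal(s)

A = rng.standard_normal((m, n))       # Gaussian sensing: rows iid N(0, I)
b = A @ x_bar                         # m nonadaptive linear measurements

x_hat = l1_min(A, b)                  # l1 recovery
print(np.max(np.abs(x_hat - x_bar)))  # should be ~1e-8: exact up to solver tolerance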
A related exact-recovery result:
Theorem (Candès & Tao). Assume x̄ ∈ R^n is s-sparse, n is prime, and we sample m Fourier coefficients. Then x̄ can be reconstructed from the m samples if m ≥ 2s.

Sampling of non-sparse signals
Given x ∈ R^n and s < n, define x_s := argmin{‖z − x‖_2 : ‖z‖_0 ≤ s}.
Theorem (Candès & Plan). Let x ∈ R^n, β > 0 and s̄ be such that m ≥ C_β·s̄·log n. Then with probability at least 1 − 6/n − 6e^{−β} the ℓ1 solution x̂ satisfies
  ‖x̂ − x‖_1 ≤ min_{1≤s≤s̄} C·(1 + α)·‖x − x_s‖_1
for some constant C, where
  α = √( (1 + β)·s·µ(F)·log n·log m·log^2 s / m ).

Deterministic approach: restricted isometry
Restricted isometry property (RIP). Given A ∈ R^{m×n} and k ∈ {1, …, m}, the k-isometry constant δ_k is the smallest δ ≥ 0 such that
  (1 − δ)·‖x‖_2^2 ≤ ‖Ax‖_2^2 ≤ (1 + δ)·‖x‖_2^2
for all k-sparse x ∈ R^n. If δ_k < 1, we say that A satisfies the RIP with constant δ_k.

Signal recovery with a RIP matrix
Compressive sampling approach:
• Measure b_k = ⟨a_k, x̄⟩, k = 1, …, m, that is, b = Ax̄
• Perform ℓ1 recovery: x̂ := argmin_x {‖x‖_1 : Ax = b}
Theorem (Candès, Romberg, Tao). Assume δ_2s ≤ √2 − 1. Then the solution x̂ satisfies
  ‖x̂ − x̄‖_1 ≤ C·‖x̄ − x̄_s‖_1 and ‖x̂ − x̄‖_2 ≤ C·‖x̄ − x̄_s‖_1/√s
for some constant C.

Matrices that satisfy RIP
With high probability an m × n matrix A satisfies the RIP in the following cases:
• m ≳ s·log(n/s) and A is Gaussian
• m ≳ s·log(n/s) and A is binary
• m ≳ s·log^4 n and A is a partial DFT
• m ≳ µ(F)·s·log^4 n and the rows of A are iid F.

Sampling with noise
• Measurements in real life are generally noisy.
• A more appropriate model: b = Ax̄ + z, with noise term z ∼ N(0, σ^2·I).
• Assume all columns of A have Euclidean norm equal to one.
• Modify the ℓ1 minimization to account for noise (a numerical sketch appears before the references):
Lasso: min (1/2)·‖b − Ax‖_2^2 + λ·σ·‖x‖_1
Dantzig selector: min ‖x‖_1 subject to ‖A*(b − Ax)‖_∞ ≤ λ·σ

Noise-aware recovery (random sensing)
Theorem (Candès & Plan). Let x̄ ∈ R^n, β > 0 and s̄ be such that m ≥ C_β·s̄·log n. Then with probability at least 1 − 6/n − 6e^{−β} the solution x̂ to the Lasso or the Dantzig selector with λ = 10·√(log n) satisfies
  ‖x̂ − x̄‖_1 ≤ min_{1≤s≤s̄} C·(1 + α^2)·( ‖x̄ − x̄_s‖_1 + s·σ·√(log n / m) )
for some constant C.

Optimality of compressive sensing
Back to Gaussian sensing: m Gaussian measurements and ℓ1 decoding give
  ‖x̂ − x‖_2 ≲ ‖x − x_s‖_1/√s, with s ≈ m/log(n/m).
Question: Can we do better with other measurements or other algorithms?

Signals with power law
[Figure: sorted wavelet coefficients of an image on a log-log scale, exhibiting power-law decay.]
Power law decay: |x|_(1) ≥ |x|_(2) ≥ ··· ≥ |x|_(n), with |x|_(k) ≤ C·k^{−1/p}.
Model: the ℓp ball B_p := {x : ‖x‖_p ≤ 1}. We discuss the case p = 1, but the same discussion applies to 0 ≤ p ≤ 1.

Recovery of the ℓ1 ball B via Gaussian sensing
• Suppose the unknown vector is in B.
• Take m Gaussian measurements; then
  ‖x̂ − x‖_2 ≲ √( (log(n/m) + 1)/m ).
Ideal sensing: the best we can hope for from m linear measurements is
  E_m(B) = inf_{F,D} sup_{x∈B} ‖x − D(F(x))‖,
where F ranges over linear measurement maps R^n → R^m and D over decoders.

Gelfand widths
Theorem (Donoho). d_m(B) ≤ E_m(B) ≤ C·d_m(B), where d_m(B) is the m-width of B:
  d_m(B) := inf_V { sup_{x∈B} ‖P_V x‖_2 : codim(V) < m }.
Theorem (Kashin, Garnaev–Gluskin). For the ℓ1 ball,
  C_1·√( (log(n/m) + 1)/m ) ≤ d_m(B) ≤ C_2·√( (log(n/m) + 1)/m ).
Compressive sensing achieves the limits of performance.
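Before the references, here is the promised numerical sketch of noise-aware recovery (ours, not from the slides). It minimizes the Lasso objective above by iterative soft-thresholding (ISTA), a simple proximal-gradient method; the step size, iteration count, and the practical choice λ = √(2·log n) (the theorem above uses λ = 10·√(log n)) are illustrative assumptions.

import numpy as np

def lasso_ista(A, b, penalty, n_iter=3000):
    # Minimize (1/2)*||b - Ax||_2^2 + penalty*||x||_1 by ISTA.
    L = np.linalg.norm(A, 2) ** 2          # Lipschitz constant of the gradient
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        grad = A.T @ (A @ x - b)           # gradient of the smooth term
        z = x - grad / L                   # gradient step
        x = np.sign(z) * np.maximum(np.abs(z) - penalty / L, 0.0)  # soft-threshold
    return x

rng = np.random.default_rng(1)
n, s, m, sigma = 200, 4, 120, 0.05
A = rng.standard_normal((m, n))
A /= np.linalg.norm(A, axis=0)             # unit-norm columns, as assumed above
x_bar = np.zeros(n)
x_bar[rng.choice(n, size=s, replace=False)] = rng.standard_normal(s)
b = A @ x_bar + sigma * rng.standard_normal(m)   # noisy measurements
lam = np.sqrt(2 * np.log(n))
x_hat = lasso_ista(A, b, penalty=lam * sigma)
print(np.linalg.norm(x_hat - x_bar))       # should be small: recovery is stable to noise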
References for today's material
• Slides for this minicourse: http://andrew.cmu.edu/user/jfp/UNencuentro
• E. Candès & Y. Plan, "A probabilistic and RIPless theory of compressed sensing," IEEE Trans. Inf. Theory, vol. 57, no. 11, pp. 7235–7254, Nov. 2011.
• E. Candès, J. Romberg, & T. Tao, "Robust uncertainty principles: Exact signal reconstruction from highly incomplete frequency information," IEEE Trans. Inf. Theory, vol. 52, no. 2, pp. 489–509, Feb. 2006.
• E. Candès & T. Tao, "Decoding by linear programming," IEEE Trans. Inf. Theory, vol. 51, no. 12, pp. 4203–4215, Dec. 2005.
• D. Donoho, "Compressed sensing," IEEE Trans. Inf. Theory, vol. 52, no. 4, pp. 1289–1306, April 2006.
• Papers on compressive sensing: http://dsp.rice.edu/cs