Smoothing/Natural Splines and Penalized Likelihood
Regression data: (xi, Yi), a < x1 < · · · < xn < b
Yi = µ(xi) + εi,   var(εi) = σ²
min over smooth µ:   (1/n) Σ (Yi − µ(xi))² + λ P(µ)
where λ > 0 is a smoothing parameter and P is a roughness penalty.
EXAMPLE (smooth.spline). Minimize
(1/n) Σ (Yi − µ(xi))² + λ ∫ (µ″)²
over all µ with µ′ absolutely continuous and ∫ (µ″)² < ∞.
(µ′ absolutely continuous means µ″ exists almost everywhere and
∫ₛᵗ µ″(x) dx = µ′(t) − µ′(s) for all s < t.)
FACTS FOR THIS EXAMPLE:
• lines are reproduced: Yi = a + bxi ⇒ µ̂(x) = a + bx
• As λ → ∞, µ̂ converges to the least squares line.
• If λ = 0, µ̂ interpolates the data.
• µ̂ is piecewise cubic, with joins at the data points. µ̂ has a continuous
second derivative and µ̂00(x) = 0 for x ∈ [a, x1) ∪ (xn, b].
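These facts are easy to check numerically. A minimal sketch, assuming SciPy ≥ 1.10 is available (its `make_smoothing_spline` fits a cubic smoothing spline under this same second-derivative penalty), verifying that lines are reproduced for any λ:

```python
import numpy as np
from scipy.interpolate import make_smoothing_spline

# Exactly linear data: Yi = a + b*xi with no noise.
x = np.linspace(0.0, 1.0, 20)
y = 2.0 + 3.0 * x

# Fit with a strictly positive smoothing parameter lam (the lambda above).
spl = make_smoothing_spline(x, y, lam=1.0)

# A line has zero second derivative, so it incurs no penalty and is
# recovered exactly, whatever lambda is.
fit = spl(x)
print(np.max(np.abs(fit - y)))  # essentially zero
```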
EXAMPLE (smooth.Pspline). Minimize
(1/n) Σ (Yi − µ(xi))² + λ ∫ (µ^(m))²
over all µ with µ^(m−1) absolutely continuous and ∫ (µ^(m))² < ∞.
(µ^(m−1) absolutely continuous means µ^(m) exists almost everywhere and
∫ₛᵗ µ^(m)(x) dx = µ^(m−1)(t) − µ^(m−1)(s) for all s < t.)
FACTS FOR THIS EXAMPLE:
• degree (m − 1) polynomials are reproduced
• As λ → ∞, µ̂ converges to the least squares degree m − 1 polynomial.
• If λ = 0, µ̂ interpolates the data.
• µ̂ is piecewise polynomial of degree 2m − 1 with 2m − 2 continuous
derivatives, and µ̂(m)(x) = 0 for x ∈ [a, x1) ∪ (xn, b].
That is:
THEOREM. µ̂ = a natural spline of degree 2m−1 with knots at x1, . . . , xn
RECALL: a spline of degree 2m − 1 is a piecewise polynomial of degree
2m − 1 with 2m − 2 continuous derivatives.
DEFINITION of a natural spline
f is a natural spline of degree 2m − 1 with (interior) knots at x1, . . . , xn if
f is a spline of degree 2m − 1 and f (m) ≡ 0 on [a, x1) ∪ (xn, b].
DIMENSION OF NATURAL SPLINE SPACE is n
REFERENCES: Grace Wahba, Chong Gu, Bernard Silverman ...
IMPLEMENTED IN R:
smooth.spline (2nd derivative penalty)
smooth.Pspline (general integer mth derivative penalty)
R commands: smooth.spline, predict.smooth.spline, derivatives too
smooth.spline(x, y=NULL, w=NULL, df, spar=NULL, cv=FALSE,
all.knots=FALSE, nknots=NULL, df.offset = 0, penalty = 1,
control.spar=list())
predict(object, x, deriv = 0, ...)
Choose λ (spar) by CV, GCV (no plug-in)
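The GCV idea applies to any linear smoother µ̂ = SλY: minimize GCV(λ) = (1/n)||(I − Sλ)Y||² / (1 − tr(Sλ)/n)² over a grid of λ. A hypothetical numpy illustration with a simple discrete first-difference roughness penalty standing in for the spline penalty (not the banded algorithm inside smooth.spline):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100
x = np.linspace(0, 1, n)
y = np.sin(2 * np.pi * x) + 0.3 * rng.standard_normal(n)

# First-difference matrix D: (D mu)_i = mu_{i+1} - mu_i.
D = np.diff(np.eye(n), axis=0)
K = D.T @ D  # discrete roughness penalty matrix

def gcv(lam):
    """GCV score for the linear smoother S = (I + lam*K)^{-1}."""
    S = np.linalg.inv(np.eye(n) + lam * K)
    resid = y - S @ y
    return (resid @ resid / n) / (1 - np.trace(S) / n) ** 2

lams = 10.0 ** np.arange(-3, 4)
best = min(lams, key=gcv)
print("GCV-chosen lambda:", best)
```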
TO MINIMIZE (1/n) Σ (Yi − µ(xi))² + λ ∫ (µ^(m))² IN THEORY:
• old-fashioned: calculus or calculus of variations
• elegant: functional analysis with inner products, projections, Reproducing Kernel Hilbert Spaces (Chong Gu’s book, Grace Wahba’s book)
Heuristics of minimization when m = 1:
Minimize
(1/n) Σ (Yi − µ(xi))² + λ ∫ₐᵇ (µ′(x))² dx
(1) µ̂ is a line on (xi, xi+1) and µ̂ is constant on [a, x1) ∪ (xn, b].
(2) Find â1 ≡ µ̂(x1), . . . , ân ≡ µ̂(xn).
Since µ̂ is linear on each (xi−1, xi),
∫ of (µ̂′)² over (xi−1, xi) = (xi − xi−1) × (slope)² = [µ̂(xi) − µ̂(xi−1)]² / (xi − xi−1),   2 ≤ i ≤ n.
Write
(µ(x2) − µ(x1), . . . , µ(xn) − µ(xn−1))′ = D (µ(x1), . . . , µ(xn))′,
Λ = diag( 1/(xi − xi−1) ).
So we minimize:
(1/n)(Y − µ)′(Y − µ) + λ(Dµ)′Λ(Dµ) = (1/n)(Y − µ)′(Y − µ) + λµ′D′ΛDµ
⇒ µ̂ = (I + nλD′ΛD)⁻¹Y ≡ SλY
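The m = 1 formula can be implemented directly. A numpy sketch (dense linear algebra for clarity; real implementations exploit the banded structure of D′ΛD):

```python
import numpy as np

def m1_smoother(x, y, lam):
    """mu_hat = (I + n*lam*D'Lambda D)^{-1} y  for the m = 1 penalty."""
    n = len(x)
    # D maps (mu(x1),...,mu(xn)) to successive differences mu(x_i)-mu(x_{i-1}).
    D = np.diff(np.eye(n), axis=0)
    Lam = np.diag(1.0 / np.diff(x))          # Lambda = diag(1/(x_i - x_{i-1}))
    S = np.linalg.inv(np.eye(n) + n * lam * D.T @ Lam @ D)
    return S @ y

x = np.linspace(0, 1, 50)
y = np.sin(2 * np.pi * x)

# lambda -> 0: interpolation; lambda -> infinity: shrink toward a constant
# (constants are the null space of the penalty, since D @ 1 = 0).
print(np.max(np.abs(m1_smoother(x, y, 1e-12) - y)))   # ~ 0
print(np.ptp(m1_smoother(x, y, 1e6)))                 # ~ 0 (nearly constant)
```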
GENERAL SMOOTHING SPLINE CASE:
minimize (1/n) Σ (Yi − µ(xi))² + λ ∫ (µ^(m))²
⇒ µ̂ is a natural spline.
So for some basis φ1, . . . , φn, µ̂(x) = Σⱼ₌₁ⁿ β̂j φj(x).
So we minimize, as a function of β1, . . . , βn,
(1/n) Σᵢ₌₁ⁿ [Yi − Σⱼ βj φj(xi)]² + λ Σ over j,k of βj βk ∫ φj^(m)(x) φk^(m)(x) dx
= (1/n) ||Y − Φβ||² + λβ′Pβ
⇒ Ŷ = Φβ̂ = Φ(Φ′Φ + nλP)⁻¹Φ′Y ≡ SλY.
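The algebra is ordinary penalized (ridge-type) regression once Φ and P are in hand. A sketch with hypothetical stand-ins — a monomial basis and its exact second-derivative penalty matrix, not an actual natural-spline basis:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 60
x = np.linspace(0, 1, n)
y = np.sin(2 * np.pi * x) + 0.2 * rng.standard_normal(n)

# Stand-in basis: powers of x (just to show the algebra).
p = 6
Phi = np.vander(x, p, increasing=True)   # columns 1, x, ..., x^5

# Stand-in penalty P_jk = integral over [0,1] of phi_j'' phi_k'',
# computed exactly for phi_j(x) = x^j:
# phi_j'' = j(j-1) x^{j-2}, and integral of x^{j+k-4} is 1/(j+k-3).
P = np.zeros((p, p))
for j in range(2, p):
    for k in range(2, p):
        P[j, k] = j * (j - 1) * k * (k - 1) / (j + k - 3)

def fit(lam):
    beta = np.linalg.solve(Phi.T @ Phi + n * lam * P, Phi.T @ y)
    return Phi @ beta

# lam = 0 is ordinary least squares; large lam shrinks toward a line
# (the null space of the second-derivative penalty).
ols = np.linalg.lstsq(Phi, y, rcond=None)[0]
print(np.max(np.abs(fit(0.0) - Phi @ ols)))   # ~ 0
```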
MODIFICATIONS/EXTENSIONS
THIN PLATE SPLINES:
Xi ∈ ℝ^d. Minimize
(1/n) Σ (Yi − µ(Xi))² + λ ∫ ( ∂²µ/∂x1² + · · · + ∂²µ/∂xd² )² dx1 · · · dxd
Implemented in Chong Gu’s gss.
MODIFICATIONS/EXTENSIONS
PARTIAL LINEAR MODEL: data: (Yi, xi, zi) with
E(Yi) = βzi + µ(xi)
µ smooth, β and µ unknown. Minimize
(1/n) Σ (Yi − βzi − µ(xi))² + λ ∫ (µ″)².
Implemented in gam (yes), in gss (??)
COMMENT: gcv/cv chooses a λ that is good for predicting Y* = βz + µ(x) + ε. In general, this λ is good for estimating µ but is too big for estimating β.
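One standard way to compute β̂ here is to profile out µ: given a symmetric linear smoother Sλ, the penalized least squares criterion collapses to a quadratic in β with minimizer β̂ = [z′(I − Sλ)z]⁻¹ z′(I − Sλ)Y. A sketch using a hypothetical discrete second-difference smoother (not the algorithm inside gam):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 200
x = np.linspace(0, 1, n)
z = rng.standard_normal(n)
beta_true = 1.5
y = beta_true * z + np.sin(2 * np.pi * x) + 0.1 * rng.standard_normal(n)

# Linear smoother S = (I + n*lam*K)^{-1}, with K a discrete
# second-difference penalty standing in for the spline penalty.
D2 = np.diff(np.eye(n), n=2, axis=0)
K = D2.T @ D2
lam = 1.0
S = np.linalg.inv(np.eye(n) + n * lam * K)

# Profiling out mu: beta_hat = [z'(I-S)z]^{-1} z'(I-S)y.
M = np.eye(n) - S
beta_hat = (z @ M @ y) / (z @ M @ z)
print(beta_hat)  # should be close to 1.5
```

The filter I − Sλ passes the rough covariate z almost unchanged while removing the smooth component, which is why the ratio recovers β.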
MODIFICATIONS/EXTENSIONS
PENALIZED LIKELIHOOD. Minimize
− log likelihood(µ(x1), . . . , µ(xn)) + λ ∫ (µ^(m))²
⇒ natural spline of degree 2m − 1 with knots at x1, . . . , xn.
EXAMPLE: Suppose Yi ∈ {0, 1} with P{Yi = 1} = p(xi).
log likelihood = Σᵢ₌₁ⁿ log[ p(xi)^Yi (1 − p(xi))^(1−Yi) ]
= Σᵢ₌₁ⁿ [ Yi log(p(xi)/(1 − p(xi))) + log(1 − p(xi)) ]
Set
µ(xi) = log[ p(xi)/(1 − p(xi)) ].
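With µi = logit p(xi), the penalized criterion −loglik(µ) + λµ′Pµ is smooth and convex, so Newton’s method (IRLS) applies. A sketch, with a discrete second-difference matrix as a hypothetical stand-in for the derivative penalty:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 100
x = np.linspace(0, 1, n)
p_true = 1 / (1 + np.exp(-(4 * x - 2)))        # true p(x), increasing in x
y = (rng.random(n) < p_true).astype(float)

# Discrete second-difference penalty standing in for lambda * int (mu'')^2.
D2 = np.diff(np.eye(n), n=2, axis=0)
P = D2.T @ D2
lam = 1.0

# Newton / IRLS for: minimize -loglik(mu) + lam * mu' P mu,
# where loglik = sum_i [ y_i mu_i - log(1 + exp(mu_i)) ].
mu = np.zeros(n)
for _ in range(50):
    p = 1 / (1 + np.exp(-mu))
    grad = -(y - p) + 2 * lam * P @ mu           # gradient of the criterion
    hess = np.diag(p * (1 - p)) + 2 * lam * P    # Hessian (positive definite)
    mu = mu - np.linalg.solve(hess, grad)

p_hat = 1 / (1 + np.exp(-mu))                    # fitted probabilities
```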
MODIFICATIONS/EXTENSIONS
OTHER PENALTIES (Ramsay and Heckman)
Minimize
(1/n) Σ (Yi − µ(xi))² + λ ∫ (Lµ)²
where L is an mth order linear differential operator (Lµ = Σⱼ₌₀ᵐ αj µ^(j)).
The solution is a spline of generalized type.
Examples:
• ∫ (Lµ)² = ∫ (µ″)²: don’t penalize a line
• ∫ (Lµ)² = ∫₀¹ (µ″ + (2π)²µ)²: don’t penalize µ(x) = a0 cos(2πx) + a1 sin(2πx)
• ∫ (Lµ)² = ∫₀¹ (µ″ − βµ′)²: don’t penalize µ(x) = a0 + a1 exp(βx) (can estimate β)
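Each "don’t penalize" claim just says the named functions lie in the null space of L. The second bullet can be sanity-checked by finite differences (purely illustrative):

```python
import numpy as np

# mu(x) = a0 cos(2 pi x) + a1 sin(2 pi x) should satisfy
# L mu = mu'' + (2 pi)^2 mu = 0 exactly.
a0, a1 = 0.7, -1.3
mu = lambda t: a0 * np.cos(2 * np.pi * t) + a1 * np.sin(2 * np.pi * t)

x = np.linspace(0.1, 0.9, 9)
h = 1e-4
mu2 = (mu(x + h) - 2 * mu(x) + mu(x - h)) / h**2   # central 2nd difference
Lmu = mu2 + (2 * np.pi) ** 2 * mu(x)
print(np.max(np.abs(Lmu)))  # ~ 0, up to O(h^2) discretization error
```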
DENSITY ESTIMATION (Silverman, 1982)
Estimate a density f based on data X1, . . . , Xn iid ∼ f .
log likelihood = Σᵢ₌₁ⁿ log f(Xi)
Remove the positivity constraint by letting g = log f and maximize
(1/n) Σᵢ₌₁ⁿ g(Xi) − λ ∫ (Lg)²   subject to   ∫ exp(g(x)) dx = 1.
The maximizer exists provided the MLE exists within the parametric class of densities {f : L(log f) ≡ 0}.
Silverman showed that this is equivalent to maximizing the unconstrained functional
(1/n) Σᵢ₌₁ⁿ g(Xi) − λ ∫ (Lg)² − ∫ exp(g(x)) dx.
(Perturbing g by a constant c and differentiating at c = 0 shows that the maximizer automatically satisfies ∫ exp(ĝ) = 1.)
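The self-normalization at the maximizer can be seen in a crude discretized version of the unconstrained problem. A gradient-ascent sketch on a grid, with Lg = g′ (m = 1) for simplicity — an illustration of the idea, not Silverman’s algorithm:

```python
import numpy as np

rng = np.random.default_rng(4)
X = rng.beta(2, 2, size=200)          # iid sample on [0, 1]

# Discretize g on a grid of m points.
m = 80
grid = np.linspace(0, 1, m)
h = grid[1] - grid[0]
counts = np.histogram(X, bins=m, range=(0.0, 1.0))[0] / len(X)

lam = 1e-3
g = np.zeros(m)
for _ in range(5000):
    # Concave objective: sum_k counts_k g_k
    #                    - lam * sum_k ((g_{k+1}-g_k)/h)^2 h
    #                    - sum_k exp(g_k) h
    dg = np.diff(g)
    grad = counts - np.exp(g) * h
    grad[:-1] += 2 * lam * dg / h     # penalty gradient, left endpoints
    grad[1:] -= 2 * lam * dg / h      # penalty gradient, right endpoints
    g += 0.5 * grad                   # fixed-step gradient ascent

print(np.sum(np.exp(g)) * h)  # ~ 1: the fitted density self-normalizes
```

Summing the stationarity conditions over the grid telescopes the penalty terms away and leaves 1 − Σ exp(g_k)·h = 0, the discrete analogue of ∫ exp(ĝ) = 1.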