Completely random measures and related models
Sinead Williamson
Computational and Biological Learning Laboratory, University of Cambridge
January 20, 2011

Outline
1. Background
2. Lévy processes
3. Completely random measures
4. Applications
   - Normalized random measures
   - Neutral-to-the-right processes
   - Exchangeable matrices

A little measure theory
Set: e.g. integers, real numbers, people called James. May be finite, countably infinite, or uncountably infinite.
Algebra: Class $\mathcal{T}$ of subsets of a set $T$ s.t.
1. $T \in \mathcal{T}$.
2. If $A \in \mathcal{T}$, then $A^c \in \mathcal{T}$.
3. If $A_1, \dots, A_K \in \mathcal{T}$, then $\bigcup_{k=1}^K A_k = A_1 \cup A_2 \cup \dots \cup A_K \in \mathcal{T}$ (closed under finite unions).
4. If $A_1, \dots, A_K \in \mathcal{T}$, then $\bigcap_{k=1}^K A_k = A_1 \cap A_2 \cap \dots \cap A_K \in \mathcal{T}$ (closed under finite intersections).
σ-Algebra: Algebra that is closed under countably infinite unions and intersections.
Measurable space: Combination $(T, \mathcal{T})$ of a set and a σ-algebra on that set.
Measure: Function $\mu$ from a σ-algebra to the non-negative reals (together with $+\infty$) s.t.
1. $\mu(\emptyset) = 0$.
2. For all countable collections of disjoint sets $A_1, A_2, \dots \in \mathcal{T}$, $\mu\left(\bigcup_k A_k\right) = \sum_k \mu(A_k)$.
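The algebra and measure axioms above can be checked mechanically on a small finite example. The sketch below is only an illustration, not part of the original slides: the helper names `power_set`, `is_algebra` and `counting_measure` are hypothetical, and the base set {1, 2, 3} is an arbitrary choice. On a finite set every algebra is also a σ-algebra, so the finite checks are all that is needed here.

```python
from itertools import chain, combinations


def power_set(base):
    """Return every subset of `base` as a set of frozensets."""
    items = list(base)
    subsets = chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1)
    )
    return {frozenset(s) for s in subsets}


def is_algebra(base, family):
    """Check the four algebra axioms for a family of subsets of `base`."""
    base = frozenset(base)
    if base not in family:                      # axiom 1: T is in the class
        return False
    for a in family:
        if base - a not in family:              # axiom 2: complements
            return False
        for b in family:
            if a | b not in family:             # axiom 3: finite unions
                return False
            if a & b not in family:             # axiom 4: finite intersections
                return False
    return True


def counting_measure(subset):
    """Counting measure: the measure of a set is its number of elements."""
    return len(subset)


T = {1, 2, 3}                                   # hypothetical base set
family = power_set(T)
print(is_algebra(T, family))
# Additivity on two disjoint sets.
a, b = frozenset({1}), frozenset({2, 3})
print(counting_measure(a | b) == counting_measure(a) + counting_measure(b))
```

A family such as {∅, {1}, T} fails the check because the complement {2, 3} is missing.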

Probability measures
Probability distribution: Measure $P$ on some measurable space $(\Omega, \mathcal{F})$ s.t. $P(\Omega) = 1$.
Intuition: Subsets = events; measure of a subset = probability of that event.
Discrete probability distribution: assigns measure 1 to a countable subset of $\Omega$.
Continuous probability distribution: assigns measure 0 to every singleton $\{x\}$, $x \in \Omega$.
Atoms: singletons with positive measure.

Representing the real world
Kolmogorov: Two types of object: experimental observations, and the random phenomena underlying them.

Real world                   Mathematical world
Random phenomena             Probability space $(\Omega, \mathcal{F}, P)$
Experiment                   Algebra
Experimental observations    Collection of random variables

Random variables $X : (\Omega, \mathcal{F}) \to (S_X, \mathcal{S}_X)$ are mappings from the underlying probability space to our observation space.
This mapping, combined with the probability distribution on $(\Omega, \mathcal{F})$, induces a probability distribution $\mu_X := P \circ X^{-1}$ on the observation space.
We call $\mu_X$ the distribution of our observations.
[Figure: the random variable $X$ maps $\Omega$ to $S_X$.]

Characteristic functions
Often, it is useful to represent random variables and probability distributions in terms of their characteristic function.
For a random variable $X$ taking values in $\mathbb{R}^d$ with distribution $\mu_X$,
$$\Phi_X(u) = \int_{\mathbb{R}^d} e^{i\langle u, y\rangle}\, \mu_X(dy) = E\left[e^{i\langle u, X\rangle}\right].$$
If $\mu_X$ admits a density (i.e.
$\mu_X(dy) = p(y)\,\nu(dy)$ for a reference measure $\nu$ such as Lebesgue measure), then the characteristic function is the Fourier transform of that density.

Infinitely divisible distributions
We say a probability measure $\mu$ is infinitely divisible if, for each $n \in \mathbb{N}$:
- We can write $\mu$ as the $n$-fold self-convolution $\mu^{(n)} * \dots * \mu^{(n)}$ of some distribution $\mu^{(n)}$.
- (Equivalently) The $n$th root $\Phi^{(n)}$ of the characteristic function of $\mu$ is the characteristic function of some probability measure.
- (Equivalently) For any $X \sim \mu$, we can write $X = \sum_{i=1}^{n} X^{(i)}$ in distribution, where the $X^{(i)} \sim \mu^{(n)}$ are i.i.d.

(The celebrated) Lévy-Khintchine formula
Theorem: Lévy-Khintchine
A distribution $\mu$ on $\mathbb{R}^d$ is infinitely divisible iff its characteristic function $\Phi_\mu$ can be represented in the form:
$$\Phi_\mu(u) = \exp\left( i\langle b, u\rangle - \frac{1}{2}\langle u, Au\rangle + \int_{(\mathbb{R}^d \setminus \{0\}) \times S_X} \left( e^{i\langle u, z\rangle} - 1 - i\langle u, z\rangle\, I(|z| \le 1) \right) \nu(dz, ds) \right),$$
for some uniquely defined vector $b \in \mathbb{R}^d$, positive semi-definite symmetric matrix $A$, and measure $\nu$ on $(\mathbb{R}^d \setminus \{0\}) \times S_X$ satisfying:
$$\int_{(\mathbb{R}^d \setminus \{0\}) \times S_X} \left(|z|^2 \wedge 1\right) \nu(dz, ds) < \infty.$$

Notation
We call:
- $b$ the drift;
- $A$ the Gaussian covariance matrix;
- $\nu$ the Lévy measure;
- the triplet $(A, \nu, b)$ the generating triplet.

Lévy processes
A Lévy process is a stochastic process $X = (X_t)_{t \ge 0}$ s.t.
1. $X_0 = 0$.
2. $X$ has independent increments, i.e. for each $n \in \mathbb{N}$ and each $t_1 \le \dots \le t_{n+1}$, the random variables $(X_{t_{i+1}} - X_{t_i},\ 1 \le i \le n)$ are independent.
3. $X$ is stochastically continuous, i.e. for every $\epsilon > 0$ and $t \ge 0$,
$$\lim_{s \to t} P(|X_t - X_s| > \epsilon) = 0. \qquad (1)$$
4. Sample paths of $X$ are right-continuous with left limits.
A Lévy process is homogeneous if its increments are stationary, i.e. if the distribution of $X_{t+s} - X_t$ does not depend on $t$.

Lévy processes and infinite divisibility
Theorem: Infinite divisibility
$X_t$ is infinitely divisible for all $t \ge 0$.
Proof (homogeneous case)
For any $n \in \mathbb{N}$, write $X_t = \sum_{k=1}^{n} \left(X_{kt/n} - X_{(k-1)t/n}\right)$. Since $X$ has independent and stationary increments, the summands are i.i.d., so $X_t$ is the sum of $n$ i.i.d. random variables for any $n \in \mathbb{N}$. Therefore, $X_t$ is infinitely divisible.

Infinite divisibility means the Lévy-Khintchine formula holds.
So, we can describe a Lévy process in terms of a drift vector, a Gaussian covariance matrix and a Lévy measure.
A related result, the Lévy-Itô decomposition, tells us that any Lévy process can be decomposed into the superposition of three Lévy processes:
- A continuous, deterministic process, governed by the drift.
- A continuous, random process (Brownian motion), governed by the Gaussian covariance matrix.
- A pure-jump, random process, governed by the Lévy measure.

Subordinators
A subordinator is a Lévy process with non-decreasing sample paths.
A Lévy process on $\mathbb{R}_+$ has non-decreasing sample paths iff:
- $A = 0$ ← no Gaussian component.
- $b \ge 0$ ← deterministic component is non-decreasing.
- $\nu\left((-\infty, 0) \times \mathbb{R}_+\right) = 0$ ← no negative jumps.
- $\int_{(0,1]} z\, \nu(dz \times \mathbb{R}_+) < \infty$ ← ensures conditions of Lévy process.
If $0 < \nu < \infty$, then $X$ has countably infinitely many jumps.

Completely random measures
Random measure: Mapping $M : (\Omega, \mathcal{F}) \to (S_M, \mathcal{S}_M)$, where $(S_M, \mathcal{S}_M)$ is a set of measures.
Completely random measure (CRM): Random measure $M$ such that the masses $M(A_1)$ and $M(A_2)$ are independent whenever $A_1$ and $A_2$ are disjoint.
CRMs can be decomposed into three parts:
1. An atomic measure with random atom locations and random atom masses.
2. An atomic measure with (at most countable) fixed atom locations and random atom masses.
3. A non-random measure.
Parts 2 and 3 can be easily dealt with, so we only consider part 1.

Completely random measures and Lévy processes
CRM: Distribution over measures that assign independent masses to disjoint subsets.
This distribution is infinitely divisible, so Lévy-Khintchine applies.
CRMs are closely related to Lévy processes:
- If $X$ is a subordinator, then the measure $M$ defined by $M(s, t] = X_t - X_s$ for $s < t$ is a CRM.
- If $M$ is a completely random measure on $\mathbb{R}_+$, then its cumulative function is a subordinator.
Just as a subordinator (with $\nu > 0$) has a countably infinite number of jumps, a CRM assigns positive mass to a countably infinite number of locations:
$$M = \sum_{i=1}^{\infty} \pi_i \delta_{t_i},$$
where $\pi_i > 0$ for all $i$.

Completely random measures and Poisson processes
Atoms can be categorized as (size, location) pairs in some space $\mathbb{R}_+ \times S_X$.
Define a Poisson point process on this space whose mean measure is the Lévy measure $\nu(dz, ds)$.
Events of the Poisson point process give the sizes and locations of the atoms of the CRM.
Homogeneous CRM ↔ $\nu(dz, ds) = \nu_z(dz)\,\nu_s(ds)$.

Example: Gamma process
Let $H$ be a measure over some space $(S_X, \mathcal{S}_X)$.
Distribution over measures such that the mass assigned to a given subset $A \in \mathcal{S}_X$ is distributed according to $\mathrm{Gamma}(\alpha H(A), c)$ (shape $\alpha H(A)$, rate $c$), with $c, \alpha > 0$.
Such a distribution is a CRM with Lévy measure
$$\nu(dz, ds) = \alpha\, \frac{e^{-cz}}{z}\, dz\, H(ds).$$

Normalized random measures
Completely random measures are distributions over measures with random (finite) total measure.
In statistics and machine learning, we are often interested in probability measures.
Obvious solution: Normalize!
Example: Dirichlet process = normalized gamma process.
Example: Normalized stable process.

Survival analysis
Objective: Estimate the distribution over the time $T$ at which a specified event occurs for a given individual.
Examples:
- Deaths of patients in a study.
- Failure times of mechanical components.
- Time at which a user leaves a website.
Observations: Observe individuals $i = 1, \dots, n$ over time. Record the times $T_i = t_i \in \mathbb{R}_+$ at which events occur.
Right-censoring: Each individual $i$ is observed over some time interval $[0, c_i]$. If $T_i > c_i$, the event is unobserved (censored) for individual $i$.
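
Returning to the gamma process and its normalization: on a finite partition of the space, the definitions reduce to something that can be simulated directly. The sketch below is an illustration under stated assumptions, not the author's code. It reads the gamma parameters as shape $\alpha H(A_k)$ and rate $c$, which is what the Lévy measure above implies. The base masses `base_masses` and the helper names are hypothetical. Independent gamma masses on disjoint cells reflect complete randomness, and dividing by the total gives a Dirichlet-distributed weight vector, the finite-dimensional marginal of a Dirichlet process.

```python
import random


def gamma_process_on_partition(base_masses, alpha, c, rng):
    """Draw independent masses M(A_k) ~ Gamma(shape=alpha*H(A_k), rate=c).

    `base_masses` holds H(A_k) for disjoint cells A_k (hypothetical values).
    random.gammavariate takes (shape, scale), so scale = 1 / rate.
    """
    return [rng.gammavariate(alpha * h, 1.0 / c) for h in base_masses]


def normalize(masses):
    """Divide by the total mass to obtain a probability vector."""
    total = sum(masses)
    return [m / total for m in masses]


rng = random.Random(0)
base_masses = [0.2, 0.3, 0.5]          # assumed H(A_1), H(A_2), H(A_3)
masses = gamma_process_on_partition(base_masses, alpha=2.0, c=1.0, rng=rng)
weights = normalize(masses)            # Dirichlet(0.4, 0.6, 1.0) distributed
print(masses)
print(weights)
```

Because the rate $c$ is shared by all cells, it cancels in the normalization, so the distribution of `weights` does not depend on it.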

Representing distribution over event times
Cumulative distribution function: $F(t) = P(T \le t) = \int_0^t f(u)\, du$.
Hazard rate: $h(t) = \dfrac{f(t)}{1 - F(t)}$.
Cumulative hazard (def. 1): $H(t) = \int_0^t h(u)\, du$.
Cumulative hazard (def. 2): $A(t) = -\log(1 - F(t))$.
Definitions coincide if the cdf is continuous.
[Figure: CDF, hazard rate and cumulative hazard plotted against time.]

Neutral-to-the-right processes
Doksum (1974): A random distribution function $F(t)$ is neutral-to-the-right if, for each $k > 1$ and $t_1 < \dots < t_k$, the normalised increments
$$F(t_1),\ \frac{F(t_2) - F(t_1)}{1 - F(t_1)},\ \dots,\ \frac{F(t_k) - F(t_{k-1})}{1 - F(t_{k-1})}$$
are independent.
Doksum (1974): $F(t)$ is neutral-to-the-right iff its cumulative hazard (def. 2) is the cumulative function of a completely random measure.
Hjort (1990): $F(t)$ is neutral-to-the-right iff its cumulative hazard (def. 1) is the cumulative function of a completely random measure.
In both cases, the prior on $F(t)$ is conjugate under exactly observed and right-censored observations (Ferguson and Phadia, 1979; Hjort, 1990).

Example: Beta process
CRM with Lévy measure
$$\nu(dz, ds) = c(s)\, z^{-1} (1 - z)^{c(s) - 1}\, dz\, H(ds),$$
where $c$ is a non-negative, piecewise continuous function and $H$ is a (def. 2) hazard function.
Note: Lévy measure depends on atom location (inhomogeneous).
Discrete measure with atom masses in $(0, 1)$.
Intuition: Infinitesimal limit of beta-distributed atom masses.
Survival analysis intuition: Atom location = time.
Atom size = probability of the event at that time, given survival until that time.

Application: Exchangeable matrices
A sequence is exchangeable if any permutation of that sequence has equal probability.
de Finetti: There exists an underlying measure, conditioned on which the sequence is i.i.d.
Recipe for exchangeable distribution: Combine a distribution over measures with an appropriate (*cough* conjugate) likelihood.
Example: Dirichlet process + "multinomial" distribution → Chinese restaurant process.

We can use CRMs to define exchangeable distributions over matrices with infinitely many columns.
Each column corresponds to an atom of the CRM-distributed measure.
- Beta process + Bernoulli likelihood → Indian buffet process (Griffiths and Ghahramani, 2005)
- Gamma process + Poisson likelihood → infinite gamma-Poisson process (Titsias, 2007)
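
To make the beta-Bernoulli case concrete, the sketch below draws a binary matrix from the Indian buffet process using its sequential "restaurant" construction: row $n$ switches on each existing column $k$ with probability $m_k / n$, where $m_k$ counts earlier rows using that column, and then adds $\mathrm{Poisson}(\alpha / n)$ new columns. This is a minimal illustration rather than code from the slides. The function names, the concentration `alpha`, and the Knuth-style Poisson sampler (used only to stay within the standard library) are choices made here.

```python
import math
import random


def poisson(lam, rng):
    """Sample Poisson(lam) with Knuth's multiplication method (small lam)."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


def sample_ibp(num_rows, alpha, rng):
    """Draw a binary matrix from the IBP sequential construction."""
    rows = []      # rows[i] = set of active column indices for row i
    counts = []    # counts[k] = number of rows so far with column k active
    for n in range(1, num_rows + 1):
        # Revisit existing columns with probability m_k / n.
        row = {k for k, m_k in enumerate(counts) if rng.random() < m_k / n}
        for k in row:
            counts[k] += 1
        # Add Poisson(alpha / n) brand-new columns for this row.
        for _ in range(poisson(alpha / n, rng)):
            counts.append(1)
            row.add(len(counts) - 1)
        rows.append(row)
    width = len(counts)
    return [[1 if k in row else 0 for k in range(width)] for row in rows]


Z = sample_ibp(num_rows=5, alpha=2.0, rng=random.Random(0))
for z_row in Z:
    print(z_row)
```

The number of columns is random but finite for any finite number of rows, which is the sense in which the matrix has "infinitely many columns" with only finitely many non-empty.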