Multivariate Statistics
Thomas Asendorf, Steffen Unkel
Study sheet 5
Summer term 2017
Exercise 1:
Let $\mathbf{X} \in \mathbb{R}^{n \times p}$ denote a data matrix with $n$ observations and $p$ variables, with $x_i = (x_{i1}, \ldots, x_{ip})^\top$ for $i = 1, \ldots, n$. We would like to perform fuzzy clustering to attain $K$ clusters. Let $u_{ik}$ denote the membership of observation $i$ to cluster $k$ and $\mathbf{U}$ the membership matrix, as defined in the lecture. Let $v_k \in \mathbb{R}^p$ denote the cluster centers ($k = 1, \ldots, K$) and $\mathbf{V} = (v_1, \ldots, v_K)$ the matrix of all cluster centers. Then we seek to minimize the function
\[
J_m(\mathbf{U}, \mathbf{V}) = \sum_{i=1}^{n} \sum_{k=1}^{K} (u_{ik})^m \, \|x_i - v_k\|^2
\]
subject to the constraint $\sum_{k=1}^{K} u_{ik} = 1$ for all $i = 1, \ldots, n$ and $m \geq 1$. Show that a local optimum can only be attained if
\[
\text{(a)} \quad v_k = \frac{\sum_{i=1}^{n} u_{ik}^m \, x_i}{\sum_{i=1}^{n} u_{ik}^m}
\]
and
\[
\text{(b)} \quad u_{ik} = \left( \sum_{j=1}^{K} \left( \frac{d_{ik}}{d_{ij}} \right)^{\frac{2}{m-1}} \right)^{-1},
\]
where $d_{ik} = \|x_i - v_k\|$. Hint: Use Lagrange multipliers to incorporate the constraint.
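One way to act on the hint, sketched here only as a possible starting point (the multipliers $\lambda_i$ are my own notation, not taken from the sheet), is to attach one multiplier per membership constraint:
\[
L(\mathbf{U}, \mathbf{V}, \boldsymbol{\lambda}) = \sum_{i=1}^{n} \sum_{k=1}^{K} u_{ik}^m \, d_{ik}^2 + \sum_{i=1}^{n} \lambda_i \left( \sum_{k=1}^{K} u_{ik} - 1 \right).
\]
Setting $\partial L / \partial v_k = 0$ for fixed $\mathbf{U}$ leads to (a); setting $\partial L / \partial u_{ik} = 0$ and eliminating $\lambda_i$ via the constraint leads to (b).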
Exercise 2:
For illustration purposes consider the data set faithful in R, which contains the waiting time between eruptions and the duration of the eruption of the Old Faithful geyser in Yellowstone National Park, Wyoming, USA (an illustrative R sketch covering parts (a)-(c) follows the exercise).
(a) Compute a distance matrix D using Euclidean distances between observations.
(b) Use the function fanny() from the package cluster to perform fuzzy clustering on the
data with i. K = 2 and ii. K = 3. Which solution would you prefer?
(c) Perform fuzzy clustering (K = 2) with the function cmeans() from the package e1071.
Compare your results with those obtained when using k-means clustering with the
function kmeans().
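The following is a minimal sketch for parts (a)-(c), not a prescribed solution: the call to scale() and the comparison via average silhouette widths are illustrative choices, and fanny(), cmeans() and kmeans() are used with their default settings.

library(cluster)   # provides fanny()
library(e1071)     # provides cmeans()

data(faithful)
X <- scale(faithful)               # standardising is optional, not required by the exercise

## (a) Euclidean distance matrix between observations
D <- dist(X, method = "euclidean")

## (b) fuzzy clustering with fanny() for K = 2 and K = 3
fan2 <- fanny(D, k = 2)
fan3 <- fanny(D, k = 3)
fan2$silinfo$avg.width             # one way to compare the two solutions:
fan3$silinfo$avg.width             # average silhouette widths

## (c) fuzzy c-means (K = 2) versus k-means
cm2 <- cmeans(X, centers = 2)
km2 <- kmeans(X, centers = 2)
table(fuzzy = cm2$cluster, kmeans = km2$cluster)   # cross-tabulate the hard assignments

Note that cluster labels are arbitrary, so the rows or columns of the cross-table may need to be permuted before comparing the two partitions.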
Exercise 3:
Suppose we have a data set of observations $\mathbf{X} \in \mathbb{R}^{n \times p}$ from a mixture of $K$ Gaussian distributions. Then the log-likelihood of our mixture is given by
\[
\log f(\mathbf{X} \mid \boldsymbol{\pi}, \boldsymbol{\mu}, \boldsymbol{\Sigma}) = \sum_{i=1}^{n} \log \left( \sum_{k=1}^{K} \pi_k \, N_p(x_i \mid \mu_k, \Sigma_k) \right).
\]
Show that the following expressions maximize the given log-likelihood:
\[
\text{(a)} \quad \mu_k = \frac{1}{n_k} \sum_{i=1}^{n} \gamma(z_{ik}) \, x_i,
\]
\[
\text{(b)} \quad \Sigma_k = \frac{1}{n_k} \sum_{i=1}^{n} \gamma(z_{ik}) (x_i - \mu_k)(x_i - \mu_k)^\top,
\]
\[
\text{(c)} \quad \pi_k = \frac{n_k}{n},
\]
where $\gamma(z_{ik})$ is defined as in the lecture and $n_k = \sum_{i=1}^{n} \gamma(z_{ik})$. Hint: You may use Jacobi's formula to obtain the derivative of the determinant of $\Sigma_k$ and Lagrange multipliers to incorporate the constraint on the mixing proportions.
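As a sketch, and assuming the lecture defines the responsibilities in the usual way,
\[
\gamma(z_{ik}) = \frac{\pi_k \, N_p(x_i \mid \mu_k, \Sigma_k)}{\sum_{j=1}^{K} \pi_j \, N_p(x_i \mid \mu_j, \Sigma_j)},
\]
and for part (c) one can differentiate the Lagrangian
\[
\sum_{i=1}^{n} \log \left( \sum_{k=1}^{K} \pi_k \, N_p(x_i \mid \mu_k, \Sigma_k) \right) + \lambda \left( \sum_{k=1}^{K} \pi_k - 1 \right)
\]
with respect to $\pi_k$; combining the resulting equation $n_k / \pi_k = -\lambda$ with $\sum_{k=1}^{K} \pi_k = 1$ gives $\lambda = -n$ and hence (c).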
Exercise 4:
Reconsider the data set faithful in R for applying the EM-algorithm.
(a) Implement the EM-algorithm for Gaussian mixtures for the special case of K = 2 and p = 2; a rough sketch of one possible implementation is given after the exercise.
(b) Run the EM-algorithm from (a) on the Old Faithful data set and visualize your results.
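The sketch below is one possible implementation, not the one from the lecture: the function name em_gmm2, the median-split starting values, the tolerance and the hard assignment used for colouring the plot are all illustrative choices. The bivariate normal density is coded by hand so that no extra package is needed.

## bivariate normal density N_2(x | mu, Sigma), vectorised over the rows of X
dmvnorm2 <- function(X, mu, Sigma) {
  Xc <- sweep(as.matrix(X), 2, mu)
  q  <- rowSums((Xc %*% solve(Sigma)) * Xc)     # squared Mahalanobis distances
  exp(-0.5 * q) / (2 * pi * sqrt(det(Sigma)))
}

em_gmm2 <- function(X, max_iter = 200, tol = 1e-8) {
  X <- as.matrix(X)
  n <- nrow(X)
  ## crude starting values: split the data at the median of the first variable
  grp   <- X[, 1] > median(X[, 1])
  mu    <- list(colMeans(X[!grp, ]), colMeans(X[grp, ]))
  Sigma <- list(cov(X[!grp, ]), cov(X[grp, ]))
  pi_k  <- c(mean(!grp), mean(grp))
  loglik_old <- -Inf

  for (iter in seq_len(max_iter)) {
    ## E-step: responsibilities gamma(z_ik)
    dens   <- cbind(pi_k[1] * dmvnorm2(X, mu[[1]], Sigma[[1]]),
                    pi_k[2] * dmvnorm2(X, mu[[2]], Sigma[[2]]))
    gam    <- dens / rowSums(dens)
    loglik <- sum(log(rowSums(dens)))

    ## M-step: the update formulas from Exercise 3
    n_k  <- colSums(gam)
    pi_k <- n_k / n
    for (k in 1:2) {
      mu[[k]]    <- colSums(gam[, k] * X) / n_k[k]
      Xc         <- sweep(X, 2, mu[[k]])
      Sigma[[k]] <- crossprod(Xc * sqrt(gam[, k])) / n_k[k]
    }

    if (abs(loglik - loglik_old) < tol) break
    loglik_old <- loglik
  }
  list(pi = pi_k, mu = mu, Sigma = Sigma, gamma = gam, loglik = loglik)
}

## (b) run on the Old Faithful data and visualise the hard-assigned clusters
fit <- em_gmm2(faithful)
plot(faithful, col = ifelse(fit$gamma[, 1] > 0.5, 1, 2),
     main = "EM for a two-component Gaussian mixture")
points(do.call(rbind, fit$mu), pch = 3, cex = 2)   # estimated component means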
Date: 26 May 2017