Keywords: mixtures of Gaussians
A Formal Introduction to Mixtures of Gaussians
We introduced mixture distributions in the post ‘An Introduction to Mixture Models’, where the example was a Gaussian mixture with just two components. In this post we treat Gaussian mixtures formally, which also serves to motivate the expectation-maximization (EM) algorithm.
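A Gaussian mixture distribution can be written as:

$$
p(\mathbf{x}) = \sum_{k=1}^{K} \pi_k \, \mathcal{N}(\mathbf{x} \mid \boldsymbol{\mu}_k, \boldsymbol{\Sigma}_k) \tag{1}
$$

where $\sum_{k=1}^{K} \pi_k = 1$ and $0 \le \pi_k \le 1$.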
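As a minimal sketch of the mixture density $p(x) = \sum_k \pi_k \mathcal{N}(x \mid \mu_k, \sigma_k^2)$, the snippet below evaluates it in the univariate case using only the standard library. The univariate simplification, the helper names `gaussian_pdf` and `mixture_pdf`, and the parameter values are assumptions made for illustration; they do not come from the post.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a univariate Gaussian N(x | mean, var)."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def mixture_pdf(x, weights, means, variances):
    """Mixture density: sum_k pi_k * N(x | mu_k, var_k)."""
    # The mixing coefficients must lie in [0, 1] and sum to one.
    if any(w < 0.0 or w > 1.0 for w in weights):
        raise ValueError("each mixing coefficient must lie in [0, 1]")
    if not math.isclose(sum(weights), 1.0):
        raise ValueError("mixing coefficients must sum to 1")
    return sum(
        w * gaussian_pdf(x, m, v) for w, m, v in zip(weights, means, variances)
    )


# Illustrative parameters (assumed, not taken from the post).
weights = [0.3, 0.7]
means = [-2.0, 1.0]
variances = [0.5, 1.5]

density_at_zero = mixture_pdf(0.0, weights, means, variances)
```

Because the coefficients sum to one and each component integrates to one, the mixture itself integrates to one, which is why the two constraints on $\pi_k$ are checked before evaluating.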
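We then introduce a random vector $\mathbf{z}$, called the latent variable (vector), whose components satisfy:

$$
z_k \in \{0, 1\} \tag{2}
$$

and $\mathbf{z}$ is a $1$-of-$K$ representation, which means exactly one component is $1$ and all the others are $0$. To build the joint distribution $p(\mathbf{x}, \mathbf{z})$, we first need $p(\mathbf{z})$ and $p(\mathbf{x} \mid \mathbf{z})$. For the distribution of $\mathbf{z}$, we define:

$$
p(z_k = 1) = \pi_k \tag{3}
$$

This is a good design because, for $k = 1, \dots, K$, the constraints $0 \le \pi_k \le 1$ and $\sum_{k=1}^{K} \pi_k = 1$ meet the requirements of a probability distribution. For the entire vector $\mathbf{z}$, equation (3) can be written as:

$$
p(\mathbf{z}) = \prod_{k=1}^{K} \pi_k^{z_k} \tag{4}
$$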
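The following sketch illustrates the 1-of-$K$ latent vector: it draws a one-hot $\mathbf{z}$ so that component $k$ is active with probability $\pi_k$, and evaluates the product form $p(\mathbf{z}) = \prod_k \pi_k^{z_k}$. The function names `sample_one_hot` and `prob_z` and the coefficients in `pi` are hypothetical choices for illustration.

```python
import random


def sample_one_hot(weights, rng):
    """Draw a 1-of-K vector z with p(z_k = 1) = weights[k]."""
    k = rng.choices(range(len(weights)), weights=weights, k=1)[0]
    return [1 if j == k else 0 for j in range(len(weights))]


def prob_z(z, weights):
    """Product form p(z) = prod_k pi_k ** z_k for a one-hot z."""
    p = 1.0
    for z_k, pi_k in zip(z, weights):
        # Factors with z_k == 0 contribute pi_k ** 0 == 1.
        p *= pi_k ** z_k
    return p


# Illustrative mixing coefficients (assumed, not taken from the post).
pi = [0.2, 0.5, 0.3]
z = sample_one_hot(pi, random.Random(0))
p_z = prob_z(z, pi)
```

Since every factor with $z_k = 0$ equals one, `prob_z` returns exactly the coefficient of the active component.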
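According to the definition of $\mathbf{z}$, we can get the conditional distribution of $\mathbf{x}$ given $\mathbf{z}$. Under the condition $z_k = 1$, we have:

$$
p(\mathbf{x} \mid z_k = 1) = \mathcal{N}(\mathbf{x} \mid \boldsymbol{\mu}_k, \boldsymbol{\Sigma}_k) \tag{5}
$$

and from this we can derive the vector form of the conditional distribution:

$$
p(\mathbf{x} \mid \mathbf{z}) = \prod_{k=1}^{K} \mathcal{N}(\mathbf{x} \mid \boldsymbol{\mu}_k, \boldsymbol{\Sigma}_k)^{z_k} \tag{6}
$$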
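As a quick check of why a product with exponents $z_k$ selects a single component, take $K = 3$ with the second component active, $\mathbf{z} = (0, 1, 0)^{\top}$ (an illustrative choice, not from the post):

$$
\begin{aligned}
p(\mathbf{z}) &= \pi_1^{0} \, \pi_2^{1} \, \pi_3^{0} = \pi_2, \\
p(\mathbf{x} \mid \mathbf{z}) &= \mathcal{N}(\mathbf{x} \mid \boldsymbol{\mu}_1, \boldsymbol{\Sigma}_1)^{0} \, \mathcal{N}(\mathbf{x} \mid \boldsymbol{\mu}_2, \boldsymbol{\Sigma}_2)^{1} \, \mathcal{N}(\mathbf{x} \mid \boldsymbol{\mu}_3, \boldsymbol{\Sigma}_3)^{0} = \mathcal{N}(\mathbf{x} \mid \boldsymbol{\mu}_2, \boldsymbol{\Sigma}_2).
\end{aligned}
$$

Every factor with $z_k = 0$ equals one, so only the active component survives.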
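Once we have both the distribution of $\mathbf{z}$, $p(\mathbf{z})$, and the conditional distribution of $\mathbf{x}$ given $\mathbf{z}$, $p(\mathbf{x} \mid \mathbf{z})$, we can build the joint distribution by the product rule:

$$
p(\mathbf{x}, \mathbf{z}) = p(\mathbf{z}) \, p(\mathbf{x} \mid \mathbf{z}) = \prod_{k=1}^{K} \left( \pi_k \, \mathcal{N}(\mathbf{x} \mid \boldsymbol{\mu}_k, \boldsymbol{\Sigma}_k) \right)^{z_k} \tag{7}
$$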
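One practical consequence of factoring the joint as $p(\mathbf{z}) \, p(\mathbf{x} \mid \mathbf{z})$ is that we can draw from it in two steps: first pick the active component, then draw $x$ from that component's Gaussian. The sketch below does this for the univariate case; the name `sample_joint`, the seed, and the parameter values are assumptions made for illustration.

```python
import math
import random


def sample_joint(weights, means, variances, rng):
    """Draw (x, z): first z ~ p(z), then x ~ p(x | z)."""
    # Step 1: choose the active component k with probability pi_k.
    k = rng.choices(range(len(weights)), weights=weights, k=1)[0]
    z = [1 if j == k else 0 for j in range(len(weights))]
    # Step 2: draw x from the Gaussian selected by z
    # (random.gauss takes a standard deviation, not a variance).
    x = rng.gauss(means[k], math.sqrt(variances[k]))
    return x, z


# Illustrative parameters (assumed, not taken from the post).
weights = [0.3, 0.7]
means = [-2.0, 1.0]
variances = [0.5, 1.5]

rng = random.Random(42)
samples = [sample_joint(weights, means, variances, rng) for _ in range(20000)]

# Fraction of draws in which the first component was active.
fraction_first = sum(z[0] for _, z in samples) / len(samples)
```

With enough draws, `fraction_first` should land close to the first mixing coefficient, and discarding `z` leaves samples of `x` from the mixture.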
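However, what we care about is still the distribution of $\mathbf{x}$. We can obtain it by marginalizing over $\mathbf{z}$:

$$
p(\mathbf{x}) = \sum_{j} p(\mathbf{x}, \mathbf{z}_j) = \sum_{k=1}^{K} \pi_k \, \mathcal{N}(\mathbf{x} \mid \boldsymbol{\mu}_k, \boldsymbol{\Sigma}_k) \tag{8}
$$

where $\mathbf{z}_j$ ranges over every possible value of the random variable $\mathbf{z}$.
This is how latent variables construct a mixture of Gaussians, and this form makes the distribution of a mixture model easier to analyze.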
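The marginalization can be checked numerically: summing the joint density $\prod_k (\pi_k \mathcal{N}(x \mid \mu_k, \sigma_k^2))^{z_k}$ over all $K$ one-hot vectors should reproduce the mixture density $\sum_k \pi_k \mathcal{N}(x \mid \mu_k, \sigma_k^2)$. The sketch below does this for the univariate case; the helper names and parameter values are hypothetical.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a univariate Gaussian N(x | mean, var)."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def joint_pdf(x, z, weights, means, variances):
    """Joint density p(x, z) = prod_k (pi_k * N(x | mu_k, var_k)) ** z_k."""
    p = 1.0
    for z_k, pi_k, mu_k, var_k in zip(z, weights, means, variances):
        p *= (pi_k * gaussian_pdf(x, mu_k, var_k)) ** z_k
    return p


def one_hot_vectors(K):
    """All K possible values of a 1-of-K vector."""
    return [[1 if j == k else 0 for j in range(K)] for k in range(K)]


def marginal_pdf(x, weights, means, variances):
    """p(x) obtained by summing the joint over every possible z."""
    return sum(
        joint_pdf(x, z, weights, means, variances)
        for z in one_hot_vectors(len(weights))
    )


# Illustrative parameters (assumed, not taken from the post).
weights = [0.2, 0.5, 0.3]
means = [-3.0, 0.0, 2.5]
variances = [1.0, 0.5, 2.0]

x = 0.7
via_latent = marginal_pdf(x, weights, means, variances)
direct = sum(w * gaussian_pdf(x, m, v) for w, m, v in zip(weights, means, variances))
```

Here `via_latent` and `direct` agree up to floating-point rounding, because each one-hot vector contributes exactly one term of the mixture sum.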
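Bayes' theorem gives us the posterior. The posterior probability of the latent variable $\mathbf{z}$, based on the joint distribution in equation (7), is:

$$
p(z_k = 1 \mid \mathbf{x}) = \frac{p(\mathbf{x} \mid z_k = 1) \, p(z_k = 1)}{p(\mathbf{x})} \tag{9}
$$

Substituting equations (3) and (5) into equation (9), we get:

$$
p(z_k = 1 \mid \mathbf{x}) = \frac{\pi_k \, \mathcal{N}(\mathbf{x} \mid \boldsymbol{\mu}_k, \boldsymbol{\Sigma}_k)}{\sum_{j=1}^{K} \pi_j \, \mathcal{N}(\mathbf{x} \mid \boldsymbol{\mu}_j, \boldsymbol{\Sigma}_j)} \tag{10}
$$

The quantity $p(z_k = 1 \mid \mathbf{x})$ is also called the responsibility, and is denoted as:

$$
\gamma(z_k) = p(z_k = 1 \mid \mathbf{x}) \tag{11}
$$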
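To make the responsibility concrete, the sketch below computes $\gamma(z_k) = \pi_k \mathcal{N}(x \mid \mu_k, \sigma_k^2) / \sum_j \pi_j \mathcal{N}(x \mid \mu_j, \sigma_j^2)$ for a single observation in the univariate case. The function name `responsibilities` and the parameter values are assumptions for illustration.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a univariate Gaussian N(x | mean, var)."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def responsibilities(x, weights, means, variances):
    """Posterior p(z_k = 1 | x) for every component k."""
    # Numerators: prior pi_k times likelihood N(x | mu_k, var_k).
    weighted = [
        w * gaussian_pdf(x, m, v) for w, m, v in zip(weights, means, variances)
    ]
    # Denominator: the mixture density p(x).
    total = sum(weighted)
    return [w / total for w in weighted]


# Illustrative parameters (assumed, not taken from the post).
weights = [0.3, 0.7]
means = [-2.0, 1.0]
variances = [0.5, 1.5]

gamma_left = responsibilities(-2.0, weights, means, variances)
gamma_right = responsibilities(1.0, weights, means, variances)
```

The responsibilities for one observation always sum to one. In this example, a point at the mean of the first component is attributed mostly to that component even though its prior weight is smaller, because the likelihood term dominates.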
Bishop, Christopher M. Pattern Recognition and Machine Learning. Springer, 2006.