Keywords: Bayesian model averaging, BMA and combining models
Bayesian Model Averaging (BMA)
Bayesian model averaging (BMA) is another widely used method that looks very much like a combining model. However, the difference between BMA and combining models is significant.
In Bayesian model averaging, the random variable is the model (hypothesis) $h$, for $h = 1, \dots, H$, with prior probability $p(h)$. The marginal distribution over the data $\boldsymbol{X}$ is then:

$$p(\boldsymbol{X}) = \sum_{h=1}^{H} p(\boldsymbol{X}|h)\, p(h)$$
BMA is used to select the model (hypothesis) that explains the data best through Bayes' theorem. As the size of the data set $\boldsymbol{X}$ grows, the posterior probability

$$p(h|\boldsymbol{X}) \propto p(\boldsymbol{X}|h)\, p(h)$$

becomes sharper and sharper, so eventually one good hypothesis dominates.
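As a rough illustration (not from the original post), here is a minimal Python sketch of this sharpening effect: three hypothetical Gaussian hypotheses with fixed means compete to explain data that really comes from one of them, and the posterior $p(h|\boldsymbol{X})$ concentrates as the data set grows. All parameter values and the SciPy-based setup are my own assumptions.

```python
# A minimal sketch of Bayesian model averaging (toy setup, values assumed):
# three candidate hypotheses h are Gaussians with different fixed means, and we
# watch the posterior p(h | X) sharpen as the data set X grows.
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)

candidate_means = [-2.0, 0.0, 2.0]        # hypotheses h = 0, 1, 2 (assumed values)
prior = np.array([1 / 3, 1 / 3, 1 / 3])   # prior p(h)

true_mean = 0.0                            # the data is really generated by h = 1
for n in [5, 50, 500]:
    X = rng.normal(true_mean, 1.0, size=n)
    # log p(X | h) for each hypothesis, then Bayes' rule for p(h | X)
    log_lik = np.array([norm.logpdf(X, mu, 1.0).sum() for mu in candidate_means])
    log_post = np.log(prior) + log_lik
    post = np.exp(log_post - log_post.max())
    post /= post.sum()
    print(n, np.round(post, 4))            # posterior concentrates on one hypothesis
```

With only 5 points the posterior is still spread over the candidates; by 500 points it is essentially a point mass on the true hypothesis.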
In the post ‘Mixtures of Gaussians’, we have seen how a mixture of Gaussians works. The joint distribution of the input data $\boldsymbol{x}$ and the latent variable $\boldsymbol{z}$ is:

$$p(\boldsymbol{x}, \boldsymbol{z}) = p(\boldsymbol{z})\, p(\boldsymbol{x}|\boldsymbol{z})$$
and the marginal distribution of $\boldsymbol{x}$ is:

$$p(\boldsymbol{x}) = \sum_{\boldsymbol{z}} p(\boldsymbol{z})\, p(\boldsymbol{x}|\boldsymbol{z})$$
For the mixture of Gaussians:

$$p(\boldsymbol{x}) = \sum_{k=1}^{K} \pi_k\, \mathcal{N}(\boldsymbol{x}|\boldsymbol{\mu}_k, \boldsymbol{\Sigma}_k)$$
the latent variable is designed so that:

$$p(z_k = 1) = \pi_k$$

for $k = 1, \dots, K$, and $\boldsymbol{z}$ is a $1$-of-$K$ representation: $z_k \in \{0, 1\}$ with $\sum_k z_k = 1$.
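As a rough sketch of this generative process (the mixing coefficients, means, and variances below are assumed purely for illustration), the following Python code draws the $1$-of-$K$ latent variable first, then samples from the selected Gaussian, and evaluates the marginal density as the weighted sum over components.

```python
# A minimal sketch of the generative view above (all parameter values assumed):
# draw the 1-of-K latent variable z from p(z_k = 1) = pi_k, then draw x from the
# selected Gaussian; the marginal p(x) is the weighted sum over components.
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(1)

pi = np.array([0.3, 0.5, 0.2])     # mixing coefficients pi_k (assumed)
mu = np.array([-3.0, 0.0, 4.0])    # component means mu_k (assumed)
sigma = np.array([1.0, 0.5, 1.5])  # component standard deviations (assumed)

def sample(n):
    # z is 1-of-K: exactly one component k is active for each sample
    k = rng.choice(len(pi), size=n, p=pi)
    return rng.normal(mu[k], sigma[k])

def marginal_pdf(x):
    # p(x) = sum_k pi_k N(x | mu_k, sigma_k^2)
    return sum(p * norm.pdf(x, m, s) for p, m, s in zip(pi, mu, sigma))

X = sample(1000)
print(marginal_pdf(np.array([0.0, 2.0])))
```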
This mixture of Gaussians is therefore a kind of combining model. Each time, only one component $z_k$ is selected (because $\boldsymbol{z}$ is a $1$-of-$K$ representation). An example of a mixture of Gaussians and its overall curve looks like:
And the latent variables separate the whole distribution into several Gaussian distributions:
This is the simplest kind of combining model, where each expert is a Gaussian model. During the voting, only one model, selected by $\boldsymbol{z}$, makes the final decision.
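To illustrate this "only one expert decides" view (again with assumed parameters, matching the sketch above), the code below computes the responsibilities $p(z_k = 1|\boldsymbol{x})$ for a few points and picks the single component with the largest responsibility as the deciding expert.

```python
# A minimal sketch of the "one expert decides" view (parameters assumed as above):
# for a given x, the responsibility p(z_k = 1 | x) says which Gaussian expert is
# selected, and that single component then explains the point.
import numpy as np
from scipy.stats import norm

pi = np.array([0.3, 0.5, 0.2])     # assumed mixing coefficients
mu = np.array([-3.0, 0.0, 4.0])    # assumed component means
sigma = np.array([1.0, 0.5, 1.5])  # assumed component standard deviations

def responsibilities(x):
    # p(z_k = 1 | x) = pi_k N(x | mu_k, sigma_k^2) / sum_j pi_j N(x | mu_j, sigma_j^2)
    w = pi * norm.pdf(x, mu, sigma)
    return w / w.sum()

for x in [-2.5, 0.3, 3.8]:
    r = responsibilities(x)
    print(x, np.round(r, 3), "-> expert", r.argmax())
```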
A combining-model method contains several models and predicts by voting or other rules. Bayesian model averaging, by contrast, is used to select a single good hypothesis from several candidates.
Bishop, Christopher M. Pattern Recognition and Machine Learning. Springer, 2006.