Keywords: Generative models
Probabilistic Generative Models
A generative model used for classification contains an inference step and a decision step:
- Inference step: use probability theory (or another framework) to calculate, for a given input, the probability that it belongs to each class.
- Decision step: make a class assignment based on the probabilities calculated in step 1.
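The two steps can be sketched in a minimal two-class Python example. The Gaussian class-conditional densities, their parameters, and the priors below are all illustrative assumptions, not quantities estimated from data:

```python
import math

# Hypothetical 1-D example: each class-conditional density p(x|C_k) is a
# Gaussian with assumed (not estimated) parameters; priors are p(C_k).
def gaussian_pdf(x, mean, std):
    return math.exp(-0.5 * ((x - mean) / std) ** 2) / (std * math.sqrt(2 * math.pi))

def classify(x, params, priors):
    # Inference step: posterior p(C_k|x) via Bayes' rule.
    joints = [gaussian_pdf(x, m, s) * p for (m, s), p in zip(params, priors)]
    total = sum(joints)
    posteriors = [j / total for j in joints]
    # Decision step: pick the class with the highest posterior.
    label = max(range(len(posteriors)), key=lambda k: posteriors[k])
    return label, posteriors

label, post = classify(0.2, params=[(0.0, 1.0), (2.0, 1.0)], priors=[0.5, 0.5])
```

Here `x = 0.2` lies closer to the first class's mean, so the first posterior dominates and the decision step returns that class.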
In this post, we just give an introduction and a framework for the probabilistic generative model in classification. The details of how to estimate the parameters of the model will not be introduced.
From the Bayesian Formula to the Logistic Sigmoid Function
To build the posterior probability, we can start from the Bayesian formula. For class $C_1$ of a two-class problem, the posterior probability is:

$$p(C_1|x)=\frac{p(x|C_1)p(C_1)}{p(x|C_1)p(C_1)+p(x|C_2)p(C_2)}=\frac{1}{1+e^{-a}}$$

where $a=\ln\frac{p(x|C_1)p(C_1)}{p(x|C_2)p(C_2)}$, and

$$\sigma(a)=\frac{1}{1+e^{-a}}$$

represents a new function, so the posterior can be written as $p(C_1|x)=\sigma(a)$.
A usual question is why we set $a=\ln\frac{p(x|C_1)p(C_1)}{p(x|C_2)p(C_2)}$ but not $a=\ln\frac{p(x|C_2)p(C_2)}{p(x|C_1)p(C_1)}$. In my opinion, this just determines the graph of the function $\sigma(a)$. However, we prefer a monotonically increasing function, and $\frac{1}{1+e^{-a}}$ is monotonically increasing while $\frac{1}{1+e^{a}}$ is not.
$\sigma(a)$ is called the logistic sigmoid function, or squashing function, because it maps any real number into the interval $(0,1)$. The range of the function lies within the range of a probability, so it is a good way to represent some kinds of probability, such as the posterior $p(C_1|x)$. The shape of the logistic sigmoid function is an S-shaped curve.
Some Properties of Logistic Sigmoid
Because the logistic sigmoid function is symmetrical:

$$\sigma(-a)=\frac{1}{1+e^{a}}=\frac{e^{-a}}{1+e^{-a}}=1-\frac{1}{1+e^{-a}}$$

So, we have an important equation:

$$\sigma(-a)=1-\sigma(a)$$
The inverse function of $\sigma$ is:

$$a=\ln\left(\frac{\sigma}{1-\sigma}\right)$$

which is known as the logit function.
The derivative of the logistic sigmoid function is:

$$\frac{d\sigma}{da}=\sigma(a)\bigl(1-\sigma(a)\bigr)$$
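The three properties above (symmetry, the logit inverse, and the derivative) can be checked numerically with a small sketch; the test points are arbitrary:

```python
import math

def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))

def logit(s):
    # Inverse of the sigmoid: a = ln(s / (1 - s)).
    return math.log(s / (1.0 - s))

a = 1.3
# Symmetry: sigma(-a) = 1 - sigma(a)
assert abs(sigmoid(-a) - (1.0 - sigmoid(a))) < 1e-12
# Inverse: logit(sigma(a)) = a
assert abs(logit(sigmoid(a)) - a) < 1e-9
# Derivative: sigma'(a) = sigma(a)(1 - sigma(a)), checked by finite difference
h = 1e-6
numeric = (sigmoid(a + h) - sigmoid(a - h)) / (2.0 * h)
assert abs(numeric - sigmoid(a) * (1.0 - sigmoid(a))) < 1e-6
```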
Multiple-Class Problems
We now extend the logistic sigmoid function to the multiple-class condition, and we also start from the Bayesian formula:

$$p(C_k|x)=\frac{p(x|C_k)p(C_k)}{\sum_j p(x|C_j)p(C_j)}$$
In this condition, if we set $a_k=\ln\frac{p(x|C_k)p(C_k)}{\sum_{j\neq k}p(x|C_j)p(C_j)}$ as in the two-class case, the whole formula will be too complicated. To simplify the equation, we just set:

$$a_k=\ln p(x|C_k)p(C_k)$$
and we get a function of the posterior probability:

$$p(C_k|x)=\frac{e^{a_k}}{\sum_j e^{a_j}}$$
And according to the properties of probability, the value of the function $\frac{e^{a_k}}{\sum_j e^{a_j}}$ belongs to the interval $(0,1)$, and it is called the softmax function. Although, in the derivation above, the inputs are defined as $a_k=\ln p(x|C_k)p(C_k)$, each $a_k$ can be any real number. It is called softmax because it is a smooth version of the max function.
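A common way to implement the softmax function is to subtract the maximum activation before exponentiating; this leaves the result unchanged (the shift cancels in the ratio) but avoids overflow. A minimal sketch:

```python
import math

def softmax(a):
    # Subtracting max(a) cancels in the ratio, so the output is unchanged,
    # but it prevents math.exp from overflowing on large activations.
    m = max(a)
    exps = [math.exp(x - m) for x in a]
    total = sum(exps)
    return [e / total for e in exps]

p = softmax([2.0, 1.0, 0.1])  # each entry lies in (0, 1) and they sum to 1
```

Without the shift, an input like `[1000.0, 999.0]` would overflow `math.exp`; with it, the result is computed without trouble.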
When $a_k\gg a_j$ for all $j\neq k$, we have:

$$p(C_k|x)\approx 1,\qquad p(C_j|x)\approx 0$$
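This limiting behavior is easy to observe numerically; the activation values below are arbitrary:

```python
import math

def softmax(a):
    m = max(a)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in a]
    s = sum(exps)
    return [e / s for e in exps]

# With one activation far above the rest, the output is nearly one-hot:
p = softmax([20.0, 0.0, 0.0])
# p[0] is very close to 1; p[1] and p[2] are very close to 0.
```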
So both the logistic sigmoid function and the softmax function can be used to form generative classifiers, because they generate a probability representing the posterior of each class, which supplies the values needed by the decision step.