Mathematical Foundations of Machine Learning
Robert Nowak
Conditional probability
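Defined as the solution to the equation $p(x, y) = p(x \mid y)\,p(y)$ or, rearranged, $p(x \mid y) = p(x, y) / p(y)$, which requires $p(y) > 0$.
In this way, we can treat a conditional probability as a precisely defined mathematical object rather than relying only on an intuitive verbal explanation.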
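The defining equation can be checked numerically on a small table. The following is a minimal sketch, assuming a made-up joint distribution over two binary variables; the names `joint`, `marginal_y`, and `conditional` are hypothetical and chosen only for illustration.

```python
from fractions import Fraction

# Hypothetical joint distribution p(x, y) for x, y in {0, 1}.
# Keys are (x, y) pairs; the values are assumed for illustration and sum to 1.
joint = {
    (0, 0): Fraction(1, 10),
    (0, 1): Fraction(2, 10),
    (1, 0): Fraction(3, 10),
    (1, 1): Fraction(4, 10),
}


def marginal_y(joint, y):
    """Return p(y) by summing the joint over every value of x."""
    return sum(p for (_, yj), p in joint.items() if yj == y)


def conditional(joint, x, y):
    """Return p(x | y) = p(x, y) / p(y); undefined when p(y) = 0."""
    p_y = marginal_y(joint, y)
    if p_y == 0:
        raise ValueError("p(y) is zero, so p(x | y) is undefined")
    return joint[(x, y)] / p_y


p_x1_given_y1 = conditional(joint, 1, 1)
print(p_x1_given_y1)
```

Multiplying `conditional(joint, x, y)` by `marginal_y(joint, y)` recovers `joint[(x, y)]`, which is the first form of the equation. Exact fractions are used here only to avoid floating-point rounding in the comparison.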
Marginal probability
To marginalize is to sum over all possible values of a variable. Thus, if the joint density in question is $p(x, y)$, the marginal probability of the random variable $x$ is $p(x) = \sum_y p(x, y)$ (for continuous variables, the sum becomes an integral). Marginalization is often useful when you need to remove a random variable from consideration. By marginalizing above, $y$ drops out of the equation.
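Marginalization can be illustrated on a small discrete table. This is a sketch under the assumption of a made-up joint distribution over two binary variables; `joint` and `marginalize_y` are hypothetical names, not part of the notes.

```python
from collections import defaultdict
from fractions import Fraction

# Hypothetical joint distribution p(x, y); keys are (x, y) pairs.
joint = {
    (0, 0): Fraction(1, 10),
    (0, 1): Fraction(2, 10),
    (1, 0): Fraction(3, 10),
    (1, 1): Fraction(4, 10),
}


def marginalize_y(joint):
    """Return p(x) by summing p(x, y) over every value of y."""
    marginal = defaultdict(Fraction)  # Fraction() is 0, so sums start at zero
    for (x, _y), p in joint.items():
        marginal[x] += p
    return dict(marginal)


p_x = marginalize_y(joint)
print(p_x)
```

The result depends only on `x`: the variable that was summed over no longer appears as a key, and the marginal probabilities still sum to 1.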
Expected value
The expected value of a function $f$ of a random variable $x$ is $E[f(x)] = \sum_x f(x)\,p(x)$. Note that this works for any function: you evaluate the function at each potential realization, multiply that by the probability of that realization, then sum it all up. You can also use this for things like conditional expectations, where the “function” is one of the variables and the probability is the conditional probability: $E[x \mid y] = \sum_x x\,p(x \mid y)$.
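Both sums can be written as one short function. The sketch below assumes a fair six-sided die for the plain expectation and a made-up joint table for the conditional one; `expectation`, `die`, `joint`, and `conditional_pmf` are hypothetical names used only for illustration.

```python
from fractions import Fraction


def expectation(f, pmf):
    """Return the sum of f(x) * p(x) over every value x in the pmf."""
    return sum(f(x) * p for x, p in pmf.items())


# Assumed example: a fair six-sided die, p(x) = 1/6 for x = 1..6.
die = {x: Fraction(1, 6) for x in range(1, 7)}

mean = expectation(lambda x: x, die)               # f(x) = x
second_moment = expectation(lambda x: x * x, die)  # f(x) = x^2

# Hypothetical joint distribution p(x, y) for the conditional expectation.
joint = {
    (0, 0): Fraction(1, 10),
    (0, 1): Fraction(2, 10),
    (1, 0): Fraction(3, 10),
    (1, 1): Fraction(4, 10),
}


def conditional_pmf(joint, y):
    """Return the pmf p(x | y) as a dict over x, assuming p(y) > 0."""
    p_y = sum(p for (_, yj), p in joint.items() if yj == y)
    return {x: p / p_y for (x, yj), p in joint.items() if yj == y}


# E[x | y = 1]: the same sum, with p(x | y = 1) in place of p(x).
mean_x_given_y1 = expectation(lambda x: x, conditional_pmf(joint, 1))
print(mean, second_moment, mean_x_given_y1)
```

The same `expectation` function serves both cases; only the probability table passed to it changes, which mirrors the point that a conditional expectation is an ordinary expectation taken under the conditional distribution.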