Probabilistic models
Let $A$ and $B$ be events in the sample space $\Psi$.
- $P(\Psi) = 1$
- $P(A) \geq 0$ - probability is nonnegative
- $P(A \cup B) = P(A) + P(B)$ if $A$ and $B$ are mutually exclusive
Conditional probability
$$P(A|B) = \frac{P(A \cap B)}{P(B)} = \frac{P(AB)}{P(B)}$$
Equivalently, $P(AB) = P(A|B)P(B) = P(B|A)P(A)$, which leads directly to Bayes' theorem below.
Bayes' theorem
$$ P(B|A) = \frac{P(A|B)P(B)}{P(A)} $$
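As a quick sanity check, here is a minimal Python sketch of Bayes' theorem with made-up numbers: a hypothetical test with 1% prevalence, 99% sensitivity, and a 5% false-positive rate (all values assumed for illustration).

```python
p_B = 0.01                  # prior P(B): probability of having the condition
p_A_given_B = 0.99          # likelihood P(A|B): positive test given condition
p_A_given_notB = 0.05       # false-positive rate P(A|not B)

# Total probability: P(A) = P(A|B)P(B) + P(A|not B)P(not B)
p_A = p_A_given_B * p_B + p_A_given_notB * (1 - p_B)

# Bayes' theorem: P(B|A) = P(A|B)P(B) / P(A)
p_B_given_A = p_A_given_B * p_B / p_A
print(f"P(B|A) = {p_B_given_A:.3f}")   # ~0.167: most positives are false positives
```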
Independence
$A$ and $B$ are independent if:
$$ P(A|B) = P(A) $$
$$ P(AB) = P(A)P(B) $$
This means that knowing $B$ gives no information about $A$.
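A Monte Carlo sketch of both conditions, assuming two independent dice rolls as the events $A$ and $B$ (a toy setup chosen for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1_000_000
die1 = rng.integers(1, 7, n)
die2 = rng.integers(1, 7, n)

A = die1 == 6                       # event A: first die shows 6
B = die2 == 6                       # event B: second die shows 6

p_A = A.mean()
p_A_given_B = A[B].mean()           # P(A|B), estimated on the subset where B holds
p_AB = (A & B).mean()

print(p_A, p_A_given_B)             # both ~1/6: knowing B tells us nothing about A
print(p_AB, p_A * B.mean())         # P(AB) ~ P(A)P(B) ~ 1/36
```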
Probability density function (PDF)
$$ f_X(x)\, dx = P(x \leq X < x + dx) $$
The integral of the PDF must equal $1$:
$$ \int_{-\infty}^{\infty} f_X(x) dx = 1 $$
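A numerical check of both identities, using the standard normal PDF from SciPy as an example $f_X$ (any valid PDF would behave the same way):

```python
import numpy as np
from scipy.integrate import quad
from scipy.stats import norm

# The PDF integrates to 1 over the whole real line.
total, err = quad(norm.pdf, -np.inf, np.inf)
print(total)   # ~1.0

# P(x <= X < x + dx) ~ f_X(x) dx for small dx, checked via the CDF.
x, dx = 0.5, 1e-4
print(norm.cdf(x + dx) - norm.cdf(x), norm.pdf(x) * dx)   # nearly identical
```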
Joint distribution
Consider two random variables $X$ and $Y$.
The conditional PDF is the distribution of one RV when the other is fixed.
$$ f_{X|Y}(x|Y=y_0) = \frac{f_{X,Y}(x,y_0)}{f_Y(y_0)} $$
The above expression is the distribution of $X$ given that $Y=y_0$.
Marginal PDF
$$ f_X(x) = \int_{-\infty}^{\infty} f_{X,Y}(x,y) dy $$
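A sketch that recovers both the marginal and the conditional PDF by numerical integration, assuming a bivariate Gaussian joint with correlation 0.8 as a test case (parameters chosen for illustration):

```python
import numpy as np
from scipy.integrate import quad
from scipy.stats import multivariate_normal, norm

# Example joint: zero means, unit variances, correlation 0.8.
C = np.array([[1.0, 0.8],
              [0.8, 1.0]])
joint = multivariate_normal(mean=[0, 0], cov=C)

def f_XY(x, y):
    return joint.pdf([x, y])

# Marginal: integrate the joint over y at a fixed x.
x0 = 0.7
marg, _ = quad(lambda y: f_XY(x0, y), -np.inf, np.inf)
print(marg, norm.pdf(x0))   # both ~ the N(0,1) density at x0

# Conditional at Y = y0: a slice of the joint divided by the marginal of Y.
y0 = 1.0
f_Y_y0, _ = quad(lambda x: f_XY(x, y0), -np.inf, np.inf)
print(f_XY(x0, y0) / f_Y_y0)
```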
Independence
Random variables $X$ and $Y$ are independent if:
$$ f_{X,Y}(x,y) = f_X(x)f_Y(y) $$
Summary of information
To describe the mean of the joint distribution, we need the mean of each random variable in the distribution.
To describe the spread of the joint distribution, we need the marginal variance of each RV as well as the covariance.
Matrix form
We can stack two separate random variables $X_1$ and $X_2$ into a vector:
$$ \mathbf{X} = \begin{bmatrix} X_1 \\ X_2 \end{bmatrix} $$
The expectation is given by:
$$ E[\mathbf{X}] = \begin{bmatrix} \mu_{X_1} \\ \mu_{X_2} \end{bmatrix} = \mathbf{\mu_X} $$
The centered (zero-mean) vector is:
$$ \tilde{\mathbf{X}} = \begin{bmatrix} X_1 - \mu_{X_1} \\ X_2 - \mu_{X_2} \end{bmatrix} = \mathbf{X} - \mathbf{\mu_X} $$
Covariance matrix
$$ \mathbf{C_{XX}} = E\left[\tilde{\mathbf{X}}\tilde{\mathbf{X}}^T\right] = \begin{bmatrix} \sigma_{X_1X_1} & \sigma_{X_1X_2} \\ \sigma_{X_2X_1} & \sigma_{X_2X_2} \end{bmatrix} $$
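A sketch estimating $\mathbf{C_{XX}}$ from samples, using an assumed toy construction where $X_2$ is $X_1$ plus noise; NumPy's built-in `np.cov` agrees with the outer-product formula:

```python
import numpy as np

rng = np.random.default_rng(0)
x1 = rng.normal(0, 1, 100_000)
x2 = x1 + rng.normal(0, 0.5, 100_000)   # correlated with x1 by construction
X = np.stack([x1, x2])                  # shape (2, n): each row is one RV

Xc = X - X.mean(axis=1, keepdims=True)  # centered samples, i.e. X - mu_X
C = Xc @ Xc.T / (X.shape[1] - 1)        # sample estimate of E[X~ X~^T]
print(C)
print(np.cov(X))                        # NumPy's estimator gives the same matrix
```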
Bivariate Gaussian: matrix generalization of the Gaussian
$$ f_\mathbf{X}(\mathbf{x}) = \frac{1}{\sqrt{(2\pi)^2 \det \mathbf{C_{XX}}}} \exp\left[ -\frac{1}{2} (\mathbf{x} - \mathbf{\mu_X})^T \mathbf{C_{XX}}^{-1} (\mathbf{x} - \mathbf{\mu_X}) \right] $$
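A sketch evaluating the formula directly and comparing against `scipy.stats.multivariate_normal`, with example parameters assumed for illustration:

```python
import numpy as np
from scipy.stats import multivariate_normal

mu = np.array([1.0, -1.0])               # assumed example mean
C = np.array([[2.0, 0.6],
              [0.6, 1.0]])                # assumed example covariance
x = np.array([0.5, 0.0])                  # evaluation point

# Direct evaluation of the bivariate Gaussian density above.
d = x - mu
f = np.exp(-0.5 * d @ np.linalg.inv(C) @ d) \
    / np.sqrt((2 * np.pi) ** 2 * np.linalg.det(C))

print(f)
print(multivariate_normal(mean=mu, cov=C).pdf(x))   # matches
```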
Correlation coefficient
$$ \rho_{XY} = \frac{\sigma_{XY}}{\sigma_X\sigma_Y} $$
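A sketch computing $\rho_{XY}$ from the definition and comparing it to `np.corrcoef`, on assumed toy data with a linear relationship plus noise:

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(size=50_000)
y = 2.0 * x + rng.normal(size=50_000)    # assumed linear dependence + noise

cov_xy = np.cov(x, y)[0, 1]              # sample covariance sigma_XY
rho = cov_xy / (x.std(ddof=1) * y.std(ddof=1))
print(rho)
print(np.corrcoef(x, y)[0, 1])           # same value from NumPy directly
```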
Effect of coordinate transformation
Consider two random variables $X_1$ and $X_2$.
We can transform the coordinates from $x_1, x_2$ to a different set of coordinates $z_1, z_2$ by multiplying by a transformation matrix $\mathbf{M}$.
$$ \begin{bmatrix} Z_1 \\ Z_2 \end{bmatrix} = \mathbf{M} \begin{bmatrix} X_1 \\ X_2 \end{bmatrix} $$
Because expectation is a linear operator, the mean is simply mapped by the same transformation:
$$ \begin{bmatrix} \mu_{Z_1} \\ \mu_{Z_2} \end{bmatrix} = \mathbf{M} \begin{bmatrix} \mu_{X_1} \\ \mu_{X_2} \end{bmatrix} $$
Use the following relation to find the new covariance matrix:
$$ \tilde{\mathbf{Z}} = \mathbf{M} \tilde{\mathbf{X}} $$
The new covariance matrix $\mathbf{C_{ZZ}}$ is then:
$$ \mathbf{C_{ZZ}} = E\left[\tilde{\mathbf{Z}}\tilde{\mathbf{Z}}^T\right] = \mathbf{M} E\left[\tilde{\mathbf{X}}\tilde{\mathbf{X}}^T\right] \mathbf{M}^T = \mathbf{M}\mathbf{C_{XX}}\mathbf{M}^T $$
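A quick empirical check of $\mathbf{C_{ZZ}} = \mathbf{M}\mathbf{C_{XX}}\mathbf{M}^T$, using an arbitrary example $\mathbf{M}$ and an assumed $\mathbf{C_{XX}}$:

```python
import numpy as np

rng = np.random.default_rng(2)
M = np.array([[1.0, 2.0],
              [0.0, 1.0]])               # arbitrary example transformation

C_XX = np.array([[1.0, 0.3],
                 [0.3, 2.0]])            # assumed covariance of X
X = rng.multivariate_normal([0, 0], C_XX, size=200_000).T   # shape (2, n)
Z = M @ X                                # transform every sample

print(M @ C_XX @ M.T)                    # predicted C_ZZ
print(np.cov(Z))                         # empirical covariance of Z agrees
```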
Effect of shifting and scaling
Let $V=\alpha (X-\beta)$ and $W = \gamma(Y-\delta)$.
For these new variables:
$$ \mu_V = \alpha (\mu_X - \beta) $$
$$ \sigma_V^2 = \alpha^2 \sigma_X^2 $$
$$ \sigma_{VW} = \alpha \gamma \sigma_{XY} $$
The correlation coefficient defined above is therefore invariant to shifting and to scaling by positive factors: the $\alpha\gamma$ in the covariance cancels against the $|\alpha||\gamma|$ in the standard deviations (a negative factor only flips the sign of $\rho$).
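A numerical check that $\rho$ is unchanged by shifting and positive scaling (all constants here are arbitrary illustration values):

```python
import numpy as np

rng = np.random.default_rng(3)
x = rng.normal(size=100_000)
y = 0.5 * x + rng.normal(size=100_000)

alpha, beta, gamma, delta = 3.0, 1.0, 2.0, -4.0   # arbitrary positive scales and shifts
v = alpha * (x - beta)
w = gamma * (y - delta)

print(np.corrcoef(x, y)[0, 1])
print(np.corrcoef(v, w)[0, 1])   # same correlation after shifting and scaling
```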