Methods of estimation
Given some data, we may want to estimate the parameter(s) of the true probability distribution from which they were drawn. There are three methods: plugin, feature matching, and maximum likelihood.
Plugin estimator
For the plugin estimator, replace the true distribution with the empirical distribution of the data, which places probability mass $\frac{1}{n}$ on each of the $n$ data points, and compute the quantity of interest under that distribution.
Mean
$$ \mu = \mathbb{E}[X] $$
$$ \hat{M} = \frac{1}{n} \sum_{i=1}^{n} X_i = \hat{\mathbb{E}}[X] $$
Variance
$$ v = \mathbb{E}\left[\left(X - \mathbb{E}[X] \right)^2 \right] $$
$$ \hat{V} = \frac{1}{n} \sum_{i=1}^{n} (X_i - \hat{M})^2 $$
Median
$$ a = \mathrm{median}(\mathbb{P}) $$
$$ \hat{A} = \mathrm{median}(\hat{\mathbb{P}}) $$
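The three plugin estimates above can be sketched as follows; the data values are made up for illustration.

```python
# Plugin estimates: compute each feature of the empirical distribution,
# which places mass 1/n on every observed data point.

def plugin_mean(xs):
    return sum(xs) / len(xs)

def plugin_variance(xs):
    m = plugin_mean(xs)  # uses the plugin mean, not the unknown true mean
    return sum((x - m) ** 2 for x in xs) / len(xs)

def plugin_median(xs):
    s = sorted(xs)
    n = len(s)
    mid = n // 2
    return s[mid] if n % 2 == 1 else (s[mid - 1] + s[mid]) / 2

data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]  # hypothetical sample
print(plugin_mean(data))      # 5.0
print(plugin_variance(data))  # 4.0
print(plugin_median(data))    # 4.5
```

Note that the plugin variance divides by $n$ rather than $n-1$; it is the variance of the empirical distribution itself, not the unbiased sample variance.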
Feature matching
A feature is a property of a distribution, including but not limited to mean, variance, and median.
The goal of feature matching is to make an estimate for the parameter(s) of the distribution so that the feature(s) of the distribution match the features of the data.
For a given probability distribution $\mathbb{P}$ with parameter $\theta$, we can extract feature(s) $h^\theta = g(\mathbb{P}^\theta)$. We can also calculate the features for the empirical distribution $\hat{h} = g(\hat{\mathbb{P}})$. Then solve for $\theta$ by setting $h^\theta = \hat{h}$.
For example, find the median $\hat{h}$ of the sampled data, then find $\theta$ such that the median of the estimated distribution $h^\theta$ equals $\hat{h}$.
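A minimal sketch of this median-matching idea, using an exponential distribution as an assumed example (not taken from the source): for $\mathrm{Exp}(\theta)$ the median is $h^\theta = \ln(2)/\theta$, so setting $h^\theta = \hat{h}$ and solving gives $\hat{\theta} = \ln(2)/\hat{h}$.

```python
import math

def sample_median(xs):
    s = sorted(xs)
    n = len(s)
    mid = n // 2
    return s[mid] if n % 2 == 1 else (s[mid - 1] + s[mid]) / 2

def match_exponential_rate(xs):
    # Feature of the data: the empirical median h-hat.
    h_hat = sample_median(xs)
    # Solve h^theta = h-hat, where h^theta = ln(2) / theta for Exp(theta).
    return math.log(2) / h_hat

data = [0.3, 0.9, 1.4, 0.2, 2.5]  # hypothetical sample
theta_hat = match_exponential_rate(data)
print(theta_hat)  # ln(2) / 0.9, about 0.77
```

The same recipe works for any feature with a closed-form expression in $\theta$; when no closed form exists, $h^\theta = \hat{h}$ must be solved numerically.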