Differences

This shows you the differences between two versions of the page.

--- kb:estimation_methods [2022-02-12 18:22] – [Feature matching] jaeyoung
+++ kb:estimation_methods [2024-04-30 04:03] (current) – external edit 127.0.0.1
@@ Line 32: / Line 32: @@
 A feature is a property of a distribution, including but not limited to mean, variance, and median.
-The goal of feature matching is to
+The goal of feature matching is to estimate the parameter(s) of the distribution so that its feature(s) match those of the data.
+For a probability distribution $\mathbb{P}^\theta$ with parameter $\theta$, we can extract feature(s) $h^\theta = g(\mathbb{P}^\theta)$. We can compute the same feature(s) for the empirical distribution, $\hat{h} = g(\hat{\mathbb{P}})$. The estimate of $\theta$ is then the solution of $h^\theta = \hat{h}$.
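+As a small worked example (an illustration chosen here, not taken from the page), assume an exponential distribution with rate $\theta$ and use the mean as the feature. Then $h^\theta = 1/\theta$ and $\hat{h} = \bar{X} = \frac{1}{n} \sum_{i=1}^n X_i$, so matching the two gives:
+$$ \frac{1}{\theta} = \bar{X} \quad \Longrightarrow \quad \hat{\theta} = \frac{1}{\bar{X}} $$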
+==== Method of moments ====
+Moments of distributions are commonly used as features for feature matching. The $k$-th moment of a random variable $X$ is $\mathbb{E}[X^k]$.
+To estimate the $k$-th moment from empirical data $X_1, ..., X_n$, replace the expectation with the sample average:
+$$ \hat{\mathbb{E}}[X^k] = \frac{1}{n} \sum_{i=1}^n X_i^k $$
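+The moment-matching recipe can be sketched in Python as follows. This is a minimal illustration, assuming the data come from a normal distribution $N(\mu, \sigma^2)$; the function names `empirical_moment` and `normal_method_of_moments`, the seed, and the simulated sample are all hypothetical choices made for this example.

```python
import random


def empirical_moment(xs, k):
    """k-th empirical moment: (1/n) * sum of x_i ** k."""
    return sum(x ** k for x in xs) / len(xs)


def normal_method_of_moments(xs):
    """Match the first two moments of N(mu, sigma^2) to the data.

    E[X] = mu and E[X^2] = mu^2 + sigma^2, so solving gives
    mu_hat = m1 and sigma2_hat = m2 - m1^2.
    """
    m1 = empirical_moment(xs, 1)
    m2 = empirical_moment(xs, 2)
    return m1, m2 - m1 ** 2


# Simulated data (assumption for illustration): mu = 5, sigma = 2.
rng = random.Random(0)
data = [rng.gauss(5.0, 2.0) for _ in range(10_000)]

mu_hat, sigma2_hat = normal_method_of_moments(data)
print(mu_hat, sigma2_hat)
```

+With two unknown parameters, two moments are needed; in general, a model with $d$ parameters calls for (at least) $d$ matched features.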
+===== Maximum likelihood estimator =====
+Assume a probability mass function (PMF) or probability density function (PDF) with parameter(s) $\theta$. Given a set of independent data points $ X = (X_1, ..., X_n) $, the likelihood function is the product of the PMF values at all of the points for a discrete distribution, or the product of the PDF values at all of the points for a continuous distribution. The maximum likelihood estimator is the value of $\theta$ that maximizes this function.
+Discrete (PMF):
+$$ L^\theta(x_1, ..., x_n) = \prod_{i=1}^{n} \mathbb{P}^\theta (X_i = x_i) $$
+Continuous (PDF):
+$$ L^\theta(x_1, ..., x_n) = \prod_{i=1}^{n} f_{X_i}^\theta (x_i) $$
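+The discrete case can be sketched in Python as follows. This is a minimal illustration, assuming independent Bernoulli($p$) observations; the sample, the grid of candidate values, and the name `bernoulli_likelihood` are hypothetical choices made for this example.

```python
import math


def bernoulli_likelihood(p, xs):
    """Product of PMF values P(X_i = x_i) for independent Bernoulli(p) draws."""
    return math.prod(p if x == 1 else 1.0 - p for x in xs)


# Made-up sample (assumption for illustration): 5 ones out of 8 draws.
data = [1, 0, 1, 1, 0, 1, 1, 0]

# Evaluate the likelihood on a grid of candidate p values in (0, 1)
# and keep the candidate with the largest likelihood.
grid = [i / 1000 for i in range(1, 1000)]
p_best = max(grid, key=lambda p: bernoulli_likelihood(p, data))
print(p_best)
```

+For this sample the grid maximizer coincides with the sample proportion of ones, $5/8$, which is the value obtained by maximizing $p^5 (1-p)^3$ analytically.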
+==== Log-likelihood ====
+It is usually easier to maximize the log of the likelihood function, known as the log-likelihood function, because the product becomes a sum. Since the logarithm is strictly increasing, the log-likelihood is maximized by the same $\theta$ as the likelihood.
+Discrete (PMF):
+$$ \max_\theta \sum_{i=1}^n \log \mathbb{P}^\theta (X_i = x_i) $$
+Continuous (PDF):
+$$ \max_\theta \sum_{i=1}^n \log f_{X_i}^\theta (x_i) $$
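+The continuous case can be sketched in Python as follows. This is a minimal illustration, assuming an exponential model with rate $\theta$, whose log-density is $\log\theta - \theta x$; the simulated data, the grid, and the function names are hypothetical choices made for this example. It also shows a practical reason to prefer the log-likelihood: for large $n$ the plain product of densities underflows in floating point.

```python
import math
import random


def exp_log_likelihood(rate, xs):
    """Sum of log PDF values for Exponential(rate): log f(x) = log(rate) - rate * x."""
    return sum(math.log(rate) - rate * x for x in xs)


def exp_likelihood(rate, xs):
    """Plain product of PDF values, included only to illustrate underflow."""
    return math.prod(rate * math.exp(-rate * x) for x in xs)


# Simulated data (assumption for illustration): true rate = 2.
rng = random.Random(1)
data = [rng.expovariate(2.0) for _ in range(5_000)]

# Closed-form maximizer: d/d(rate) [n*log(rate) - rate*sum(x)] = 0
# gives rate = n / sum(x).
rate_mle = len(data) / sum(data)

# Numerical check: maximize the log-likelihood over a grid of candidate rates.
grid = [i / 100 for i in range(1, 501)]
rate_grid = max(grid, key=lambda r: exp_log_likelihood(r, data))

print(rate_mle, rate_grid)
print(exp_likelihood(rate_mle, data))  # the raw product underflows for this n
```

+The grid search is used here only as a check on the closed-form answer; in practice a numerical optimizer would normally replace it when no closed form is available.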