Differences
This shows you the differences between two versions of the page.
kb:estimation_methods [2022-02-12 18:22] – [Feature matching] jaeyoung | kb:estimation_methods [2024-04-30 04:03] (current) – external edit 127.0.0.1 | ||
---|---|---|---|
Line 32: | Line 32: | ||
A feature is a property of a distribution, | A feature is a property of a distribution, | ||
- | The goal of feature matching is to | + | The goal of feature matching is to make an estimate for the parameter(s) of the distribution so that the feature(s) of the distribution match the features of the data. |
+ | For a given probability distribution $\mathbb{P}$ with parameter $\theta$, we can extract feature(s) $h^\theta = g(\mathbb{P}^\theta)$. We can also calculate the features for the empirical distribution $\hat{h} = g(\hat{\mathbb{P}})$. Then solve for $\theta$ by setting $h^\theta = \hat{h}$. | ||
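As a minimal sketch of the recipe above, take the median as the feature $g$ and an exponential distribution with rate $\theta$ as the model. Both choices, along with the function name and the sample values, are assumptions made for illustration and do not come from the page. The model median is $\ln(2)/\theta$, so setting it equal to the sample median gives $\hat{\theta}$ directly.

```python
import math
import statistics


def fit_exponential_rate_by_median(samples):
    """Feature matching with the median as the feature (illustrative choice).

    For an exponential distribution with rate theta, the median is
    ln(2) / theta. Setting that equal to the sample median and solving
    for theta gives the estimate returned here.
    """
    sample_median = statistics.median(samples)  # feature of the empirical distribution
    if sample_median <= 0:
        raise ValueError("sample median must be positive for an exponential model")
    return math.log(2) / sample_median


# Hypothetical data, used only to exercise the function.
data = [0.5, 1.0, 2.0, 4.0, 8.0]
rate_hat = fit_exponential_rate_by_median(data)

# By construction, the fitted model's median matches the sample median.
model_median = math.log(2) / rate_hat
```

Any other feature with a known closed form in $\theta$ (a quantile, a moment) could be substituted for the median in the same way.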
+ | |||
+ | ==== Method of moments ==== | ||
+ | |||
+ | Moments of distributions are commonly used as features for feature matching. The $k$-th moment of a random variable $X$ is $\mathbb{E}[X^k]$. | ||
+ | |||
+ | To estimate the $k$-th moment from empirical data $X_1, \dots, X_n$, replace the expectation with the sample average: | ||
+ | |||
+ | $$ \hat{\mathbb{E}}[X^k] = \frac{1}{n} \sum_{i=1}^n X_i^k $$ | ||
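The empirical moment formula translates almost directly into code. The sketch below also shows one possible use: fitting a normal distribution by matching its first two moments, using $\mathbb{E}[X] = \mu$ and $\mathbb{E}[X^2] = \mu^2 + \sigma^2$. The normal model, the function names, and the data are assumptions chosen for illustration.

```python
def empirical_moment(samples, k):
    """k-th empirical moment: the average of X_i ** k over the sample."""
    n = len(samples)
    if n == 0:
        raise ValueError("need at least one sample")
    return sum(x ** k for x in samples) / n


def fit_normal_by_moments(samples):
    """Method of moments for a normal model (an assumed example model).

    Matches E[X] = mu and E[X^2] = mu^2 + sigma^2 to the first two
    empirical moments and solves for (mu, sigma^2).
    """
    m1 = empirical_moment(samples, 1)
    m2 = empirical_moment(samples, 2)
    return m1, m2 - m1 ** 2


# Hypothetical data, used only to exercise the functions.
data = [1.0, 2.0, 3.0, 4.0, 5.0]
mu_hat, var_hat = fit_normal_by_moments(data)
```

Note that the variance estimate obtained this way divides by $n$, not $n - 1$, so it differs from the usual unbiased sample variance.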
+ | ===== Maximum likelihood estimator ===== | ||
+ | |||
+ | Assume a probability mass function (PMF) or probability density function (PDF) with parameter(s) $\theta$. Given a set of data points $ X = (X_1, ..., X_n) $, the likelihood function is the product of the PMFs of all of the points for a discrete distribution, | ||
+ | |||
+ | Discrete (PMF): | ||
+ | |||
+ | $$ L^\theta(x_1, \dots, x_n) = \prod_{i=1}^n \mathbb{P}^\theta (X_i = x_i) $$ | ||
+ | |||
+ | Continuous (PDF): | ||
+ | |||
+ | $$ L^\theta(x_1, \dots, x_n) = \prod_{i=1}^n f_{X_i}^\theta (x_i) $$ | ||
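To make the product form concrete, here is a minimal sketch of the discrete case for a Bernoulli model with success probability $\theta$. The Bernoulli model, the function names, and the observations are assumptions chosen for illustration. The likelihood multiplies the PMF value of each observed point, so it can be evaluated at several candidate values of $\theta$ and compared.

```python
import math


def bernoulli_pmf(x, theta):
    """P(X = x) for a Bernoulli(theta) variable, with x in {0, 1}."""
    if x not in (0, 1):
        raise ValueError("Bernoulli outcomes must be 0 or 1")
    return theta if x == 1 else 1.0 - theta


def bernoulli_likelihood(theta, samples):
    """Likelihood of theta: the product of the PMF over all data points."""
    return math.prod(bernoulli_pmf(x, theta) for x in samples)


# Hypothetical observations: three successes and one failure.
data = [1, 0, 1, 1]
likelihood_at_half = bernoulli_likelihood(0.5, data)
likelihood_at_three_quarters = bernoulli_likelihood(0.75, data)
```

With three successes in four trials, $\theta = 0.75$ gives a larger likelihood than $\theta = 0.5$, which is the kind of comparison a maximum likelihood estimator formalizes.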
+ | |||
+ | ==== Log-likelihood ==== | ||
+ | |||
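+ | It is usually easier to maximize the log of the likelihood function, known as the log-likelihood function, because the log turns the product into a sum. Since the logarithm is strictly increasing, the value of $\theta$ that maximizes the log-likelihood also maximizes the likelihood. | ||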
+ | |||
+ | Discrete (PMF): | ||
+ | |||
+ | $$ \max_\theta \sum_{i=1}^n \log \mathbb{P}^\theta (X_i = x_i) $$ | ||
+ | |||
+ | Continuous (PDF): | ||
+ | |||
+ | $$ \max_\theta \sum_{i=1}^n \log f_{X_i}^\theta (x_i) $$ |
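The continuous case can be sketched for an exponential model with rate $\theta$, where $\log f^\theta(x) = \log\theta - \theta x$. The exponential model, the grid search, the function names, and the data are all assumptions made for illustration. In practice a numerical optimizer or a closed-form solution would usually replace the grid. For this model, setting the derivative of the log-likelihood to zero gives $\hat{\theta} = n / \sum_i x_i$, which the grid result can be checked against.

```python
import math


def exponential_log_likelihood(rate, samples):
    """Sum of log f(x_i) for an exponential density f(x) = rate * exp(-rate * x)."""
    return sum(math.log(rate) - rate * x for x in samples)


def maximize_on_grid(samples, candidates):
    """Return the candidate rate with the highest log-likelihood (brute force)."""
    return max(candidates, key=lambda rate: exponential_log_likelihood(rate, samples))


# Hypothetical data, used only to exercise the functions.
data = [0.5, 1.0, 1.5, 2.0]

# Candidate rates 0.001, 0.002, ..., 5.000.
grid = [i / 1000 for i in range(1, 5001)]
rate_grid = maximize_on_grid(data, grid)

# Closed-form maximizer for the exponential model: n / sum(x_i).
rate_closed_form = len(data) / sum(data)
```

The grid estimate agrees with the closed form up to the grid spacing, which illustrates that maximizing the sum of log-densities recovers the maximum likelihood estimate.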