Wiener filter
Unconstrained Wiener filter
Wiener filtering allows us to estimate the value of one wide-sense stationary (WSS) random process from measurements of another random process that is jointly WSS with it.
Essentially, a Wiener filter is a linear minimum mean square error (LMMSE) estimator that uses the values of one process to estimate the values of the other. This can be written in the form:
$$ \hat{y} [n] = \mu_y + \sum_{j = 0}^{L-1} h[j] \underbrace{(x[n-j] - \mu_x)}_{\tilde{x}[n-j]} $$
In terms of the zero-mean fluctuations $\tilde{x}[n] = x[n] - \mu_x$, this is a convolution of $\tilde{x}$ with $h$, an FIR filter:
$$ \hat{y}[n] = \mu_y + (h \ast \tilde{x})[n] $$
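Choosing $h[\cdot]$ to minimize the mean square error leads to the normal equations
$$ (h \ast C_{xx})[m] = C_{yx}[m] $$
which must hold for $m = 0, \dots, L-1$ for the length-$L$ filter above, and for all $m$ if the sum is allowed to run over every $j$ (so that $x[n]$ for all $n$ can be used). In the latter case, transforming both sides to the frequency domain gives: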
$$ H(e^{j\Omega}) = \frac{D_{yx}(e^{j\Omega})}{D_{xx}(e^{j\Omega})} $$
This is the frequency response of the unconstrained Wiener filter, that is, the filter that may use $x[n]$ for all $n$, past and future.
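For the finite-length case, the normal equations form a Toeplitz linear system that can be solved directly. The following is a minimal sketch, assuming real-valued processes and that the covariances $C_{xx}[k]$ and $C_{yx}[k]$ are already known for lags $k = 0, \dots, L-1$; the function name `fir_wiener` and the example numbers are hypothetical, chosen only for illustration.

```python
import numpy as np

def fir_wiener(c_xx, c_yx):
    """Solve (h * C_xx)[m] = C_yx[m], m = 0..L-1, for a length-L FIR filter.

    c_xx[k] and c_yx[k] hold C_xx[k] and C_yx[k] for lags k = 0..L-1
    (assumed known; real-valued processes, so C_xx is even).
    """
    c_xx = np.asarray(c_xx, dtype=float)
    c_yx = np.asarray(c_yx, dtype=float)
    L = len(c_yx)
    # Toeplitz matrix with A[m, j] = C_xx[m - j] = C_xx[|m - j|]
    lags = np.abs(np.arange(L)[:, None] - np.arange(L)[None, :])
    A = c_xx[lags]
    return np.linalg.solve(A, c_yx)

# Hypothetical example 1: x[n] = y[n] + v[n], with y and v uncorrelated,
# white, and of unit variance, so C_xx[k] = 2*delta[k], C_yx[k] = delta[k].
h_denoise = fir_wiener(c_xx=[2.0, 0.0, 0.0], c_yx=[1.0, 0.0, 0.0])

# Hypothetical example 2: one-step prediction, y[n] = x[n+1], for a process
# with C_xx[k] = 0.5**|k|, so C_yx[k] = C_xx[k+1].
h_predict = fir_wiener(c_xx=[1.0, 0.5], c_yx=[0.5, 0.25])
```

In the first example only the current sample is informative, so all the weight lands on `h[0]`; in the second, the older sample receives zero weight because the current sample already carries everything the covariance structure can offer.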
Mean square error of unconstrained Wiener filter
$$ MMSE = \frac{1}{2\pi} \int_{-\pi}^{\pi} \left(D_{yy}(e^{j\Omega}) - \frac{|D_{yx}(e^{j\Omega})|^2}{D_{xx}(e^{j\Omega})}\right) d\Omega $$
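The frequency response and the error integral can be evaluated numerically once the spectra are sampled on a uniform frequency grid. This is a sketch under stated assumptions: the helper `unconstrained_wiener` is a hypothetical name, and the example (a first-order signal observed in unit-variance white noise) is made up for illustration.

```python
import numpy as np

def unconstrained_wiener(d_yx, d_xx, d_yy):
    """Return (H, mmse) from spectra sampled uniformly over [-pi, pi)."""
    H = d_yx / d_xx
    integrand = d_yy - np.abs(d_yx) ** 2 / d_xx
    # (1 / 2pi) * integral over one period equals the mean over a uniform grid
    return H, float(np.mean(integrand.real))

omega = np.linspace(-np.pi, np.pi, 4096, endpoint=False)

# Hypothetical example: x[n] = y[n] + v[n], with v white (unit variance),
# uncorrelated with y, and D_yy = 1 / |1 - a e^{-j Omega}|^2 with a = 0.5.
a = 0.5
d_yy = 1.0 / np.abs(1.0 - a * np.exp(-1j * omega)) ** 2
d_xx = d_yy + 1.0   # signal spectrum plus noise spectrum
d_yx = d_yy         # noise is uncorrelated with y
H, mmse = unconstrained_wiener(d_yx, d_xx, d_yy)

# Sanity case: white y of unit variance in unit-variance white noise
ones = np.ones_like(omega)
H_white, mmse_white = unconstrained_wiener(ones, 2.0 * ones, ones)
```

In the white sanity case the filter reduces to a constant gain of one half, and the error is half the signal variance.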
Causal Wiener filter
A causal Wiener filter estimates $y[n]$ using only the present and past values of a related process $x[\cdot]$. With a suitable choice of $y[\cdot]$, this includes predicting future values of a random process.
To do this, we can create a model for $x[\cdot]$ that states that it is a filtered version of a white random process:
$$ x[n] = (f \ast w)[n] $$
Here, $w[\cdot]$ is a white random process with unit intensity, and $f[\cdot]$ is the unit sample response of a stable, causal system whose inverse is also stable and causal.
Given this model, we know that:
$$ R_{xx} = f \ast \overleftarrow{f} $$
(where $\overleftarrow{f}$ is the time-reversed version of $f$)
The PSD of $x$ can be written as:
$$ S_{xx}(z) = F(z)F(z^{-1}) $$
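A quick numerical check of these two relations, as a sketch with a hypothetical two-tap $f$ (chosen so that its zero lies inside the unit circle, which makes both the system and its inverse stable and causal):

```python
import numpy as np

f = np.array([1.0, 0.5])             # hypothetical f[0], f[1]; zero at z = -0.5
r_xx = np.convolve(f, f[::-1])       # R_xx[m] at lags m = -1, 0, 1
lags = np.arange(-(len(f) - 1), len(f))

omega = np.linspace(-np.pi, np.pi, 256, endpoint=False)

# S_xx on the unit circle, computed directly from R_xx
s_from_r = (r_xx[None, :]
            * np.exp(-1j * omega[:, None] * lags[None, :])).sum(axis=1).real

# |F(e^{j Omega})|^2, which equals F(z) F(1/z) on the unit circle for real f
n = np.arange(len(f))
F = (f[None, :] * np.exp(-1j * omega[:, None] * n[None, :])).sum(axis=1)
s_from_f = np.abs(F) ** 2
```

Both routes give the same spectrum, $1.25 + \cos\Omega$ for this choice of $f$.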
The transfer function of the causal Wiener filter is:
$$ H(z) = \frac{1}{F(z)} \left[ \frac{D_{yx}(z)}{F(z^{-1})} \right]_+ $$
where the subscript $+$ denotes that only the causal part of the bracketed term is kept. In other words, when the term inside the brackets is expanded as a series in powers of $z$, the terms with positive powers of $z$ (which correspond to negative time indices) are discarded.
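In the special case of using past values of a process $x[n]$ to predict a future value $x[n+m]$, we have $y[n] = x[n+m]$, so (for zero-mean $x$) $D_{yx}(z) = z^m S_{xx}(z) = z^m F(z) F(z^{-1})$ and: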
$$ H(z) = \frac{1}{F(z)} \left[ \frac{z^m F(z) F(z^{-1})}{F(z^{-1})} \right]_+ = \frac{1}{F(z)} \left[ z^m F(z) \right]_+ $$
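As a worked example, assume (purely for illustration; this model is not taken from the text above) a first-order model with $|a| < 1$ and a prediction horizon $m \ge 0$:

$$ F(z) = \frac{1}{1 - a z^{-1}}, \qquad f[n] = a^n u[n] $$

Multiplying by $z^m$ advances $f$ by $m$ samples, and keeping only the causal part leaves $a^{n+m} u[n] = a^m f[n]$:

$$ \left[ z^m F(z) \right]_+ = a^m F(z) \quad \Rightarrow \quad H(z) = \frac{1}{F(z)} \, a^m F(z) = a^m $$

Under this assumed model the predictor is simply $\hat{x}[n+m] = a^m x[n]$.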
MMSE of causal Wiener filter
$$ MMSE = \frac{1}{2\pi} \int_{-\pi}^{\pi} \left( D_{yy}(e^{j\Omega}) - H(e^{j\Omega})D_{xy}(e^{j\Omega}) - H(e^{-j\Omega})D_{yx}(e^{j\Omega}) + |H(e^{j\Omega})|^2 D_{xx}(e^{j\Omega}) \right) d\Omega $$
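The integral can be checked numerically for a case with a known answer. The sketch below assumes a hypothetical first-order model $F(z) = 1/(1 - a z^{-1})$ driven by unit-intensity white noise, with $y[n] = x[n+m]$ and the constant-gain predictor $H(z) = a^m$ that is optimal for that model; the helper name `filter_mse` is likewise an assumption. For this setup the closed-form error is $(1 - a^{2m})/(1 - a^2)$.

```python
import numpy as np

def filter_mse(H, d_yy, d_yx, d_xx):
    """Evaluate the MSE integral on a uniform grid over [-pi, pi).

    Assumes real-valued processes and a real h[n], so that
    D_xy = conj(D_yx) and H(e^{-j Omega}) = conj(H(e^{j Omega})).
    """
    integrand = (d_yy
                 - H * np.conj(d_yx)
                 - np.conj(H) * d_yx
                 + np.abs(H) ** 2 * d_xx)
    # (1 / 2pi) * integral over one period equals the mean over a uniform grid
    return float(np.mean(integrand.real))

a, m = 0.5, 2                                   # hypothetical model parameters
omega = np.linspace(-np.pi, np.pi, 4096, endpoint=False)

d_xx = 1.0 / np.abs(1.0 - a * np.exp(-1j * omega)) ** 2
d_yy = d_xx                                     # y is a shifted copy of x
d_yx = np.exp(1j * omega * m) * d_xx            # D_yx(z) = z^m D_xx(z)
H = np.full_like(omega, a ** m)                 # x_hat[n+m] = a^m x[n]

mse = filter_mse(H, d_yy, d_yx, d_xx)
closed_form = (1.0 - a ** (2 * m)) / (1.0 - a ** 2)
```

The same function accepts any candidate $H$, so it can also be used to compare a suboptimal filter against the optimum.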