
The mean squared error of a predictor at $\mathbf{x}$ based on the stochastic Gaussian process is

$$MSE(\mathbf{x})=E[(\hat{y}(\mathbf{x})-y(\mathbf{x}))^2] =\sigma^2 \big[1 - {\mathbf{r}(\mathbf{x})}^{T}\mathbf{R}^{-1}\mathbf{r}(\mathbf{x})+\frac{(1-\mathbf{1}^T\mathbf{R}^{-1}\mathbf{r}(\mathbf{x}))^2}{\mathbf{1}^T\mathbf{R}^{-1}\mathbf{1}}\big]\;, \label{eq:final-mse}$$

where $\mu$ is the process's mean and $\sigma^2\mathbf{R}$ is its covariance matrix over a sample $\mathcal{D}=\{(\mathbf{x}^{(i)},y^{(i)})\}_{1\leq i \leq n}$. For brevity, we drop the argument $\mathbf{x}$ henceforth. Before proceeding with the proof, let us recall some definitions that will be useful: $$\hy=\hyexp$$

$$\hmu=\hmuexp$$

From the above, we have $E[y^2]=\sigma^2 + \mu^2$ and, since $E[\mathbf{y}\mathbf{y}^{T}]=\mathrm{Cov}(\mathbf{y})+E[\mathbf{y}]E[\mathbf{y}]^{T}$, also $E[\mathbf{y}\mathbf{y}^{T}]=\sigma^2\mathbf{R}+\mu^2\mathbf{1}\mathbf{1}^T$.

Thus, we can expand the MSE term as

$$MSE= \sigma^2 + \mu^2 + E[{\hy}^2] - 2 E[y\hy]\;, \label{eq:mse}$$ where $$E[\hy^2]=\frac{\st}{\oRo}+\mt + \st (\rRr) - \st \frac{(\oRr)^2}{\oRo} \label{eq:h2}$$ and

$$-2E[y\hy]=-2\st(\rRr)-2\mt-2\st\frac{\oRr}{\oRo}+2\st\frac{(\oRr)^2}{\oRo}\;. \label{eq:h3}$$

Plugging Equations \ref{eq:h2} and \ref{eq:h3} into Eq. \ref{eq:mse}, the $\mt$ terms cancel and we are left with $$MSE=\st\Big[1-\rRr+\frac{1-2(\oRr)+(\oRr)^2}{\oRo}\Big] =\st\Big[1-\rRr+\frac{(1-\oRr)^2}{\oRo}\Big]\;,$$ which is exactly Eq. \ref{eq:final-mse}.
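As a sanity check, the algebraic identity above can be verified numerically. The sketch below (using NumPy; $\mathbf{R}$ and $\mathbf{r}$ are arbitrary test values, not a fitted model, and all variable names are our own) sums the expanded terms of Eqs. \ref{eq:h2} and \ref{eq:h3} and compares the result with the closed form of Eq. \ref{eq:final-mse}:

```python
import numpy as np

# Arbitrary test values for the identity check (not a fitted kriging model).
rng = np.random.default_rng(0)
n = 6
A = rng.standard_normal((n, n))
R = A @ A.T + n * np.eye(n)      # symmetric positive-definite stand-in for R
r = rng.standard_normal(n)
sigma2, mu2 = 1.7, 0.9           # sigma^2 and mu^2

Ri = np.linalg.inv(R)
one = np.ones(n)
rRr = r @ Ri @ r                 # r^T R^-1 r
oRr = one @ Ri @ r               # 1^T R^-1 r
oRo = one @ Ri @ one             # 1^T R^-1 1

# E[yhat^2] and -2 E[y yhat], term by term as in Eqs. (h2) and (h3)
Eh2 = sigma2 / oRo + mu2 + sigma2 * rRr - sigma2 * oRr**2 / oRo
Eh3 = -2 * sigma2 * rRr - 2 * mu2 - 2 * sigma2 * oRr / oRo \
      + 2 * sigma2 * oRr**2 / oRo

mse_expanded = sigma2 + mu2 + Eh2 + Eh3                       # Eq. (mse)
mse_closed = sigma2 * (1 - rRr + (1 - oRr) ** 2 / oRo)        # Eq. (final-mse)

assert np.isclose(mse_expanded, mse_closed)
```

The $\mu^2$ terms cancel in the sum, so the agreement holds for any choice of $\mu$, $\sigma^2$, and positive-definite $\mathbf{R}$.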