# probability theory – Conditional Variance Exercise

This is a simple exercise on conditional variance whose proof I'm trying to understand.

Let $$X$$ and $$Y$$ be two real-valued random variables such that $$Y$$ is square-integrable. We call the random variable $$\mathbf{E}\left((Y-\mathbf{E}(Y \mid X))^{2} \mid X\right)$$ the conditional variance of $$Y$$ given $$X$$, denoted by $$\operatorname{Var}(Y \mid X)$$. Show that for all Borel measurable $$f: \mathbf{R} \rightarrow \mathbf{R}$$ such that $$f(X)$$ is square-integrable,
$$\mathbf{E}\left((Y-f(X))^{2}\right)=\mathbf{E}(\operatorname{Var}(Y \mid X))+\mathbf{E}\left((\mathbf{E}(Y \mid X)-f(X))^{2}\right).$$
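As a sanity check, the identity can be verified exactly on a small discrete distribution, where conditional expectations are just weighted averages. The joint law and the function $$f$$ below are arbitrary choices for illustration:

```python
# Verify E[(Y-f(X))^2] = E[Var(Y|X)] + E[(E(Y|X)-f(X))^2]
# on a toy discrete joint distribution (values are illustrative).
joint = {(0, 1): 0.2, (0, 3): 0.3, (1, 0): 0.1, (1, 4): 0.4}  # P(X=x, Y=y)

def E(g):
    """Expectation of g(x, y) under the joint law."""
    return sum(p * g(x, y) for (x, y), p in joint.items())

# Marginal of X and the conditional expectation E(Y | X = x).
pX = {}
for (x, _), p in joint.items():
    pX[x] = pX.get(x, 0.0) + p
condE = {x: sum(p * y for (xx, y), p in joint.items() if xx == x) / pX[x]
         for x in pX}

f = lambda x: 2 * x + 1  # any Borel function of X works here

lhs = E(lambda x, y: (y - f(x)) ** 2)
var_term = E(lambda x, y: (y - condE[x]) ** 2)      # = E(Var(Y|X))
bias_term = E(lambda x, y: (condE[x] - f(x)) ** 2)  # = E((E(Y|X)-f(X))^2)

assert abs(lhs - (var_term + bias_term)) < 1e-12
```

Note that `var_term` computes $$\mathbf{E}\left((Y-\mathbf{E}(Y \mid X))^{2}\right)$$, which equals $$\mathbf{E}(\operatorname{Var}(Y \mid X))$$ by the tower property.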

Here is the solution, where I seem to be missing something simple.

We have
$$\begin{aligned} \mathbf{E}\left((Y-f(X))^{2}\right) &=\mathbf{E}\left(((Y-\mathbf{E}(Y \mid X))+(\mathbf{E}(Y \mid X)-f(X)))^{2}\right) \\ &=\mathbf{E}\left((Y-\mathbf{E}(Y \mid X))^{2}\right)+2\,\mathbf{E}((Y-\mathbf{E}(Y \mid X))(\mathbf{E}(Y \mid X)-f(X))) \\ &\quad+\mathbf{E}\left((\mathbf{E}(Y \mid X)-f(X))^{2}\right). \end{aligned}$$
The first term in the last line equals $$\mathbf{E}(\operatorname{Var}(Y \mid X))$$, while the second term vanishes since $$\mathbf{E}(Y \mid X)-f(X)$$ is $$\sigma(X)$$-measurable. The proof is complete.

Okay, but what does $$\mathbf{E}(Y \mid X)-f(X)$$ being $$\sigma(X)$$-measurable have to do with anything? Why does it imply that $$2\,\mathbf{E}((Y-\mathbf{E}(Y \mid X))(\mathbf{E}(Y \mid X)-f(X))) = 0$$?
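For context, the step such solutions typically invoke (without saying so) is the combination of the tower property with "pulling out known factors": a $$\sigma(X)$$-measurable factor can be moved outside a conditional expectation given $$X$$. A sketch, writing $$Z := \mathbf{E}(Y \mid X)-f(X)$$:

```latex
\begin{aligned}
\mathbf{E}\big((Y-\mathbf{E}(Y \mid X))\,Z\big)
  &= \mathbf{E}\big(\mathbf{E}\big((Y-\mathbf{E}(Y \mid X))\,Z \,\big|\, X\big)\big)
     && \text{tower property} \\
  &= \mathbf{E}\big(Z\,\mathbf{E}\big(Y-\mathbf{E}(Y \mid X) \,\big|\, X\big)\big)
     && \text{$Z$ is $\sigma(X)$-measurable, so it pulls out} \\
  &= \mathbf{E}\big(Z\,(\mathbf{E}(Y \mid X)-\mathbf{E}(Y \mid X))\big) \\
  &= 0.
\end{aligned}
```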