It is often useful in measure-theoretic arguments to first prove a statement for indicator functions and then extend it by approximation. We will proceed in exactly that way.
Let $(\Omega,\mathcal F,\mathbb P)$ be a probability space, let $\mathcal X, \mathcal Y \subset \mathcal F$ be independent sub-$\sigma$-algebras, let $X:\Omega \to \mathbb R$ be $\mathcal X$-measurable, and let $\Psi: \mathbb R \times \Omega \to \mathbb R$ be $\mathcal B(\mathbb R) \otimes \mathcal Y$-measurable.
Assume first that $\Psi$ has the special product form $\Psi(x,\omega) = f(x)\eta(\omega)$, where $f:\mathbb R \to \mathbb R$ is Borel measurable and $\eta:\Omega \to \mathbb R$ is $\mathcal Y$-measurable (with enough integrability so that the expectations below make sense). Then one can prove the equality:
$$ \mathbb E[f(X)\eta | \mathcal X] = H(X) = \mathbb E[f(X)\eta | X] $$ where $H(x) = \mathbb E[f(x)\eta]$.
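For intuition, here is the simplest instance of this (my own illustration, not part of the original problem): take $f(x) = x$ and write $m = \mathbb E[\eta]$ (the symbol $m$ is just my notation). Then $H(x) = xm$ and the claim reads
$$ \mathbb E[X\eta \,|\, X] = X\,\mathbb E[\eta], $$
i.e. the familiar rule that the independent factor is integrated out while the $X$-measurable factor is pulled out of the conditional expectation.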
Indeed, for the first equality, take any $A \in \mathcal X$; since $H(X)$ is $\sigma(X)$-measurable and hence $\mathcal X$-measurable, we only have to check that $$\mathbb E[1_A f(X) \eta ] = \mathbb E[1_A H(X)] $$ holds. Since $H(x) = \mathbb E[f(x)\eta] = \int_{\mathbb R} f(x)y d\mu_{\eta}(y)$, we have $$ \mathbb E[1_AH(X)]= \int_{\mathbb R^2} H(x)z d\mu_{(X,1_A)}(x,z) = \int_{\mathbb R^2}\int_{\mathbb R} f(x)yz d\mu_{\eta}(y)d\mu_{(X,1_A)}(x,z)$$
Moreover, the random vector $(X,1_A)$ is independent of $\eta$ (since $\eta$ is $\mathcal Y$-measurable, $(X,1_A)$ is $\mathcal X$-measurable, and $\mathcal X,\mathcal Y$ are independent), so the law of $(X,\eta,1_A)$ factors as the product of the laws of $(X,1_A)$ and $\eta$, and by Fubini's theorem we get:
$$ \mathbb E[1_Af(X)\eta] = \int_{\mathbb R^3} f(x)yz d\mu_{(X,\eta,1_A)}(x,y,z) = \int_{\mathbb R^2} \int_{\mathbb R} f(x)yz d\mu_{\eta}(y)d\mu_{(X,1_A)}(x,z) $$ Since $A \in \mathcal X$ was arbitrary, we conclude $\mathbb E[f(X)\eta | \mathcal X] = H(X)$. To get the second equality, note that $H(X)=\mathbb E[H(X)|X] = \mathbb E[\mathbb E[f(X)\eta | \mathcal X] | X] = \mathbb E[f(X)\eta | X]$ by the tower property (which applies since $\sigma(X) \subset \mathcal X$).
We now want to prove the same for a general $\Psi$. Our goal is to show:
$$ \mathbb E[\Psi(X,\cdot) | \mathcal X] = H(X) = \mathbb E[\Psi(X,\cdot)|X]$$ where again $H(x) = \mathbb E[\Psi(x,\cdot)]$ (which equals $\int_{\Omega}\Psi(x,\omega)d\mathbb P(\omega)$).
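As an example of what this buys us (again my own illustration; $F_\eta$ denotes the distribution function of $\eta$ and is my notation): take $\Psi(x,\omega) = 1_{\{\eta(\omega) \le x\}}$ for some $\mathcal Y$-measurable $\eta$. Then $H(x) = \mathbb P(\eta \le x) = F_\eta(x)$, and the claimed identity becomes
$$ \mathbb P(\eta \le X \,|\, X) = F_\eta(X), $$
the standard fact that, conditionally on $X$, we may treat $X$ as a constant when integrating out the independent variable $\eta$.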
Take any $D \in \mathcal X$. We start with $\Psi = 1_C$, where $C \in \mathcal B(\mathbb R) \otimes \mathcal Y$. Recall that we have already shown the result for $\Psi = f\eta$ with $f$ Borel and $\eta$ $\mathcal Y$-measurable; looking at the part $\mathbb E[\Psi(X,\cdot)|\mathcal X] = H(X)$, it gives the equality
$$ \mathbb E[1_D f(X)\eta ] = \mathbb E[1_D H(X)] $$ which can be rewritten as:
$$ \int_D f(X(\omega))\eta(\omega)d\mathbb P(\omega) = \int_D \int_\Omega f(X(\omega))\eta(\omega') d\mathbb P(\omega')d\mathbb P(\omega)$$
The point is that this already suffices, via a Dynkin class argument. Let $\mathcal D$ be the collection of all $C \in \mathcal B(\mathbb R) \otimes \mathcal Y$ such that for every $D \in \mathcal X$
$$ \int_D 1_C(X(\omega),\omega) d\mathbb P(\omega) = \int_D \int_{\Omega} 1_C(X(\omega),\omega')d\mathbb P(\omega')d\mathbb P(\omega) = \int_D H(X(\omega)) d\mathbb P(\omega)$$
where, as before, $H(x) = \int_\Omega 1_C(x,\omega')d\mathbb P(\omega')$ (which is Borel measurable in $x$ by Fubini's theorem, so $H(X)$ is indeed $\sigma(X)$-measurable). Every rectangle $C = B \times E$ with $B \in \mathcal B(\mathbb R)$, $E \in \mathcal Y$ belongs to $\mathcal D$: indeed $1_{B \times E}(x,\omega) = 1_B(x)1_E(\omega)$, so this is exactly the already proved case with $f = 1_B$ and $\eta = 1_E$. Moreover $\mathcal D$ is a $\lambda$-system: it contains $\mathbb R \times \Omega$ (take $f \equiv 1$, $\eta \equiv 1$), it is closed under proper differences by linearity of the integrals, and it is closed under increasing limits $C_n \uparrow C$ by the Lebesgue dominated convergence theorem (note that all the functions involved are bounded by $1$ and we integrate with respect to probability measures). Since the rectangles form a $\pi$-system generating $\mathcal B(\mathbb R) \otimes \mathcal Y$, Dynkin's $\pi$-$\lambda$ theorem gives $\mathcal D = \mathcal B(\mathbb R) \otimes \mathcal Y$. Since $D \in \mathcal X$ was arbitrary, we get the equality (important: so far only for $\Psi = 1_C$, where $C$ is any set in $\mathcal B(\mathbb R) \otimes \mathcal Y$) $$ \mathbb E[\Psi(X,\cdot) | \mathcal X] = H(X)$$
And finally we can use the method mentioned at the beginning. Since the equality holds for every indicator function $\Psi = 1_C$, it holds (by linearity of expectation and of conditional expectation) for every simple function $\Psi$, i.e. every finite linear combination of such indicators. Any non-negative bounded $\mathcal B(\mathbb R) \otimes \mathcal Y$-measurable function $\Psi$ is the pointwise limit of a non-decreasing sequence $(\Psi_n)$ of simple functions, so by the monotone convergence theorem (which holds both for expectation and for conditional expectation) the result extends to every non-negative bounded measurable $\Psi$. Finally, an arbitrary bounded measurable $\Psi$ (measurable meaning $\mathcal B(\mathbb R) \otimes \mathcal Y$-measurable) can be written as $\Psi = \Psi^+ - \Psi^-$ with $\Psi^+,\Psi^-$ non-negative, bounded and measurable, so again by linearity the result holds for every bounded measurable $\Psi$.
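To spell out the linearity step once (the symbols $c_k$, $C_k$, $H_k$ are my notation): for a simple function $\Psi = \sum_{k=1}^n c_k 1_{C_k}$ with $C_k \in \mathcal B(\mathbb R) \otimes \mathcal Y$ we have
$$ \mathbb E\Big[\textstyle\sum_{k=1}^n c_k 1_{C_k}(X,\cdot) \,\Big|\, \mathcal X\Big] = \sum_{k=1}^n c_k\, \mathbb E[1_{C_k}(X,\cdot) \,|\, \mathcal X] = \sum_{k=1}^n c_k H_k(X) = H(X), $$
where $H_k(x) = \mathbb P(\{\omega : (x,\omega) \in C_k\})$ and $H(x) = \mathbb E[\Psi(x,\cdot)] = \sum_k c_k H_k(x)$.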
We showed that $$ \mathbb E[\Psi(X,\cdot)|\mathcal X] = H(X)$$ for every bounded $\mathcal B(\mathbb R) \otimes \mathcal Y$-measurable $\Psi$; to get the second claimed equality we just condition on $X$, as above:
$$ H(X) = \mathbb E[H(X)|X] = \mathbb E[\mathbb E[\Psi(X,\cdot)|\mathcal X]|X] = \mathbb E[\Psi(X,\cdot)|X]$$ by the tower property. Hence the result follows.
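This is not part of the proof, but if you want a quick numerical sanity check of the final identity $\mathbb E[\Psi(X,\cdot)\,|\,X] = H(X)$, here is a minimal simulation sketch. All the concrete choices below ($X \sim \mathcal N(0,1)$, $\eta \sim \mathrm{Exp}(1)$, $\Psi(x,\omega) = 1_{\{\eta(\omega) \le x\}}$, and the crude binning estimate of the conditional expectation) are my own assumptions, for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1_000_000

# Model the independence of the two sigma-algebras by drawing X and eta independently:
# X generates the "X side", eta is Y-measurable.
X = rng.normal(size=n)           # X ~ N(0, 1)
eta = rng.exponential(size=n)    # eta ~ Exp(1)

# Psi(x, omega) = 1{eta(omega) <= x}, so H(x) = E[Psi(x, .)] = P(eta <= x) = 1 - exp(-x) for x >= 0.
def H(x):
    return np.clip(1.0 - np.exp(-x), 0.0, 1.0)

psi = (eta <= X).astype(float)   # the random variable Psi(X(omega), omega)

# Estimate E[Psi(X, .) | X] by averaging psi over narrow bins of X and compare with
# H at the bin centres; the claim is that the two columns agree (up to Monte Carlo error).
bins = np.linspace(-1.0, 3.0, 21)
idx = np.digitize(X, bins)
for k in range(1, len(bins)):
    mask = idx == k
    if mask.sum() > 500:
        centre = 0.5 * (bins[k - 1] + bins[k])
        print(f"x = {centre:+.1f}   E[Psi | X in bin] ~ {psi[mask].mean():.3f}   H(x) = {H(centre):.3f}")
```

The binning is of course only an approximation of conditioning on $X$; narrower bins and more samples bring the two columns closer together.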