I am learning about the cross entropy, defined by Wikipedia as $$H(P,Q)=-\text{E}_P[\log Q]$$ for distributions $P,Q$.
I'm not happy with that notation, for three reasons: it suggests symmetry in $P$ and $Q$; $H(X,Y)$ is often used for the joint entropy; and lastly, I want a notation which is consistent with the notation for entropy: $$H(X)=-\text{E}_P[\log P(X)]$$
When dealing with multiple distributions, I like to write $H_P(X)$ so it's clear with respect to which distribution I'm taking the entropy. When dealing with multiple random variables, I think it's sensible to make precise the random variable with respect to which the expectation is taken by using the subscript $_{X\sim P}$. My notation for entropy thus becomes $$H_{X\sim P}(X)=-\text{E}_{X\sim P}[\log P(X)]$$
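To make this concrete with a small example of my own: for a fair coin $X\sim P=\text{Bernoulli}(\tfrac{1}{2})$, this notation gives $$H_{X\sim P}(X)=-\text{E}_{X\sim P}[\log P(X)]=-\tfrac{1}{2}\log\tfrac{1}{2}-\tfrac{1}{2}\log\tfrac{1}{2}=\log 2$$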
Now comes the point I don't understand about the definition of cross entropy: why doesn't it reference a random variable $X$? By reasoning analogous to the above, I would assume that cross entropy has the form \begin{equation}H_{X\sim P}(Q(X))=-\text{E}_{X\sim P}[\log Q(X)]\tag{1}\end{equation} However, Wikipedia makes no mention of any such random variable $X$ in the article on cross entropy. It speaks of
> the cross-entropy between two probability distributions $p$ and $q$
which, like the notation $H(P,Q)$, suggests a function whose argument is a pair of distributions, whereas entropy $H(X)$ is said to be a function of a random variable. In any case, to take an expected value I need (a function of) a random variable, which $P$ and $Q$ are not.
Comparing the definitions for the discrete case, $$H(p,q)=-\sum_{x\in\mathcal{X}}p(x)\log q(x)$$ and $$H(X)=-\sum_{i=1}^n P(x_i)\log P(x_i),$$ where $\mathcal{X}$ is the common support of $P$ and $Q$, the only qualitative difference would arise if the values $x_1,\dots,x_n$ didn't cover the whole support (though I could just choose an $X$ whose values do).
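As a sanity check on this reading (a minimal numerical sketch of my own; the distributions are made up), the explicit sum agrees with a Monte Carlo estimate of $-\text{E}_{X\sim P}[\log Q(X)]$ obtained by actually sampling $X\sim P$:

```python
import numpy as np

rng = np.random.default_rng(0)

# Two made-up discrete distributions on the support {0, 1, 2}
p = np.array([0.5, 0.3, 0.2])  # P, the distribution X is drawn from
q = np.array([0.4, 0.4, 0.2])  # Q, the distribution being scored

# Cross entropy as the explicit sum: -sum_x p(x) log q(x)
h_sum = -np.sum(p * np.log(q))

# Cross entropy as an expectation over X ~ P: -E_{X~P}[log Q(X)]
samples = rng.choice(len(p), size=200_000, p=p)  # draws of X from P
h_mc = -np.log(q[samples]).mean()

print(h_sum, h_mc)  # the two agree up to Monte Carlo error
```

This check only works because I had to pick a random variable $X$ (here, one taking values $0,1,2$ with distribution $P$) to draw samples from, which is exactly what prompts my questions.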
My questions boil down to the following:
1. Where is the random variable needed to take the expected value that defines the cross entropy $H(P,Q)=-\text{E}_{P}[\log Q]$?
2. If I am correct in my assumption that one needs to choose a random variable $X$ to compute the cross entropy, is the notation I used in (1) free of ambiguities?