Let $\mathrm{RF}_n$ be the set of all functions from $\{0,1\}^n$ to $\{0,1\}^n$, so that $f\leftarrow \mathrm{RF}_n$ is a uniformly random function. Let $F_{f,i}$ be an oracle that answers the first $i$ queries with $F_{f(\cdot)}(\cdot)$ and answers the remaining queries with $f(\cdot)$. Suppose a polynomial-time distinguisher $D$, making at most $q=\mathrm{poly}(n)$ queries, can distinguish $\{f\leftarrow \mathrm{RF}_n:F_{f(\cdot)}(\cdot)\}$ from $\{f\leftarrow \mathrm{RF}_n:f(\cdot)\}$. From $D$'s point of view, $F_{f,0}$ is the latter oracle and $F_{f,q}$ is the former, so by a hybrid argument $D$ can also distinguish $\{f\leftarrow \mathrm{RF}_n:F_{f,k-1}(\cdot)\}$ from $\{f\leftarrow \mathrm{RF}_n:F_{f,k}(\cdot)\}$ for some $k\in\{1,\dots,q\}$.
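To spell out the hybrid step, write $q$ for a polynomial bound on the number of queries $D$ makes and $p_i=\Pr[D^{F_{f,i}}=1]$ (both are notation assumed here, not taken from the question). Then $p_0$ and $p_q$ are the probabilities that $D$ outputs $1$ against $f(\cdot)$ and against $F_{f(\cdot)}(\cdot)$ respectively, and the triangle inequality gives

$$\left|p_q-p_0\right| = \left|\sum_{k=1}^{q}\left(p_k-p_{k-1}\right)\right| \le \sum_{k=1}^{q}\left|p_k-p_{k-1}\right|.$$

So at least one index $k$ satisfies $\left|p_k-p_{k-1}\right|\ge \left|p_q-p_0\right|/q$, which is non-negligible whenever $\left|p_q-p_0\right|$ is, because $q$ is polynomial in $n$.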
Now construct a new distinguisher $D'$. Given an oracle $\mathcal{O}(\cdot)$, $D'$ selects a random function $f$ (sampling its values lazily, so that $D'$ stays polynomial-time), runs $D$, and finally outputs whatever $D$ outputs. Each time $D$ makes a query $x$ (say it is the $i$th query), $D'$ returns
$F_{f(x)}(x)$ if $i<k$, or
$\mathcal{O}(x)$ if $i=k$, or
$f(x)$ if $i>k$.
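The way $D'$ answers $D$'s queries can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `F` is a stand-in keyed function (truncated HMAC-SHA256, chosen here purely for convenience and not taken from the question), `N_BYTES` is a toy block length, and the class names `LazyRandomFunction` and `DPrimeSimulator` are hypothetical.

```python
import hashlib
import hmac
import secrets

N_BYTES = 16  # toy block length (n = 128 bits); an assumption for this sketch


class LazyRandomFunction:
    """A random function on byte strings, sampled lazily one value at a time."""

    def __init__(self):
        self.table = {}

    def __call__(self, x: bytes) -> bytes:
        # Sample f(x) on first use and remember it, so f stays a function.
        if x not in self.table:
            self.table[x] = secrets.token_bytes(N_BYTES)
        return self.table[x]


def F(key: bytes, x: bytes) -> bytes:
    """Stand-in for F_key(x); truncated HMAC-SHA256 is an assumption."""
    return hmac.new(key, x, hashlib.sha256).digest()[:N_BYTES]


class DPrimeSimulator:
    """Answers D's queries the way D' does, for a fixed hybrid index k."""

    def __init__(self, k: int, oracle):
        self.k = k
        self.oracle = oracle           # the challenge oracle O given to D'
        self.f = LazyRandomFunction()  # D' samples its own random function f
        self.i = 0                     # number of queries answered so far

    def answer(self, x: bytes) -> bytes:
        self.i += 1
        if self.i < self.k:
            return F(self.f(x), x)     # queries 1..k-1: F_{f(x)}(x)
        if self.i == self.k:
            return self.oracle(x)      # query k: forwarded to O
        return self.f(x)               # queries k+1 onward: f(x)
```

Only the $k$th query ever reaches $\mathcal{O}$, so the single position where the two hybrids differ is exactly the position where the challenge oracle is used.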
Assume, as we may, that $D$ never repeats a query (it can cache the answers). If $\mathcal{O}$ is $\{f\leftarrow \mathrm{RF}_n:f(\cdot)\}$, the $k$th answer is a fresh uniform string, so the queries $D$ makes and the answers it receives have the same distribution as when $D$ is facing $\{f\leftarrow \mathrm{RF}_n:F_{f,k-1}(\cdot)\}$. If $\mathcal{O}$ is $\{s\leftarrow\{0,1\}^n:F_s(\cdot)\}$, the $k$th answer is $F_s(x)$ for a uniform key $s$, exactly like $F_{f(x)}(x)$, so the queries and answers have the same distribution as when $D$ is facing $\{f\leftarrow \mathrm{RF}_n:F_{f,k}(\cdot)\}$. Since $D$ can distinguish $\{f\leftarrow \mathrm{RF}_n:F_{f,k-1}(\cdot)\}$ from $\{f\leftarrow \mathrm{RF}_n:F_{f,k}(\cdot)\}$, our distinguisher $D'$ can distinguish $\{f\leftarrow \mathrm{RF}_n:f(\cdot)\}$ from $\{s\leftarrow\{0,1\}^n:F_s(\cdot)\}$, which contradicts the fact that $F$ is a PRF.
So $\{f\leftarrow \mathrm{RF}_n:F_{f(\cdot)}(\cdot)\}$ and $\{f\leftarrow \mathrm{RF}_n:f(\cdot)\}$ cannot be distinguished by any polynomial-time distinguisher. The second part of your sum is therefore negligible.