(Note: This is the correct analysis for the case where the $k$ queries are not necessarily distinct; I've posted it as a separate answer because my previous one was already too long and messy.)
After the $N = nk$ balls are put in the $m$ bins, suppose a fraction $q$ of the bins are empty. When we make the $k$ queries, the probability that a particular query encounters a non-empty bin is exactly $(1 - q)$, since the query is equally likely to land in any of the $m$ bins. Further, even though the bins are not independent of each other, the $k$ queries we make are completely independent, so (for this particular value of $q$) the probability that they all encounter non-empty bins is exactly $(1-q)^k$.
The final answer (the probability) is obtained by considering all possible values of $q$, weighted by their probabilities: the probability of all $k$ queries seeing a nonempty bin is $$\sum_{\alpha} \Pr(q = \alpha)(1-\alpha)^k$$ where the sum is over the values $\alpha \in [0, 1]$ that $q$ can take, namely values of the form $\frac{r}{m}$ for $0 \le r \le m$. No approximation here.
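This sum is just $E[(1-q)^k]$, so it can be estimated directly by simulation. Here is a minimal Monte Carlo sketch (the helper names and the parameter values at the bottom are illustrative choices of mine, not anything from the problem): it repeatedly drops the $N$ balls, records the empty fraction $q$, and averages $(1-q)^k$.

```
import random

def empty_fraction(n_balls, m_bins, rng):
    """Drop n_balls uniformly at random into m_bins; return the fraction q of empty bins."""
    counts = [0] * m_bins
    for _ in range(n_balls):
        counts[rng.randrange(m_bins)] += 1
    return counts.count(0) / m_bins

def estimate_all_queries_hit(n_balls, m_bins, k, trials=10_000, seed=0):
    """Monte Carlo estimate of E[(1 - q)^k] = sum_alpha Pr(q = alpha) * (1 - alpha)^k."""
    rng = random.Random(seed)
    return sum((1 - empty_fraction(n_balls, m_bins, rng)) ** k
               for _ in range(trials)) / trials

# Illustrative parameters only (not from the problem): N = 200, m = 100, k = 10.
print(estimate_all_queries_hit(n_balls=200, m_bins=100, k=10))
```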
The question of independence of the bins comes up when we try to say what the fraction $q$ might be. The probability that a particular bin (say bin $i$) is empty is $p = \left(1 - \frac1m\right)^N$, since for this bin to be empty, each of the $N$ balls must have gone into one of the other $(m - 1)$ bins. This probability $p$ is also the expected value of the fraction $q$.
(To prove this rigorously: define the indicator variable $Z_i$ to be $1$ if bin $i$ is empty, and $0$ otherwise. The total number of empty bins $Z = Z_1 + Z_2 + \dots + Z_m$ has expected value $E[Z] = E[Z_1 + \dots + Z_m] = E[Z_1] + \dots + E[Z_m] = mE[Z_1]$ by linearity of expectation, so the expected value of the fraction $q$ (which is the same as $Z/m$) is $E[q] = E[Z/m] = E[Z]/m = E[Z_1] = p.$)
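As a tiny sanity check of this linearity-of-expectation argument, one can brute-force $E[q]$ exactly when the parameters are small enough to enumerate every placement. A minimal sketch (the parameter values are arbitrary small choices of mine):

```
from itertools import product

def exact_expected_empty_fraction(n_balls, m_bins):
    """Average the empty-bin fraction q over all m_bins**n_balls equally likely placements."""
    total = 0.0
    for placement in product(range(m_bins), repeat=n_balls):
        total += (m_bins - len(set(placement))) / m_bins   # fraction of bins left empty
    return total / m_bins ** n_balls

m, n = 4, 6                                     # small enough: 4**6 = 4096 placements
print(exact_expected_empty_fraction(n, m))      # E[q], computed exactly
print((1 - 1 / m) ** n)                         # p = (1 - 1/m)^N, should agree
```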
The actual value of $q$ can vary and differ from its expected value $p$, but the probability of its being far from $p$ is very low: the probability of $q$ taking a given value falls off exponentially as that value moves away from $p$. Since $p \approx e^{-N/m}$, this also means that $q$ is unlikely to be far from $e^{-N/m}$. (We still need to prove this.)
So the probability of all $k$ queries encountering nonempty bins, which is
$\sum_{\alpha} \Pr(q = \alpha)(1-\alpha)^k$, is effectively the same as $\left(1 - e^{-N/m} \right)^k$, as $q$ takes on a value around $e^{-N/m}$ with overwhelmingly high probability.
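As a quick numeric illustration (the values of $N$, $m$, $k$ below are arbitrary choices of mine), replacing the exact $p = (1 - 1/m)^N$ by $e^{-N/m}$ barely changes this closed form; either number can then be compared against the Monte Carlo estimate from the first sketch above.

```
from math import exp

n_balls, m_bins, k = 200, 100, 10            # illustrative values
p = (1 - 1 / m_bins) ** n_balls              # exact probability that a given bin is empty
print((1 - p) ** k)                          # (1 - p)^k
print((1 - exp(-n_balls / m_bins)) ** k)     # (1 - e^{-N/m})^k, essentially the same
```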
Left to prove: that $q$ is unlikely to be far from $e^{-N/m}$. It is a bit cumbersome to analyze the fraction $q$ directly, as it depends on all the bins, and they are not independent. One approach is to pretend the bins are independent and bound the error this introduces. Mitzenmacher and Upfal, in their book *Probability and Computing: Randomized Algorithms and Probabilistic Analysis*, give an elegant technique for doing precisely this.
Let us focus on a particular bin $i$ of the $m$ bins. Consider the number of balls in bin $i$, after $N$ balls have been dropped independently and uniformly at random into the $m$ bins. For each bin $i$, this number (call it $X_i$) follows the binomial distribution: the probability that the bin has $r$ balls is
$\Pr[X_i = r] = \binom{N}{r}\left(\frac1m\right)^r\left(1-\frac1m\right)^{N-r}.$ This is approximately a Poisson distribution with parameter $\lambda = \frac{N}{m}$, or in other words $\Pr[X_i = r] \approx e^{-\lambda}(\lambda^r / r!)$.
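To see how close the two distributions are, here is a small side-by-side sketch (the values of $N$ and $m$ are arbitrary choices of mine):

```
from math import comb, exp, factorial

n_balls, m_bins = 200, 100                   # illustrative values
lam = n_balls / m_bins                       # Poisson parameter lambda = N/m

for r in range(6):
    binom = comb(n_balls, r) * (1 / m_bins) ** r * (1 - 1 / m_bins) ** (n_balls - r)
    poisson = exp(-lam) * lam ** r / factorial(r)
    print(r, round(binom, 5), round(poisson, 5))
```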
Motivated by this, they introduce what they call the Poisson approximation for all $m$ bins viewed together. Consider $m$ bins in which the number of balls in bin $i$ (call this number $Y_i$) is independently distributed, following a Poisson distribution with parameter $\lambda = \frac{N}{m}$. Of course, under this distribution, the total number of balls across the $m$ bins can vary, though it is $N$ in expectation. However, it is a surprisingly straightforward exercise to prove that
- The distribution of $(Y_1, \dots, Y_m)$ when conditioned on $\sum_{i=1}^{m} Y_i = N$ is the same as the distribution of $(X_1, \dots, X_m)$ (see the sketch after this list for an empirical check).
- Any event that takes place with some probability $\pi$ in the "Poisson approximation scenario" takes place with probability at most $\pi e\sqrt{N}$ in the "exact scenario".
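The first fact can be checked empirically by rejection sampling: draw $m$ independent Poisson($\lambda$) values, keep only the draws whose total is exactly $N$, and look at the distribution of one coordinate. A minimal sketch (the Poisson sampler and the parameter choices are mine; small $N$ and $m$ keep the acceptance rate reasonable):

```
import random
from math import comb, exp

def poisson_sample(lam, rng):
    """Knuth's method: multiply uniforms until the running product falls below e^{-lam}."""
    threshold = exp(-lam)
    k, prod = 0, 1.0
    while True:
        prod *= rng.random()
        if prod <= threshold:
            return k
        k += 1

def conditioned_marginal(n_balls, m_bins, trials=200_000, seed=0):
    """Empirical distribution of Y_1 given sum(Y_i) == N, with Y_i i.i.d. Poisson(N/m)."""
    rng = random.Random(seed)
    lam = n_balls / m_bins
    counts, kept = {}, 0
    for _ in range(trials):
        ys = [poisson_sample(lam, rng) for _ in range(m_bins)]
        if sum(ys) == n_balls:                    # keep only draws with total exactly N
            counts[ys[0]] = counts.get(ys[0], 0) + 1
            kept += 1
    return {r: c / kept for r, c in counts.items()}, kept

n, m = 20, 5                                      # small illustrative values
marginal, kept = conditioned_marginal(n, m)
for r in range(8):
    exact = comb(n, r) * (1 / m) ** r * (1 - 1 / m) ** (n - r)   # Pr[X_1 = r], binomial
    print(r, round(marginal.get(r, 0.0), 2), round(exact, 2))
```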
The second fact is proved by showing that for any nonnegative function $f(x_1, \dots, x_m)$ (such as the indicator function of an event), we have
$$\begin{align}
E[f(Y_1, \dots, Y_m)]
&\ge E\left[f(Y_1, \dots, Y_m) | \sum_i Y_i = N\right]\Pr(\sum_i Y_i = N) \\
&= E[f(X_1, \dots, X_m)] \Pr(\sum_i Y_i = N) \\
&\ge E[f(X_1, \dots, X_m)] \cdot \frac{1}{e\sqrt{N}}
\end{align}
$$
since $\sum_i Y_i$ is itself Poisson with parameter $N$, so that $\Pr(\sum_i Y_i = N) = e^{-N} (N^N / N!)$, and $N! \le e\sqrt{N} (N/e)^N$ by Stirling's approximation.
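A quick numeric look at this last step (computed in log space to avoid overflow; the values of $N$ are arbitrary choices of mine): the probability $e^{-N} N^N / N!$ against the lower bound $1/(e\sqrt{N})$.

```
from math import exp, lgamma, log, sqrt

for n in (10, 100, 1_000, 10_000):
    log_prob = -n + n * log(n) - lgamma(n + 1)        # log of e^{-N} N^N / N!
    print(n, exp(log_prob), 1 / (exp(1) * sqrt(n)))   # probability vs lower bound 1/(e sqrt(N))
```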
So, using this together with a Chernoff bound (applied to the number of empty bins in the Poisson scenario, where the bins really are independent), they prove something like $\Pr(|q - p| \ge \epsilon) \le 2e\sqrt{N}\,e^{-m\epsilon^2 / (3p)}$.
In fact, using martingales and the Azuma-Hoeffding inequality (Section 12.5.3, p. 308), they prove that $\Pr(|q - p| \ge \frac{\lambda}{m}) \le 2\exp(-2\lambda^2/N)$ (the Doob martingale exposes the $N$ ball placements one at a time, which is why $N$ appears in the exponent).
They even have an exercise (12.19, p. 313) showing that $\Pr(|q - p| \ge \frac{\lambda}{m}) \le 2\exp(-\lambda^2(2m - 1) / (m^2 - p^2m^2))$.
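Finally, here is a minimal simulation sketch (parameter choices and helper names are mine) comparing the empirical deviation probability $\Pr(|q - p| \ge \lambda/m)$, i.e. the chance that the number of empty bins deviates from $mp$ by at least $\lambda$, with the Azuma-Hoeffding bound $2\exp(-2\lambda^2/N)$ quoted above:

```
import random
from math import exp

def empty_count(n_balls, m_bins, rng):
    """Number of empty bins after dropping n_balls uniformly at random into m_bins."""
    counts = [0] * m_bins
    for _ in range(n_balls):
        counts[rng.randrange(m_bins)] += 1
    return counts.count(0)

n_balls, m_bins, trials = 100, 100, 20_000            # illustrative values
rng = random.Random(0)
expected_empty = m_bins * (1 - 1 / m_bins) ** n_balls # m * p
samples = [empty_count(n_balls, m_bins, rng) for _ in range(trials)]

for lam in (10, 15, 20):                              # deviation lambda, measured in bins
    empirical = sum(abs(x - expected_empty) >= lam for x in samples) / trials
    azuma = 2 * exp(-2 * lam ** 2 / n_balls)          # 2 exp(-2 lambda^2 / N)
    print(lam, empirical, azuma)
```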