I don't think the question can be answered as posed.
The spectral efficiency of a given channel, in bits per second per Hz (bps/Hz), will depend on the signal-to-noise ratio (SNR) in the channel as well as the maximum bit error rate (BER) that can be tolerated.
The figure below from this paper illustrates how these three parameters relate for various modulation schemes at one particular BER - in this case $10^{-2}$ (which is really poor). Presumably the family of curves would look different - perhaps much different - for lower (i.e. more demanding) BERs.

- If the SNR is quite high - say better than 30 dB - then something like 7 bps/Hz or better could be achieved with 256-QAM (we'll come back to the "adaptive scheme" presented in the paper).
- Notice, however, that not only does the channel efficiency drop with SNR, but the quasi-optimal modulation scheme also changes. For example, with only 5 dB less SNR (i.e. 25 dB), 256-QAM drops from 2nd to 4th place and 64-QAM emerges as the best conventional modulation scheme. Channel efficiency drops by more than 20% to about 5.5 bps/Hz.
- At 20 dB SNR, 16-QAM yields the best performance, albeit with an additional drop in channel efficiency of roughly 35%, to about 3.5 bps/Hz.
- As SNR continues to drop you can see that the quasi-optimal scheme shifts from 16-QAM to 4-QAM (i.e. QPSK) to BPSK.

The figure gives some insight into how the WiFi standards are written around adaptive modulation (i.e. changing schemes depending on SNR - see this note) and why a WiFi network slows down so much when the signal is weak.
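
To see why each step up in constellation size costs SNR, it may help to look at the standard textbook approximation for Gray-coded square $M$-QAM in additive white Gaussian noise. This is an assumption on my part: the paper's figure may well use a different channel model, so treat it as a rough guide rather than a reproduction of the curves. Here $\gamma$ is the linear SNR per symbol and $Q(\cdot)$ is the Gaussian tail function:

$$P_b \approx \frac{4}{\log_2 M}\left(1 - \frac{1}{\sqrt{M}}\right) Q\!\left(\sqrt{\frac{3\,\gamma}{M-1}}\right)$$

For a fixed target $P_b$, the argument of $Q$ has to stay roughly constant, so the required $\gamma$ grows roughly in proportion to $M-1$. Quadrupling $M$ (i.e. adding 2 bps/Hz) therefore costs about a factor of 4 in SNR, or about 6 dB, which is broadly in line with the 256-QAM / 64-QAM / 16-QAM hand-offs at 30, 25 and 20 dB read off the figure.
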
Another point to raise here is that, to my knowledge, there is no theoretical "best" modulation scheme for any given BER and SNR. As a result, designing new modulation schemes has been a favorite pastime of communication engineers since at least the 1930s. "Adaptive" schemes like the one shown in the figure are usually some algorithm that selects the best of $N$ available modulation schemes for the SNR measured in the channel. $M$-QAM is relatively uncomplicated and is therefore used in most benchmarks, but other more exotic modulation schemes can yield marginally better results at the expense of complexity.
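
As a rough illustration of the "select the best of $N$ schemes" idea, here is a minimal Python sketch. It is not the algorithm from the paper. It assumes an AWGN channel, uses the usual textbook BER approximation for Gray-coded square $M$-QAM, and simply picks the largest constellation whose predicted BER stays at or below the target. The function names (`ber_mqam`, `select_scheme`, `spectral_efficiency`) and the list of candidate schemes are my own, chosen for illustration.

```python
import math

# Candidate schemes, identified by constellation size M (assumed list:
# BPSK, QPSK/4-QAM, 16-QAM, 64-QAM, 256-QAM).
SCHEMES = (2, 4, 16, 64, 256)


def q_function(x):
    """Gaussian tail probability Q(x)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def ber_mqam(m, snr_db):
    """Approximate BER in AWGN for BPSK (m == 2) or Gray-coded square M-QAM.

    snr_db is the SNR per symbol in dB. This is a textbook approximation,
    not the channel model behind the figure.
    """
    snr = 10.0 ** (snr_db / 10.0)
    if m == 2:
        return q_function(math.sqrt(2.0 * snr))
    bits = math.log2(m)
    ser = 4.0 * (1.0 - 1.0 / math.sqrt(m)) * q_function(
        math.sqrt(3.0 * snr / (m - 1))
    )
    return min(ser / bits, 0.5)


def select_scheme(snr_db, target_ber=1e-2):
    """Return the largest M whose predicted BER meets the target, or None."""
    best = None
    for m in SCHEMES:
        if ber_mqam(m, snr_db) <= target_ber:
            best = m
    return best


def spectral_efficiency(m):
    """Nominal bits per symbol (bps/Hz, ignoring coding and pulse shaping)."""
    return math.log2(m) if m else 0.0


if __name__ == "__main__":
    for snr_db in (0, 5, 10, 15, 20, 25, 30):
        m = select_scheme(snr_db)
        print(snr_db, "dB ->", m, "->", spectral_efficiency(m), "bps/Hz")
```

The switching points this produces will not line up exactly with the figure, since the channel model and the efficiency metric are assumptions here, but the qualitative behaviour is the same: as the measured SNR falls, the selector steps down through smaller constellations and the achievable bps/Hz falls with it.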