for my purpose, is there a real risk of hash collision, at any hash (even MD5)?
That depends on:
- The number of possible inputs, and the width of the hash. For $2^s$ inputs chosen independently of the hash, and a $w$-bit hash, the probability of collision is¹
$$p\lessapprox2^{2s-w-1}$$
when $2s<w-5$. So for $s=48$ (>30000 entries for each living human), $w=128$ (MD5), probability of collision is $p\approx2^{-33}$ (1 chance in 8 billion, about the probability that a randomly selected living human is you).
- If adversaries actively try to create collisions, or if in doubt
  - $s=48$ is way too low! In fact, facing adversaries able to choose messages (in full, or in part with knowledge of the rest), $s$ must be defined not by how many things we hash, but by how many things adversaries can hash. We are talking $s\approx53$ if adversaries use a single GPU for one year. If we trust that source, bitcoin mining contributes to the ruin of our ecosystem at a rate of $2^{93.5}$ hashes per year, using specialized ASICs, thus we should use $s>94$ if we assume comparable waste can occur against our system.
  - It's unwise to use a broken hash, such as MD5 or SHA-1, though the restriction of the input character set mitigates existing better-than-brute-force attacks to a sizable degree.
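The arithmetic above can be checked with a short Python sketch. The function names are hypothetical, and the target probability of $2^{-33}$ used for the adversarial case is only an assumption that reuses the level from the MD5 example; pick whatever risk you find acceptable. The bound applies to an ideal (unbroken) hash.

```python
import math

def log2_collision_bound(s: float, w: int) -> float:
    """Return log2 of the bound p <~ 2^(2s - w - 1).

    s: log2 of the number of hashed inputs, w: hash width in bits.
    The approximation is only claimed for 2s < w - 5.
    """
    if not 2 * s < w - 5:
        raise ValueError("approximation requires 2s < w - 5")
    return 2 * s - w - 1

def min_width(s: float, log2_p: float) -> int:
    """Smallest w such that 2^(2s - w - 1) <= 2^log2_p.

    Rearranging gives w >= 2s - 1 - log2_p. Requiring log2_p < -6
    keeps the result inside the validity range 2s < w - 5.
    """
    if not log2_p < -6:
        raise ValueError("target probability must be below 2^-6")
    return math.ceil(2 * s - 1 - log2_p)

# Non-adversarial example from the text: 2^48 inputs, MD5 width.
print(log2_collision_bound(48, 128))   # -33

# Adversarial example: s = 94, assumed target p = 2^-33.
print(min_width(94, -33))              # 220
```

Under that assumed target, a 224-bit or 256-bit unbroken hash leaves margin, while a 128-bit one falls outside the range where the formula applies at all.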
To be on the safe side, you can use SHA-256 or the typically faster² SHA-512/256 ($w=256$). If space is an issue, use SHA-512/224 ($w=224$), which limits each hash to 28 bytes. See FIPS 180-4.
If speed matters, there are BLAKE2 and BLAKE3, which are competitive with MD5 on speed. It's OK to truncate such a hash to save space, within the limits of the above formula.
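As a minimal sketch of these options in Python, using only the standard `hashlib` module: the helper names and the choice to join the numbers with commas before hashing are assumptions made for illustration. SHA-512/256 may not be exposed by every `hashlib` build, so the sketch sticks to SHA-256 and BLAKE2b.

```python
import hashlib

def encode_list(numbers) -> bytes:
    # Assumed canonical form: decimal numbers joined by commas, ASCII only.
    return ",".join(str(n) for n in numbers).encode("ascii")

def sha256_truncated(numbers, out_bytes: int = 32) -> bytes:
    """SHA-256 of the encoded list, truncated to its first out_bytes bytes."""
    if not 1 <= out_bytes <= 32:
        raise ValueError("out_bytes must be in 1..32")
    return hashlib.sha256(encode_list(numbers)).digest()[:out_bytes]

def blake2b_short(numbers, out_bytes: int = 28) -> bytes:
    """BLAKE2b with a shortened digest.

    digest_size is a parameter of the hash itself, so the result is not
    a prefix of the 64-byte BLAKE2b digest, but it serves the same goal.
    """
    return hashlib.blake2b(encode_list(numbers), digest_size=out_bytes).digest()

print(sha256_truncated([1, 2, 3], 28).hex())
print(blake2b_short([1, 2, 3]).hex())
```

With `out_bytes=28` both helpers give $w=224$; plug that $w$ into the formula to check that the collision probability stays acceptable for your $s$.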
since I know it's a comma delimited list of numbers, I can limit input to [0-9,] and ensure that nothing else will ever exist. No unicode or hidden character nonsense.
When using an unbroken hash, such considerations are unnecessary.
¹ For a derivation, see my Birthday problem for cryptographic hashing, 101, "assuming $n\ll\sqrt k$", "additionally assuming large $n$". In that source $n=2^s$ and $k=2^w$, thus $p\lessapprox{\frac{n^2}{2k}}$ yields our $p\lessapprox2^{2s-w-1}$.
² For messages larger than 55 bytes, and without hardware assistance, SHA-512 is often faster than SHA-256 on 64-bit CPUs, because it makes good use of 64-bit words.
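The 55-byte threshold in footnote 2 follows from padding: SHA-256 works on 64-byte blocks and appends at least one `0x80` byte plus an 8-byte length, while SHA-512 works on 128-byte blocks and appends at least one `0x80` byte plus a 16-byte length. A small sketch (function names are hypothetical) counts the compression-function calls for a given message length. It says nothing about the cost of each call, which differs between the two hashes, so treat it as an illustration of the threshold only.

```python
def compression_calls(msg_len: int, block: int, length_field: int) -> int:
    """Number of blocks after padding: message + 0x80 byte + length field,
    rounded up to a whole number of blocks."""
    return (msg_len + 1 + length_field + block - 1) // block

def sha256_blocks(msg_len: int) -> int:
    return compression_calls(msg_len, block=64, length_field=8)

def sha512_blocks(msg_len: int) -> int:
    return compression_calls(msg_len, block=128, length_field=16)

for n in (55, 56, 111, 112):
    print(n, sha256_blocks(n), sha512_blocks(n))
# 55 1 1
# 56 2 1
# 111 2 1
# 112 2 2
```

From 56 bytes on, SHA-256 needs a second block while SHA-512 (and thus SHA-512/256) still needs one up to 111 bytes.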