I am writing a video game to teach kids how to read, and as part of that I have to play short audio clips of basic sounds. For example, the /t/ sound (the first sound in "Tiger").
I will also play longer syllable sounds, like "strong" and "est". And finally I will play full word sounds, like "brontosaurus" (a little over 1 second in length).
I've read that latency can be an issue with shorter sounds, so it's best to use an uncompressed file format like .wav for short files. The files will be larger, but the impact on disk storage should be small since the audio clips are short.
For longer sounds like "brontosaurus", I would be inclined to keep using the same file format (.wav) just for consistency across sound assets. But my program will have over 5000 spoken words, so disk space may become an issue. If so, I could use a compressed format for those files and perhaps preload them.
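To put a rough number on the disk-space concern, here is a back-of-the-envelope sketch. The format (44.1 kHz, 16-bit, mono), the 44-byte canonical PCM header, and the 1.2-second average word length are all assumptions for illustration, not measurements from my actual assets, and the function names are made up for this example.

```python
WAV_HEADER_BYTES = 44  # canonical PCM WAV header; real files may carry extra chunks


def wav_size_bytes(duration_s, sample_rate=44_100, bytes_per_sample=2, channels=1):
    """Approximate size of an uncompressed PCM .wav clip (assumed format)."""
    frames = round(duration_s * sample_rate)  # one frame = one sample per channel
    return WAV_HEADER_BYTES + frames * bytes_per_sample * channels


def library_size_mb(clip_count, avg_duration_s, **fmt):
    """Total size in MB (10**6 bytes) for clip_count clips of the same average length."""
    return clip_count * wav_size_bytes(avg_duration_s, **fmt) / 1_000_000


if __name__ == "__main__":
    # Hypothetical inputs: 5000 words averaging 1.2 seconds each.
    print(f"44.1 kHz: {library_size_mb(5000, 1.2):.1f} MB")
    print(f"22.05 kHz: {library_size_mb(5000, 1.2, sample_rate=22_050):.1f} MB")
```

Under those assumptions, 5000 words come to roughly 530 MB as .wav at 44.1 kHz, and about half that at 22.05 kHz, so the sample rate matters about as much as the choice of container.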
I have spoken dialogue in the game as well, but plan to just use .mp3 for that. A small delay there would be fine.
Does it make sense to use .wav for all shorter sounds? Is there a number of seconds at which it makes sense to switch to a compressed audio format?