Could 64 bits be used instead?
Yes, but it reduces the security.
Assuming that the message length is $L$ blocks, then given a valid message/tag pair, the attacker can generate a second message/tag pair that would authenticate with probability $L/q$ (assuming, of course, that $L \le q$). By reducing the value of $q$, you're increasing the probability that the attacker would succeed.
Now, you could run it several times in parallel (using distinct random values for $k_1, k_2$ for each instance); let's examine how that works.
For example, if you use a single 128-bit $q$, and your message length is $\ell$ bytes (and hence we have $L = \ell/16$), that means that the attacker can succeed with probability $2^{-132} \ell$.
In contrast, if you have a 64-bit $q$, and run it twice in parallel, and your message length is $\ell$ bytes (and so $L = \ell/8$), the attacker succeeds against each instance with probability $2^{-67} \ell$, and so against both with probability $2^{-134} \ell^2$. Note that this is always larger, even with the smaller constant factor: the ratio of the two probabilities is $\ell/4$, and we assumed that $\ell$ is a whole number of blocks, so $\ell \ge 16$.
Because the forgery probability in the second case grows with the square of the message length, its security degrades significantly faster if long messages are possible. Whether it degrades too fast depends on a) how much forgery probability is tolerable, and b) how long your messages are anyway.
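To get a feel for the numbers, here is a minimal sketch that evaluates both bounds for a few message lengths. It assumes $q \approx 2^{b}$ for a $b$-bit $q$ (the same approximation used in the figures above) and a block size equal to the size of $q$; the function name `forgery_bound` is made up for this illustration.

```python
from fractions import Fraction


def forgery_bound(msg_bytes: int, q_bits: int, instances: int = 1) -> Fraction:
    """Bound (L/q)**instances on the forgery probability.

    Assumptions for illustration: q is approximated as 2**q_bits, and one
    block holds q_bits // 8 bytes.
    """
    block_bytes = q_bits // 8
    blocks = -(-msg_bytes // block_bytes)  # L, rounded up to whole blocks
    q = 2 ** q_bits
    if blocks > q:
        raise ValueError("the bound L/q only applies when L <= q")
    return Fraction(blocks, q) ** instances


for msg_bytes in (16, 1024, 2**20, 2**30):
    single = forgery_bound(msg_bytes, 128)              # one 128-bit instance
    double = forgery_bound(msg_bytes, 64, instances=2)  # two 64-bit instances
    print(f"{msg_bytes:>12} bytes: 1x128 = {float(single):.3e}, "
          f"2x64 = {float(double):.3e}, ratio = {float(double / single):g}")
```

The ratio column is $\ell/4$, so the gap between the two constructions widens linearly as messages get longer.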
Can this be scaled even further downwards to 32, 16, … bits? (block size and $q$)
Yes, but you run into even faster degradation based on message length.
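As a rough sketch of why, assume (this is my assumption, not something the question fixes) that you keep the total tag size at 128 bits, so a $b$-bit $q \approx 2^b$ is run $128/b$ times in parallel on a message of $\ell$ bytes, giving $L = 8\ell/b$ blocks. The forgery probability then becomes

$$\left(\frac{8\ell}{b \cdot 2^{b}}\right)^{128/b}$$

which works out to

$$b = 128:\ 2^{-132}\,\ell \qquad b = 64:\ 2^{-134}\,\ell^{2} \qquad b = 32:\ 2^{-136}\,\ell^{4} \qquad b = 16:\ 2^{-136}\,\ell^{8}$$

The constant factor barely moves, but the exponent on $\ell$ doubles each time the size of $q$ is halved. The requirement $L \le q$ also starts to bite: with a 16-bit $q$ it limits messages to $\ell \le 2^{17}$ bytes.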