This question has been addressed here quite a few times, just phrased differently, for example in Cycles in SHA256.
Another answer here made the following statement:
Hashes have a fixed size output. After one round, all your new inputs will be equally sized, being the size of the hash output. An ideal hash function operating on such inputs will be bijective and you will constantly just be rearranging your inputs with none of them ever colliding.
An ideal hash function in this sense is quite similar to a perfect hash function. However, this does not fit cryptographic hash functions, where the "ideal" version is a random oracle or a truly random function (with the specified domain), in which collisions can happen.
Exactly this question was already addressed here:
But to come back to the question: Cryptographic hashes are designed to be as close to random functions as possible. In an answer to the first linked question, fgrieu drew a really nice visualization here.
A few of the key points of what to expect:
- The graph is probably disconnected
- The graph contains cycles of different lengths
- It might contain fixed points (cycles of length 1)
- It also contains nodes that lead to a cycle but are not part of it.
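These properties can be checked on a toy scale. The following is only a sketch: it assumes SHA-256 truncated to 12 bits as a stand-in for a random function (the names `toy_hash`, `BITS` and `analyse_graph` are made up for this illustration), builds the whole graph, and counts components, cycle lengths, fixed points and nodes that merely lead into a cycle.

```python
import hashlib

BITS = 12          # toy output size (assumption); real hashes use 128+ bits
N = 1 << BITS      # number of nodes in the graph


def toy_hash(x: int) -> int:
    """SHA-256 truncated to BITS bits, mapping {0, ..., N-1} to itself."""
    digest = hashlib.sha256(x.to_bytes(2, "big")).digest()
    return int.from_bytes(digest[:2], "big") % N


def analyse_graph() -> dict:
    succ = [toy_hash(x) for x in range(N)]

    # Find the nodes on cycles by repeatedly stripping nodes
    # that have no remaining predecessor.
    indeg = [0] * N
    for y in succ:
        indeg[y] += 1
    stack = [x for x in range(N) if indeg[x] == 0]
    on_cycle = [True] * N
    while stack:
        x = stack.pop()
        on_cycle[x] = False
        y = succ[x]
        indeg[y] -= 1
        if indeg[y] == 0:
            stack.append(y)

    # Every connected component contains exactly one cycle,
    # so walking each cycle once also counts the components.
    seen = [False] * N
    cycle_lengths = []
    for x in range(N):
        if on_cycle[x] and not seen[x]:
            length = 0
            y = x
            while not seen[y]:
                seen[y] = True
                length += 1
                y = succ[y]
            cycle_lengths.append(length)

    return {
        "components": len(cycle_lengths),
        "cycle_lengths": sorted(cycle_lengths),
        "fixed_points": sum(1 for x in range(N) if succ[x] == x),
        "tail_nodes": N - sum(cycle_lengths),
    }


if __name__ == "__main__":
    print(analyse_graph())
```

For a random-looking function one would expect several components and far more tail nodes than cycle nodes, but the exact numbers depend on the truncation chosen here.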
So to answer the initial questions:
I believe that most iterations of the while loop will decrease the size of inputs, until it reaches 1. Is this true?
No. It can decrease the size of the set, but with a cryptographic hash that is unlikely. The set size only decreases when there is a collision, which is very unlikely (and certainly not in "most iterations").
Considering the second part of the question: That could happen, but it is unlikely. Recalling the graph of the random function, the elements of the original set would have to
- all be in the same connected subgraph, and
- end up in a fixed point instead of a longer cycle; alternatively, there would have to be collisions so that only one element is left in the cycle.
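To watch this on a small scale, here is a minimal sketch of such a loop. It assumes the loop in the question replaces every element of the set by its hash in each round, and it uses MD5 truncated to 2 bytes so that collisions happen at all (`toy_hash`, `iterate_set` and the round cap are invented for this illustration).

```python
import hashlib


def toy_hash(x: bytes, nbytes: int = 2) -> bytes:
    """MD5 truncated to nbytes bytes (assumption, to make collisions feasible)."""
    return hashlib.md5(x).digest()[:nbytes]


def iterate_set(inputs, max_rounds: int = 1000) -> list:
    """Hash every element of the set each round and record the set sizes."""
    current = set(inputs)
    sizes = [len(current)]
    for _ in range(max_rounds):          # cap: the set may never shrink to 1
        if len(current) <= 1:
            break
        current = {toy_hash(x) for x in current}
        sizes.append(len(current))
    return sizes


if __name__ == "__main__":
    start = [str(i).encode() for i in range(50)]
    sizes = iterate_set(start)
    print("first sizes:", sizes[:10])
    print("final size: ", sizes[-1])
```

The recorded sizes can only stay the same or drop, and they drop exactly in the rounds where two elements collide. Whether the set ever reaches size 1 depends on the two conditions listed above, which is why the sketch needs a round cap.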
Furthermore, can we somehow determine how many iterations this would take for a given input space?
Well, for practical purposes with MD5: much, much too long. The cycles can have any length within the graph, and you have to save all previous steps to actually notice that you are in a cycle. With a graph of $2^{128}$ nodes, you would have to estimate the number of nodes in cycles, and then estimate how many values you need to store to be able to determine that you are in a cycle. It is quite likely that you need close to $2^{128}$ steps anyway.
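The store-everything approach described above can be sketched as follows. It is only feasible here because MD5 is truncated to 2 bytes, which is an assumption made for the illustration (as are the names `truncated_md5` and `tail_and_cycle`); with the full 128-bit output the dictionary would not fit anywhere.

```python
import hashlib


def truncated_md5(x: bytes, nbytes: int) -> bytes:
    """MD5 truncated to nbytes bytes (toy stand-in for the full hash)."""
    return hashlib.md5(x).digest()[:nbytes]


def tail_and_cycle(start: bytes, nbytes: int = 2):
    """Iterate the hash from `start`, remembering every value seen.

    Returns (tail, cycle): the number of steps before the cycle is
    entered and the length of that cycle.
    """
    first_seen = {}                        # value -> step at which it appeared
    x = truncated_md5(start, nbytes)       # after one round all values have nbytes bytes
    step = 0
    while x not in first_seen:
        first_seen[x] = step
        x = truncated_md5(x, nbytes)
        step += 1
    tail = first_seen[x]
    cycle = step - tail
    return tail, cycle


if __name__ == "__main__":
    tail, cycle = tail_and_cycle(b"hello")
    print("steps before the cycle:", tail)
    print("cycle length:          ", cycle)
```

The number of stored values equals the tail plus the cycle length, so the memory needed grows with the number of steps taken before the first repeat.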