A cryptographic hash function should accept inputs of arbitrary length and produce outputs of a fixed length:
\begin{align}
H:&\{0,1\}^*\to \{0,1\}^\ell\\
m&\mapsto H(m)
\end{align}
where $\ell$ is the output size of the hash function in bits.
Hash functions based on the Merkle–Damgård construction† use a compression function $f$ to achieve the fixed-size output:
- The message $M$ is padded and divided into fixed-length blocks $M_1,\ldots,M_n$ (the block length is a parameter of $f$ and need not equal $\ell$);
- $H_0$ is set to a fixed initial value;
- For $i = 1$ to $n$, compute $$H_i = f(H_{i-1},M_i)$$
- Output $H(M) = H_n$
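The iteration above can be sketched in a few lines of Python. This is only a toy illustration, not a real hash function: the block size, state size, all-zero initial value, and the compression function `compress` (truncated SHA-256 used as a stand-in for $f$) are all assumptions made for the example, as is the padding rule (a `0x80` byte, zero bytes, then the message bit length).

```python
import hashlib

BLOCK_SIZE = 16          # bytes per message block M_i (toy value, an assumption)
STATE_SIZE = 8           # bytes of chaining value H_i and of the output (toy value)
IV = bytes(STATE_SIZE)   # all-zero H_0, chosen only for illustration


def compress(state: bytes, block: bytes) -> bytes:
    """Toy stand-in for the compression function f(H_{i-1}, M_i)."""
    return hashlib.sha256(state + block).digest()[:STATE_SIZE]


def pad(message: bytes) -> bytes:
    """Append 0x80, zero bytes, and the 8-byte big-endian bit length."""
    bit_len = (8 * len(message)).to_bytes(8, "big")
    padded = message + b"\x80"
    # Add zeros so that the total length is a multiple of BLOCK_SIZE.
    padded += bytes(-(len(padded) + 8) % BLOCK_SIZE)
    return padded + bit_len


def md_hash(message: bytes) -> bytes:
    """Iterate H_i = f(H_{i-1}, M_i) over the padded blocks and return H_n."""
    padded = pad(message)
    state = IV
    for i in range(0, len(padded), BLOCK_SIZE):
        state = compress(state, padded[i:i + BLOCK_SIZE])
    return state


# The output length is the same for any input length.
print(len(md_hash(b"")), len(md_hash(b"a" * 1000)))
```

Whatever the input length, the output is always `STATE_SIZE` bytes, because only the final chaining value is returned.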
An output size that varies with the input size is hard to build and impractical for protocols. Building a hash function with a larger output and then truncating it is much safer, as in SHA-224 (note: its initial values differ from those of SHA-256).
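The note about SHA-224 can be checked quickly with Python's standard `hashlib` module. This small sketch compares SHA-224 with a plain 224-bit truncation of SHA-256 on the same message (the message `b"abc"` is an arbitrary choice):

```python
import hashlib

message = b"abc"  # arbitrary example input

# First 224 bits (28 bytes) of the SHA-256 digest.
truncated_sha256 = hashlib.sha256(message).digest()[:28]

# SHA-224 proper: same compression function, different initial values.
sha224 = hashlib.sha224(message).digest()

print(len(sha224) * 8)              # 224
print(truncated_sha256 == sha224)   # False
```

Both values are 224 bits long, but they differ because SHA-224 starts from its own initial values before the truncation is applied.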
By a simple counting argument, since the input space is much larger than the output space, the pigeonhole principle implies that more than one input value maps to the same hash value. Indeed, with arbitrary-size inputs, numerous inputs will hash to the same value. As pointed out in poncho's answer, what we want from hash functions is collision resistance:
- If it is hard to find two inputs $a$ and $b$ with $a \neq b$ such that $H(a) = H(b)$, then we have collision resistance.
Collisions are the easiest of the standard targets for an attacker, which makes collision resistance the hardest goal to achieve (see Joux 2004). There is a generic attack based on the birthday paradox: after about $2^{\ell/2}$ hash computations we expect a collision with probability around 50%. To resist the generic birthday attack, the output size must be twice the desired security level in bits. As an example, SHA-1 has a 160-bit output and therefore at most 80-bit security against the generic birthday attack; NIST no longer recommends it for applications that require collision resistance:
> SHA-1: Federal agencies should stop using SHA-1 for generating digital signatures, generating time stamps and for other applications that require collision resistance. Federal agencies may use SHA-1 for the following applications: verifying old digital signatures and time stamps, generating and verifying hash-based message authentication codes (HMACs), key derivation functions (KDFs), and random bit/number generation. Further guidance on the use of SHA-1 is provided in SP 800-131A.
>
> August 5, 2015
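The generic birthday bound is easy to observe on a deliberately weakened hash. The sketch below is an illustration only: `toy_hash` is a hypothetical function that truncates SHA-256 to $\ell = 24$ bits, so a collision should typically appear after a few thousand attempts (on the order of $2^{12}$), far fewer than the $2^{24}$ possible outputs.

```python
import hashlib
from itertools import count

TRUNC_BYTES = 3  # ell = 24 bits, small enough to collide quickly


def toy_hash(data: bytes) -> bytes:
    """Hypothetical weak hash: SHA-256 truncated to TRUNC_BYTES bytes."""
    return hashlib.sha256(data).digest()[:TRUNC_BYTES]


def find_collision():
    """Hash distinct messages until two of them share a digest."""
    seen = {}  # digest -> first message that produced it
    for i in count():
        msg = str(i).encode()
        digest = toy_hash(msg)
        if digest in seen:
            return seen[digest], msg, i + 1
        seen[digest] = msg


a, b, tries = find_collision()
print(a, b, tries)
print(toy_hash(a) == toy_hash(b))
```

The same table-based search against the full 160-bit SHA-1 output would need about $2^{80}$ hash computations, which is the generic bound mentioned above.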
For other properties of hash functions, see the question "How do hashes really ensure uniqueness?".
† SHA-3 instead uses the sponge construction.