A universal hash family[1] (paywall-free) $H_r\colon U \to S$ is a set of functions on an input space $U$ into a hash space $S$ indexed by a random $r$ so that the collision probability $\Pr[H_r(a) = H_r(b)]$ is bounded by a small constant like $1/\#S$ for any distinct $a \ne b$. Sometimes, we are more concerned with a bound on the difference probability $\Pr[H_r(a) - H_r(b) = \delta]$ for any inputs $a \ne b$ and difference $\delta$, where $S$ has some group structure, e.g. if we are going to use this to make a message authenticator[2]; this implies a bound on the collision probability by choosing $\delta = 0$.
A typical example is a polynomial evaluation hash:
Fix a field $k$, let the hash space $S$ be the field $k$, and let the input space $U$ be the set of polynomials of degree at most $\ell$ over $k$ with zero constant term; then define $H_r(p) = p(r)$ with $r$ uniformly distributed in $k$. If $H_r(p) = H_r(q)$ for distinct $p \ne q$, then $p(r) - q(r) = 0$, so $r$ is a root of the nonzero polynomial $p - q$ of degree at most $\ell$. Such a polynomial has at most $\ell$ roots, and $r$ takes each value with probability $1/\#k$, so $$\Pr[H_r(p) = H_r(q)] \leq \frac{\ell}{\#k}.$$
To evaluate a polynomial $a_1 r^\ell + a_2 r^{\ell - 1} + \dots + a_\ell r$, you can use Horner's rule $(\cdots ((a_1 \cdot r + a_2) \cdot r + a_3) \cdots + a_\ell) \cdot r$, which costs exactly $\ell$ multiplications and $\ell - 1$ additions in the field $k$. The simple algebraic structure means you can easily vectorize it in a $t$-way vector unit by precomputing powers $(r, r^2, r^3, \dots, r^t)$ of $r$.
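As a rough illustration of Horner's rule for this hash, here is a minimal Python sketch. The function names `poly_eval_hash` and `poly_eval_naive` are hypothetical, and the prime $2^{130} - 5$ is borrowed from Poly1305 only as a convenient field size; this is not Poly1305 itself, which also specifies message encoding, clamping of $r$, and a final masking step.

```python
import secrets

# Assumption: a prime field of size 2^130 - 5, chosen only for illustration.
P = 2**130 - 5

def poly_eval_hash(coeffs, r, p=P):
    """Evaluate a_1 r^l + a_2 r^(l-1) + ... + a_l r (mod p) by Horner's rule.

    coeffs is [a_1, ..., a_l]; the constant term is zero by construction,
    because every step ends with a multiplication by r.
    """
    acc = 0
    for a in coeffs:
        acc = (acc + a) * r % p
    return acc

def poly_eval_naive(coeffs, r, p=P):
    """Same polynomial, evaluated term by term, as a cross-check."""
    l = len(coeffs)
    return sum(a * pow(r, l - i, p) for i, a in enumerate(coeffs)) % p

# Hypothetical usage: a secret uniform r and a three-coefficient message.
r = secrets.randbelow(P)
message = [0x1234, 0x5678, 0x9ABC]
tag = poly_eval_hash(message, r)
```

Python integers are not constant-time, so this only shows the arithmetic, not a safe implementation.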
Two popular polynomial evaluation hashes are Poly1305[3] and GHASH[4], over fields of size $\#k \approx 2^{128}$. They are so popular that your web browser is probably using at least one of them right now to communicate with the crypto.stackexchange.com web servers. Poly1305 uses the prime field $\mathbb Z/(2^{130} - 5)\mathbb Z$ which admits fast software implementations; GHASH uses the binary field $\operatorname{GF}(2^{128})$ which admits cheap hardware implementations. They can process data at under 1 cycle per byte on typical CPUs, many times faster than any collision-resistant hash. Even the fastest contenders like BLAKE2b haven't broken 2.5 cpb on eBACS with the fastest implementations on the fastest CPUs, even on messages many times longer than yours, and they usually run more like 6-12 cpb on message lengths closer to yours.
A CRC works too—viewed this way, it is called a polynomial division hash[5][6]:
Fix a hash length $n$, let the input space $U$ be the set of polynomials of degree at most $\ell$ over $\operatorname{GF}(2)$, and let the hash space $S$ be the set of polynomials of degree less than $n$ over $\operatorname{GF}(2)$; then define $H_r(p) = (p\cdot x^n) \bmod r$, where $r$ is a uniform random irreducible polynomial of degree $n$, of which there are $\pi_2(n)$ possibilities.*
Note that $H_r(a + b) = H_r(a) + H_r(b)$, so $\Pr[H_r(p) = H_r(q)] = \Pr[H_r(p - q) = 0]$. In this event, $((p - q) \cdot x^n) \bmod r = 0$, so $r$ divides $(p - q) \cdot x^n$; an irreducible $r$ of degree $n > 1$ is coprime to $x^n$, so $r$ must divide $p - q$. Since $\operatorname{GF}(2)[x]$ is a unique factorization domain, $p - q$, a nonzero polynomial of degree at most $\ell$, has at most $\ell/n$ distinct irreducible factors of degree $n$, each chosen as $r$ with probability $1/\pi_2(n)$, so $$\Pr[H_r(p) = H_r(q)] = \Pr[H_r(p - q) = 0] \leq \frac{\ell/n}{\pi_2(n)}.$$ In the best case that $n$ is prime, $\pi_2(n) = (2^n - 2)/n$, so this reduces to $$\Pr[H_r(p) = H_r(q)] \leq \frac{\ell/n}{(2^n - 2)/n} = \frac{\ell}{2^n - 2}.$$ However, it is not convenient to use prime bit lengths. If $n = 2^d$ for some $d$, like $n = 128 = 2^7$, then $\pi_2(n) = (2^n - 2^{n/2})/n$, so we only have $$\Pr[H_r(p) = H_r(q)] \leq \frac{\ell/n}{(2^n - 2^{n/2})/n} = \frac{\ell}{2^n - 2^{n/2}}.$$ Fortunately, this is plenty for security.
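To make the definition and the collision argument concrete, here is a small Python sketch of $H_r(p) = (p \cdot x^n) \bmod r$. Polynomials over $\operatorname{GF}(2)$ are represented as integers with bit $i$ holding the coefficient of $x^i$. The names `gf2_mod` and `division_hash` are hypothetical, and the modulus $x^8 + x^4 + x^3 + x + 1$ (the AES field polynomial, which is irreducible) is fixed only for illustration; a real use would draw $r$ secretly and uniformly at random with a much larger $n$.

```python
def gf2_mod(a, r):
    """Remainder of a divided by r in GF(2)[x] (bitwise long division).

    Not constant time: the loop count depends on the data.
    """
    n = r.bit_length() - 1                       # degree of r
    while a.bit_length() - 1 >= n:
        a ^= r << (a.bit_length() - 1 - n)       # cancel the leading term
    return a

def division_hash(p, r):
    """H_r(p) = (p * x^n) mod r, where n = deg r."""
    n = r.bit_length() - 1
    return gf2_mod(p << n, r)

r = 0x11B                     # x^8 + x^4 + x^3 + x + 1, illustration only
p = 0b1011001110001111
q = p ^ (r << 3)              # q = p + x^3 * r, so r divides p - q

# Linearity: H_r(a + b) = H_r(a) + H_r(b), with + being XOR over GF(2).
assert division_hash(p ^ q, r) == division_hash(p, r) ^ division_hash(q, r)
# p and q collide because their difference is a multiple of r.
assert division_hash(p, r) == division_hash(q, r)
```

The second assertion shows the only way a collision can arise: the difference of the two inputs must be a multiple of the secret modulus.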
To compute a polynomial division hash, you need:
A method to choose irreducible polynomials uniformly at random, like Rabin's[7].
This is considerably costlier than choosing a (near-)uniform random element of a field. You can make it cheaper by choosing a product of smaller irreducible polynomials instead[8], at a cost to the collision probability that is exponential in the number of irreducible factors.
A method to compute division in $\operatorname{GF}(2)[x]$.
This is cheap in hardware, but most conventional hardware—like the Intel or ARMv8 CPU instructions—is limited to a fixed CRC polynomial so you usually can't take advantage of it unless you're designing custom hardware.
In software, the conventional approach is to precompute a table for byte-sized or word-sized inputs, but precomputing a table is costly (on top of the cost of choosing an irreducible polynomial uniformly at random), and using a table with secret inputs invites timing side channels. Going bit by bit in constant time instead is extremely expensive. (Of course, this is not much different from the situation with GHASH, which is why Poly1305 is safer than GHASH, and crypto_secretbox_xsalsa20poly1305 is safer than AES-GCM.)
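The first requirement above can be sketched by rejection sampling: draw a uniform monic polynomial of degree $n$ and keep it only if it passes an irreducibility test. The test below is the standard one usually attributed to Rabin ($f$ of degree $n$ is irreducible iff $x^{2^n} \equiv x \pmod f$ and $\gcd(x^{2^{n/q}} - x, f) = 1$ for every prime $q \mid n$); it is an assumption that this matches the procedure in [7] in detail. All function names are hypothetical, and the code is neither fast nor constant time.

```python
import secrets

def gf2_mod(a, r):
    """Remainder of a divided by r in GF(2)[x]; bit i = coefficient of x^i."""
    n = r.bit_length() - 1
    while a.bit_length() - 1 >= n:
        a ^= r << (a.bit_length() - 1 - n)
    return a

def gf2_mulmod(a, b, f):
    """a * b mod f, for a and b already reduced mod f (shift-and-add)."""
    acc = 0
    while b:
        if b & 1:
            acc ^= a
        b >>= 1
        a = gf2_mod(a << 1, f)
    return acc

def gf2_gcd(a, b):
    """Euclid's algorithm in GF(2)[x]."""
    while b:
        a, b = b, gf2_mod(a, b)
    return a

def prime_factors(n):
    """Set of distinct prime factors of n, by trial division."""
    out, d = set(), 2
    while d * d <= n:
        while n % d == 0:
            out.add(d)
            n //= d
        d += 1
    if n > 1:
        out.add(n)
    return out

def x_pow_2k(k, f):
    """x^(2^k) mod f, by k repeated squarings."""
    t = gf2_mod(0b10, f)
    for _ in range(k):
        t = gf2_mulmod(t, t, f)
    return t

def is_irreducible(f):
    n = f.bit_length() - 1
    if n < 1:
        return False
    x = gf2_mod(0b10, f)
    if x_pow_2k(n, f) != x:
        return False
    return all(gf2_gcd(f, x_pow_2k(n // q, f) ^ x) == 1
               for q in prime_factors(n))

def random_irreducible(n):
    """Uniform random irreducible polynomial of degree n over GF(2)."""
    while True:
        f = (1 << n) | secrets.randbits(n)   # uniform monic, degree n
        if is_irreducible(f):
            return f
```

Roughly one in $n$ candidates is irreducible, so the loop is expected to run about $n$ times, which is part of why key generation here is costlier than picking a field element.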
Beware: If the adversary learns anything about the hashes themselves, or finds two messages that collide, then any security of the universal hash evaporates[9]. For example, an adversary with a stopwatch may be able to ascertain whether two elements in a hash table collided, and exploit that for denial of service by flooding it with collisions (‘hash flood’). If that may happen, what you may want is a PRF, like HMAC-MD5, keyed BLAKE2, KMAC128, etc. PRFs are generally costlier than universal hashes but may be cheaper than collision-resistant hashes. A common example of a small PRF used for hash tables is SipHash[10]. Universal hashes can be used to cheaply extend short-input PRFs into long-input PRFs[11].
* The number $\pi_q(n)$ of irreducible polynomials of degree $n$ over the finite field $\mathbb F_q$ is[12][13] $$\frac 1 n \sum_{d \mid n} \mu(n/d) \, q^d,$$ where $\mu$ is the Möbius function.
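The counting formula can be checked numerically with a few lines of Python. This is a minimal sketch with hypothetical function names `mobius` and `count_irreducible`; it confirms the two closed forms used above for prime $n$ and for $n$ a power of two.

```python
def mobius(m):
    """Möbius function: 0 if m has a squared prime factor,
    otherwise (-1)^(number of prime factors)."""
    result, d = 1, 2
    while d * d <= m:
        if m % d == 0:
            m //= d
            if m % d == 0:
                return 0
            result = -result
        d += 1
    if m > 1:
        result = -result
    return result

def count_irreducible(n, q=2):
    """Number of monic irreducible polynomials of degree n over F_q."""
    total = sum(mobius(n // d) * q**d
                for d in range(1, n + 1) if n % d == 0)
    return total // n

# n prime: (2^n - 2) / n
assert count_irreducible(127) == (2**127 - 2) // 127
# n a power of two: (2^n - 2^(n/2)) / n
assert count_irreducible(128) == (2**128 - 2**64) // 128
```

For $n = 128$ only the divisors $d = 128$ and $d = 64$ contribute, since $\mu(n/d) = 0$ whenever $n/d$ is divisible by 4.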