A: What if we choose $H$ to be a fixed function, but instead of a fixed matrix we choose a fixed ‘cryptographic hash function’ like SHAKE128? It is hard to say, because unlike universal hash families, cryptographic hash functions are designed not to have any interesting structural properties, such as the linearity of matrix multiplication.
Suppose we model it as a uniform random function from $l$-bit strings to $k$-bit strings. For every distinct input $x$, the output $H(x)$ is an independent uniform random $k$-bit string, so every possible output is equiprobable. In some sense this model is much stronger than a randomness extractor, even though the adversary has access to $H$ as a public oracle. Like all models, it is wrong, but some models are useful.
Suppose $X$ has $s \cdot l$ bits of min-entropy. What is the expected min-entropy of $H(X)$, over uniform random choice of $H$? To make a conservative estimate, let's say $X$ is really just a uniform random $\lambda$-bit string, where $\lambda = \lfloor s \cdot l \rfloor$. For any fixed function $h$, the min-entropy of $h(X)$ is $-\max_y \log_2 \Pr[h(X) = y]$. Note that $\Pr[h(X) = y] \leq (C(h) + 1)/2^k$, where $C(h)$ is the number of possible $k$-bit strings that $h$ does not reach, since in the worst case for min-entropy, there is a single output that is reached $C(h) + 1$ times instead. Consequently, $$-\max_y \log_2 \Pr[h(X) = y] \geq -\log_2 \frac{C(h) + 1}{2^k} = k - \log_2 (C(h) + 1).$$ Since $\log_2$ is concave, Jensen's inequality gives $E[\log_2 (C(H) + 1)] \leq \log_2 (E[C(H)] + 1)$. What's $E[C(H)]$? First, the probability that we do not reach a particular output $y$ is
\begin{align*}
\Pr[\lnot\exists x. H(x) = y]
&= \Pr[\forall x. H(x) \ne y] \\
&= \prod_x \Pr[H(x) \ne y] \\
&= \prod_x (1 - \Pr[H(x) = y]) \\
&= \prod_x (1 - 1/2^k) \\
&= (1 - 1/2^k)^{2^\lambda}.
\end{align*}
By linearity of expectations, this is also the expected fraction of unreachable outputs, so the expected number of unreachable outputs is $E[C(H)] = 2^k (1 - 1/2^k)^{2^\lambda}$. When $\lambda \leq k$ this puts an unsatisfying bound on the min-entropy. That happens because, to set a hard bound, we made an extremely conservative estimate: we pretended that every input involved in any collision lands in one colossal collision on a single output.
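The formula for $E[C(H)]$ can be sanity-checked by brute force for small parameters. The following is a minimal sketch, not part of the argument itself: it samples uniform random functions with Python's `random` module and compares the average number of unreached outputs against the formula. The function names and the parameters $k = 8$, $\lambda = 10$ are illustrative assumptions.

```python
import random


def count_unreached(k: int, lam: int, rng: random.Random) -> int:
    """Sample a uniform random function from lam-bit strings to k-bit strings
    and return C(h), the number of k-bit outputs it never reaches."""
    # Each of the 2^lam inputs gets an independent uniform k-bit output.
    reached = {rng.getrandbits(k) for _ in range(2 ** lam)}
    return 2 ** k - len(reached)


def expected_unreached(k: int, lam: int) -> float:
    """E[C(H)] = 2^k (1 - 1/2^k)^(2^lam), evaluated in floating point."""
    return 2 ** k * (1 - 2 ** -k) ** (2 ** lam)


if __name__ == "__main__":
    rng = random.Random(1)  # fixed seed so runs are reproducible
    k, lam, trials = 8, 10, 500  # illustrative small parameters
    mean = sum(count_unreached(k, lam, rng) for _ in range(trials)) / trials
    print(f"empirical mean of C(H): {mean:.3f}")
    print(f"formula for E[C(H)]:    {expected_unreached(k, lam):.3f}")
```

The two printed numbers should agree up to sampling noise. This only checks the random-function model; it says nothing about SHAKE128 itself.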
But $E[C(H)]$ rapidly goes to zero as $\lambda$ increases. Specifically, for $k \geq 1$, we have $1 - 1/2^k \leq e^{-1/2^k}$, and $e^{-t} \leq 2^{-t}$ for $t \geq 0$, so $$E[C(H)] \leq 2^k e^{-2^\lambda/2^k} = 2^k e^{-2^{\lambda - k}} \leq 2^{k - 2^{\lambda - k}}.$$ Consequently, as long as $\lambda \geq k + \log_2 k$, we have $2^{\lambda - k} \geq k$ and hence $E[C(H)] \leq 1$, so the expected min-entropy is at least $k - \log_2 (E[C(H)] + 1) \geq k - 1$.
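To see how quickly this kicks in at realistic sizes, here is a small numerical sketch that works in the $\log_2$ domain so that $k = 128$ or $k = 256$ does not underflow. The helper names are hypothetical, and `math.log1p` is used only to keep $\log(1 - 2^{-k})$ accurate in floating point.

```python
import math


def log2_expected_unreached(k: int, lam: int) -> float:
    """log2 of E[C(H)] = 2^k (1 - 1/2^k)^(2^lam)."""
    # log1p(-x) is accurate even when x = 2^-k is far below machine epsilon.
    return k + (2.0 ** lam) * math.log1p(-(2.0 ** -k)) / math.log(2)


def log2_bound(k: int, lam: int) -> float:
    """log2 of the upper bound 2^(k - 2^(lam - k))."""
    return k - 2.0 ** (lam - k)


def min_lambda(k: int) -> int:
    """Smallest integer lam satisfying lam >= k + log2(k)."""
    return k + math.ceil(math.log2(k))


for k in (8, 64, 128, 256):
    lam = min_lambda(k)
    exact = log2_expected_unreached(k, lam)
    bound = log2_bound(k, lam)
    # At this lam the bound is at most 2^0 = 1, so E[C(H)] <= 1.
    assert exact <= bound <= 0
    print(f"k={k:3d} lam={lam:3d} log2 E[C(H)]={exact:9.2f} log2 bound={bound:9.2f}")
```

Increasing `lam` by even one more bit doubles $2^{\lambda - k}$ in the exponent, which is why the expected number of unreached outputs collapses so fast.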
For a distribution on $k$-bit strings with min-entropy $k - \delta$, where $\delta = \log_2 (E[C(H)] + 1)$, can we set an upper bound on the total variation distance from uniform? Suppose for simplicity that $N = 2^{k - \delta}$ is an integer; the greatest TVD is attained by assigning probability $1/N$ to $N$ of the $k$-bit strings, and probability $0$ to the remaining $2^k - N$, so that the TVD is bounded by
\begin{multline}
\varepsilon
= \frac 1 2 N \biggl|\frac{1}N - \frac{1}{2^k}\biggr| + \frac 1 2 (2^k - N) \biggl|0 - \frac{1}{2^k}\biggr| \\
= \frac 1 2 N \frac{2^k - N}{N 2^k} + \frac 1 2 \cdot \frac{2^k - N}{2^k}
= \frac 1 2 \cdot \frac{2^k - N}{2^k} + \frac 1 2 \cdot \frac{2^k - N}{2^k} \\
= \frac{2^k - N}{2^k}
= \frac{2^k - 2^{k - \delta}}{2^k}
= 1 - 2^{-\delta}.
\end{multline}
For $\delta = 1$, the TVD is bounded by $\varepsilon = 1 - 2^{-1} = 1/2$, but the bound $\varepsilon = 1 - 2^{-\delta}$ rapidly approaches zero as $\delta \to 0$.
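The arithmetic in this bound is easy to check exactly. The sketch below uses Python's `fractions.Fraction` to compute the TVD between the uniform distribution on $k$-bit strings and a distribution that is flat on $N = 2^{k - \delta}$ of them, and confirms it equals $1 - 2^{-\delta}$ for integer $\delta$. The function name and the choice $k = 10$ are assumptions for illustration.

```python
from fractions import Fraction


def tvd_from_uniform(k: int, n_support: int) -> Fraction:
    """TVD between uniform on all 2^k strings and the distribution that puts
    probability 1/n_support on n_support strings and 0 on the rest."""
    size = 2 ** k
    on_support = n_support * abs(Fraction(1, n_support) - Fraction(1, size))
    off_support = (size - n_support) * Fraction(1, size)
    return (on_support + off_support) / 2


k = 10  # illustrative output length
for delta in (0, 1, 2, 5):
    n = 2 ** (k - delta)
    eps = tvd_from_uniform(k, n)
    # Matches the closed form 1 - 2^(-delta) derived above.
    assert eps == 1 - Fraction(1, 2 ** delta)
    print(f"delta={delta}: TVD = {eps}")
```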
A more practical way to put it is: The security conjecture of, e.g., SHAKE128 is that there is no better attack at guessing $X$ given $\operatorname{SHAKE128-}\!k(X)$ than a generic search through all possible values of $X$, whose expected cost is at least $2^{H_\infty[X]}/2$ trials, or $2^k/2$, or $2^{256}/2$, whichever is smaller.
(Of course, if there are $n$ targets $X_1, X_2, \dots, X_n$, the cost to find at least one of them given $\operatorname{SHAKE128-}\!k(X_1),$ $\operatorname{SHAKE128-}\!k(X_2),$ $\dots,$ $\operatorname{SHAKE128-}\!k(X_n)$ is $2^{\min\{256, H_\infty[X_1], \dots\}}/(2n)$ instead, i.e. there is a standard factor of $n$ cost reduction for an $n$-way multi-target attack. This is why it is a good idea either to use 256-bit seeds $X_i$, or to use a globally unique per-seed salt.)
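As a rough illustration of these cost estimates, the sketch below computes the $\log_2$ of the generic multi-target search cost from the formula above, and shows one way to compute $\operatorname{SHAKE128-}\!k$ with Python's `hashlib`. The salting convention (a fixed-length 16-byte salt prepended to the seed) and all function names are assumptions made for this example, not something the security conjecture prescribes.

```python
import hashlib
import math


def shake128_k(data: bytes, k_bits: int) -> bytes:
    """SHAKE128 with k_bits of output (this sketch handles whole bytes only)."""
    if k_bits % 8:
        raise ValueError("k_bits must be a multiple of 8 in this sketch")
    return hashlib.shake_128(data).digest(k_bits // 8)


def salted_shake128_k(salt: bytes, seed: bytes, k_bits: int) -> bytes:
    """Hypothetical salting convention: fixed-length salt || seed."""
    if len(salt) != 16:
        raise ValueError("this sketch assumes a 16-byte salt")
    return shake128_k(salt + seed, k_bits)


def log2_multi_target_cost(min_entropies, cap_bits=256.0):
    """log2 of 2^min{cap, H_inf[X_1], ...} / (2n) for n unsalted targets."""
    n = len(min_entropies)
    return min(cap_bits, *min_entropies) - 1 - math.log2(n)


# 2^20 unsalted targets: 128-bit seeds lose 20 bits, 256-bit seeds keep margin.
n_targets = 2 ** 20
print(log2_multi_target_cost([128.0] * n_targets))  # 128 - 1 - 20 = 107.0
print(log2_multi_target_cost([256.0] * n_targets))  # 256 - 1 - 20 = 235.0

# With distinct salts, equal seeds no longer give equal outputs, so one
# guess cannot be tested against all targets with a single hash evaluation.
seed = b"example seed"
out_a = salted_shake128_k(bytes(16), seed, 256)
out_b = salted_shake128_k(bytes([1]) + bytes(15), seed, 256)
print(out_a != out_b)
```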