Presumably the author assumed that all readers are familiar with diagonal sequence constructions and therefore just gave a broad outline, skipping some important details of the procedure.
You don't choose the subsequences independently. One uses the diagonal sequence construction to make sure that the selected subsequence (the diagonal sequence) is a subsequence - except for finitely many terms at the start - of all of the selected subsequences. To ensure that, every subsequence $(a_{n(i+1,j)})_{j\in \mathbb{N}}$ must be a subsequence of $(a_{n(i,j)})_{j\in \mathbb{N}}$.
So you start by choosing a subsequence $(a_{n(0,j)})_{j\in \mathbb{N}}$ of $(a_j)_{j\in \mathbb{N}}$ such that the sequence $\bigl(f_0(a_{n(0,j)})\bigr)_{j\in \mathbb{N}}$ converges. Then you choose a subsequence $(a_{n(1,j)})_{j\in \mathbb{N}}$ of $(a_{n(0,j)})_{j\in \mathbb{N}}$ such that the sequence $\bigl(f_1(a_{n(1,j)})\bigr)_{j\in \mathbb{N}}$ converges. Since $(a_{n(1,j)})$ is a subsequence of $(a_{n(0,j)})$, the sequence $\bigl(f_0(a_{n(1,j)})\bigr)$ is also convergent. One continues in that way: if the subsequences $(a_{n(i,j)})_{j\in\mathbb{N}}$ for $i \leqslant k$ have been selected, then in the next step one chooses a subsequence $(a_{n(k+1,j)})_{j\in \mathbb{N}}$ of $(a_{n(k,j)})_{j\in \mathbb{N}}$ such that $\bigl(f_{k+1}(a_{n(k+1,j)})\bigr)_{j\in \mathbb{N}}$ converges. By construction, since $(a_{n(k+1,j)})$ is a subsequence of $(a_{n(i,j)})$ for $i \leqslant k$, it follows that $\bigl(f_i(a_{n(k+1,j)})\bigr)$ converges for all $i \leqslant k+1$.
Then the diagonal sequence - given by $b_j = a_{n(j,j)}$ - is a subsequence of $(a_j)$, and, importantly, for each $i\in \mathbb{N}$ the tail $(b_j)_{j \geqslant i}$ is a subsequence of $(a_{n(i,j)})_{j\in \mathbb{N}}$. From that fact, we obtain that the sequence $\bigl(f_i(b_j)\bigr)_{j\in\mathbb{N}}$ converges for every $i$.
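To see the construction at work, here is a small illustration that is not part of the original argument; the choice of $a_j$ and $f_i$ is an assumption made purely for the example. Take $a_j = j \in \mathbb{N}$ and let $f_i(a)$ be the $i$-th binary digit of $a$ (counted from the least significant digit, starting with $i = 0$). One admissible choice of nested subsequences, keeping at each stage the terms whose $i$-th digit is $0$, is

$$n(i,j) = 2^{i+1} j, \qquad f_i(a_{n(i,j)}) = 0 \text{ for all } j.$$

The diagonal sequence is then

$$b_j = a_{n(j,j)} = 2^{j+1} j, \qquad\text{and for } j \geqslant i:\quad b_j = 2^{i+1}\cdot\bigl(2^{j-i} j\bigr) = a_{n(i,\,2^{j-i} j)}.$$

Since $j \mapsto 2^{j-i} j$ is strictly increasing on $\{j \geqslant i\}$, the tail $(b_j)_{j \geqslant i}$ is a subsequence of $(a_{n(i,j)})_{j\in\mathbb{N}}$, and $f_i(b_j) = 0$ for all $j \geqslant i$. So $\bigl(f_i(b_j)\bigr)_{j\in\mathbb{N}}$ converges for every $i$, although no single stage of the construction controls all $f_i$ at once.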
Let's have a complete proof (not completely complete, we're not proving all tools we use) of Šmulian's theorem here:
Definition: Let $E$ be a topological vector space and $E'$ its (topological) dual. A set $S\subset E'$ is called separating if $\bigcap\limits_{\lambda\in S} \ker \lambda = \{0\}$.
Your reference calls such sets total, but since there is another notion of totality in topological vector spaces, namely that the set spans a dense subspace, I prefer to use a different term. These two notions aren't unrelated, however. If $E$ is such that $E'$ is separating (e.g. if $E$ is a Hausdorff locally convex space), then a subset of $E'$ is separating if and only if its span is $\sigma(E',E)$-dense, i.e. if and only if it is $\sigma(E',E)$-total.
Definition: We say that a topological vector space $E$ has a countable separation, if there is a countable separating subset of $E'$.
(I don't like the term "countable separation", but it's the best I could come up with. Mathematics is easier than computer science in so far as we don't have to deal with cache invalidation, but naming things and off-by-one-errors are still hard.) We note in passing that a TVS having a countable separation is necessarily Hausdorff.
Lemma: Let $(X,\tau)$ be a quasicompact space. If $\tau' \subset \tau$ is a Hausdorff topology, then $\tau' = \tau$.
Proof: The identity $\operatorname{id} \colon (X,\tau) \to (X,\tau')$ is a continuous bijection, since $\tau'\subset\tau$. It is also closed: if $F$ is $\tau$-closed, then $F$ is $\tau$-quasicompact, hence $\tau'$-quasicompact as a continuous image, and therefore $\tau'$-closed because $\tau'$ is Hausdorff. A continuous closed bijection is a homeomorphism, so $\tau' = \tau$.
The main part of the proof of Šmulian's theorem is
Theorem: Let $E$ be a TVS having a countable separation. Then every (relatively) weakly compact subset of $E$ is (relatively) weakly sequentially compact.
We give two proofs of this theorem, one more topological, and a slightly modified version of the proof in your reference.
First proof: Let $S\subset E'$ be a countable separating set. For $\lambda \in E'$, we let $p_\lambda \colon x \mapsto \lvert \lambda(x)\rvert$. This is a $\sigma(E,E')$-continuous seminorm on $E$, and the countable family $\{p_{\lambda} : \lambda \in S\}$ defines a first-countable locally convex topology $\tau_S$ on $E$. Since $S$ is separating, $\tau_S$ is a Hausdorff topology, hence metrisable. Since all of the seminorms generating $\tau_S$ are $\sigma(E,E')$-continuous, it follows that $\tau_S \subset \sigma(E,E')$. In fact, $\tau_S = \sigma(E, \operatorname{span} S)$, so the inclusion is strict, unless $S$ spans $E'$ (which never happens if $E$ is an infinite-dimensional normed space).
If $A\subset E$ is weakly compact, by the lemma we have $\tau_S\lvert_A = \sigma(E,E')\lvert_A$, hence $\sigma(E,E')\lvert_A$ is metrisable. Metrisable compact spaces are sequentially compact.
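The metrisability of $\tau_S$ can be made explicit. As a sketch, assuming an enumeration $S = \{\lambda_n : n \in \mathbb{N}\}$ (the enumeration and the particular formula are choices made here, not taken from the proof above), one standard metric inducing $\tau_S$ is

$$d(x,y) = \sum_{n=0}^{\infty} 2^{-n} \min\bigl\{1, \lvert \lambda_n(x-y)\rvert\bigr\}.$$

Here $d(x,y) = 0$ forces $\lambda_n(x - y) = 0$ for all $n$, hence $x = y$ because $S$ is separating, and $d(x_j, x) \to 0$ holds if and only if $\lambda_n(x_j) \to \lambda_n(x)$ for every $n$. For a sequence in a weakly compact set $A$ with limit in $A$, the latter is therefore equivalent to weak convergence.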
Second proof: This is, if I understood correctly, essentially the proof in Dunford-Schwartz. I only include it because I think the version in Nygaard's paper that you linked to is not detailed enough (and hence somewhat unclear) for the beginner. In principle, as a more topologically-minded person, I find the first proof much more instructive.
Let $\Lambda = \{ \lambda_n : n \in \mathbb{N}\}$ be a countable separating subset of $E'$, and $A\subset E$ weakly compact. Let $(a_n)_{n\in \mathbb{N}}$ be a sequence in $A$ (if $A = \varnothing$, there is nothing to prove). For every $i\in\mathbb{N}$, $\lambda_i(A)$ is a compact subset of $\mathbb{C}$ (or $\mathbb{R}$), so from every sequence in $\lambda_i(A)$ we can extract a convergent subsequence. We begin with the sequence $\bigl(\lambda_0(a_n)\bigr)_{n\in\mathbb{N}}$ and find a strictly increasing map $\sigma_0 \colon \mathbb{N}\to \mathbb{N}$ such that the sequence $\bigl(\lambda_0(a_{\sigma_0(n)})\bigr)_{n\in\mathbb{N}}$ is convergent. We let $k(0,n) := \sigma_0(n)$ for $n\in \mathbb{N}$. Having found $k(j,\,\cdot\,)$ for $0 \leqslant j \leqslant i$, we consider the sequence $\bigl(\lambda_{i+1}(a_{k(i,n)})\bigr)_{n\in\mathbb{N}}$ and by the (sequential) compactness of $\lambda_{i+1}(A)$ we find a strictly increasing $\sigma_{i+1} \colon \mathbb{N} \to \mathbb{N}$ such that $\bigl(\lambda_{i+1}(a_{k(i,\sigma_{i+1}(n))})\bigr)_{n\in \mathbb{N}}$ converges. We set $k(i+1,n) := k(i,\sigma_{i+1}(n))$. Continuing in this way, we obtain a map $k \colon \mathbb{N}\times \mathbb{N} \to \mathbb{N}$ such that for every $i\in \mathbb{N}$ the sequence $s_i \colon n \mapsto k(i,n)$ is strictly increasing, $\bigl(\lambda_{i}(a_{s_{i}(n)})\bigr)_{n \in \mathbb{N}}$ converges, and $s_{i+1}$ is a subsequence of $s_i$. Taking the diagonal sequence, we find a subsequence $(b_n)_{n \in \mathbb{N}}$ of $(a_n)_{n \in \mathbb{N}}$, namely $b_n = a_{k(n,n)}$, such that $\bigl(\lambda_i(b_n)\bigr)_{n\in \mathbb{N}}$ converges for every $i \in \mathbb{N}$, since the tail $(b_n)_{n \geqslant i}$ is a subsequence of $(a_{k(i,n)})_{n\in\mathbb{N}}$.
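That the tail of the diagonal sequence is a subsequence of the $i$-th selected subsequence can be verified directly from the recursion $k(i+1,n) = k(i,\sigma_{i+1}(n))$. Unwinding it (the abbreviation $m_i(n)$ is introduced here only for this check) gives, for $n \geqslant i$,

$$k(n,n) = k\bigl(n-1,\sigma_n(n)\bigr) = \dots = k\bigl(i, m_i(n)\bigr), \qquad m_i(n) := (\sigma_{i+1}\circ \dots \circ \sigma_n)(n),$$

with $m_i(i) = i$. Every strictly increasing $\sigma \colon \mathbb{N}\to\mathbb{N}$ satisfies $\sigma(n) \geqslant n$, so

$$m_i(n+1) = (\sigma_{i+1}\circ \dots \circ \sigma_n)\bigl(\sigma_{n+1}(n+1)\bigr) \geqslant (\sigma_{i+1}\circ \dots \circ \sigma_n)(n+1) > m_i(n).$$

Thus $(b_n)_{n \geqslant i} = \bigl(a_{k(i,m_i(n))}\bigr)_{n\geqslant i}$ is a subsequence of $(a_{k(i,n)})_{n\in\mathbb{N}}$. The case $i = 0$, where $k(0,m_0(n)) = \sigma_0(m_0(n))$ is strictly increasing in $n$, shows in particular that $(b_n)$ is a subsequence of $(a_n)$.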
We assert that the sequence $(b_n)$ is weakly convergent. For $m \in \mathbb{N}$, let $F_m = \overline{\{ b_n : n \geqslant m\}}$, where the closure is taken with respect to $\sigma(E,E')$. Then $(F_m)$ is a nested sequence of nonempty (weakly) compact sets, so $F := \bigcap\limits_{m\in \mathbb{N}} F_m \neq \varnothing$. Let $y\in F$. We claim that $\lambda_i(y) = \lim\limits_{n\to \infty} \lambda_i(b_n)$ for every $i\in \mathbb{N}$. For that, fix arbitrary $i \in \mathbb{N}$ and $\varepsilon > 0$. Since $\bigl(\lambda_i(b_n)\bigr)$ is convergent, we can find an $m\in\mathbb{N}$ such that $\lvert \lambda_i(b_n) - \lambda_i(b_r)\rvert \leqslant \varepsilon$ for all $n,r \geqslant m$. Since $y \in F_m$, we can find an $n \geqslant m$ with $\lvert \lambda_i(b_n) - \lambda_i(y)\rvert \leqslant \varepsilon$. It follows that $\bigl\lvert \lambda_i(y) - \lim\limits_{n\to\infty} \lambda_i(b_n)\bigr\rvert \leqslant 2\varepsilon$. The claim follows because $i$ and $\varepsilon$ were arbitrary. Since the claim applies to every point of $F$ and $\Lambda$ is separating, we further conclude that $F = \{y\}$. Now let $\lambda \in E'$. Assume for the sake of contradiction that $\bigl(\lambda(b_n)\bigr)$ does not converge to $\lambda(y)$. Then we can extract a subsequence $(c_n)$ such that $\lvert \lambda(c_n) - \lambda(y)\rvert \geqslant \delta$ for all $n$ and some $\delta > 0$. By extracting a further subsequence, we can assume that $\bigl(\lambda(c_n)\bigr)$ converges. But since $(c_n)$ is a subsequence of $(b_n)$, we have $\{c_n : n \geqslant m\} \subset \{b_n : n \geqslant m\}$ for all $m\in \mathbb{N}$, and hence
$$\varnothing \neq \bigcap_{m \in \mathbb{N}} \overline{\{ c_n : n \geqslant m\}} \subset \bigcap_{m \in \mathbb{N}} F_m = \{y\},$$
and the argument above shows that $\lim\limits_{n\to\infty} \lambda(c_n) = \lambda(y)$, contradicting the assumption. Thus indeed $(b_n)$ converges weakly to $y$.
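The estimate used twice in this proof (first for the functionals in $\Lambda$ along $(b_n)$, then for $\lambda$ along $(c_n)$) can be written out once in general form. The symbols $\mu$, $(x_n)$ and $\ell$ are shorthand introduced here: $\mu$ is the functional, $(x_n)$ the sequence, $\ell = \lim\limits_{n\to\infty}\mu(x_n)$, and $y$ lies in the weak closure of every tail $\{x_n : n \geqslant m\}$. Given $\varepsilon > 0$, choose $m$ with $\lvert \mu(x_n) - \mu(x_r)\rvert \leqslant \varepsilon$ for all $n, r \geqslant m$; letting $r \to \infty$ gives $\lvert \mu(x_n) - \ell\rvert \leqslant \varepsilon$ for all $n \geqslant m$. Choosing $n \geqslant m$ with $\lvert \mu(x_n) - \mu(y)\rvert \leqslant \varepsilon$, we get

$$\lvert \mu(y) - \ell\rvert \leqslant \lvert \mu(y) - \mu(x_n)\rvert + \lvert \mu(x_n) - \ell\rvert \leqslant 2\varepsilon .$$

Since $\varepsilon$ was arbitrary, $\mu(y) = \ell$.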
To complete the proof of Šmulian's theorem, we now need the
Proposition: Let $X$ be a separable Banach (or normed, completeness is irrelevant) space. Then $X$ has a countable separation.
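Proof: Let $\{x_n : n \in \mathbb{N}\}$ be a countable dense subset of the unit sphere of $X$; such a set exists because subsets of separable metric spaces are separable (for $X = \{0\}$ there is nothing to prove). For each $n$, choose $\lambda_n$ in the unit sphere of $X'$ with $\lambda_n(x_n) = 1$. Such a $\lambda_n$ exists by the Hahn-Banach theorem. Then $\{\lambda_n : n \in \mathbb{N}\}$ is separating. Since the kernels are linear subspaces, it suffices to show that no unit vector lies in all of them. So let $\lVert x\rVert = 1$ and choose $n$ so that $\lVert x_n - x\rVert < \frac{1}{2}$. Then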
$$\lvert \lambda_n(x)\rvert = \lvert \lambda_n(x_n) - \lambda_n(x_n - x)\rvert \geqslant \lvert \lambda_n(x_n)\rvert - \lvert \lambda_n(x_n - x)\rvert \geqslant 1 - \lVert\lambda_n\rVert\cdot \lVert x_n - x\rVert > \frac{1}{2},$$
so $\lambda_n(x) \neq 0$.
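Separability is sufficient but not necessary for a countable separation. Two standard examples, added here as illustrations and not taken from the reference: on $C[0,1]$ the point evaluations at rational points,

$$\delta_q \colon f \mapsto f(q), \qquad q \in \mathbb{Q}\cap[0,1],$$

form a countable separating set, because a continuous function vanishing on a dense set vanishes identically. On the non-separable space $\ell^\infty$ the coordinate functionals

$$e_n' \colon (x_m)_{m\in\mathbb{N}} \mapsto x_n, \qquad n \in \mathbb{N},$$

are separating, since $e_n'(x) = 0$ for all $n$ means $x = 0$. So the theorem on spaces with a countable separation applies to $\ell^\infty$ directly, without passing to a separable subspace.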
Finally, we come to the
Theorem (Šmulian): Let $X$ be a Banach (or normed, completeness is unimportant here) space. Then every (relatively) weakly compact subset of $X$ is (relatively) weakly sequentially compact.
Proof: Let $A \subset X$ be weakly compact and $(a_n)_{n\in \mathbb{N}}$ a sequence in $A$ (again, for $A = \varnothing$ there is nothing to prove). Let $Y := \overline{\operatorname{span} \{ a_n : n \in \mathbb{N}\}}$. Then $Y$ is a separable closed subspace of $X$. Since $Y$ is norm-closed and convex, it is weakly closed, and hence $B = A \cap Y$, which contains every $a_n$, is $\sigma(X,X')$-compact. By Hahn-Banach, $\sigma(X,X')\lvert_Y = \sigma(Y,Y')$, so by the proposition and the previous theorem, $B$ is $\sigma(Y,Y')$-sequentially compact. We can therefore extract a subsequence $(b_n)$ of $(a_n)$ such that $$b_n \xrightarrow{\sigma(Y,Y')} b \in B.$$ Since $\sigma(X,X')\lvert_Y = \sigma(Y,Y')$, we have $b_n \xrightarrow{\sigma(X,X')} b$, and the proof is complete.
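The identity $\sigma(X,X')\lvert_Y = \sigma(Y,Y')$ used in this proof amounts to the following equality of sets of functionals, spelled out here as a reminder:

$$\{\lambda\lvert_Y : \lambda \in X'\} = Y'.$$

The inclusion $\subset$ holds because restrictions of continuous linear functionals are continuous, and $\supset$ is the Hahn-Banach extension theorem: every $\mu \in Y'$ has an extension $\lambda \in X'$ (which can even be chosen with $\lVert\lambda\rVert = \lVert\mu\rVert$). Both topologies on $Y$ are therefore the initial topology with respect to the same family of functionals.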