The example linked in the question computes $\operatorname{SHA3-224}(M)$ for $M$ the 5-bit bitstring $\mathtt{11001}$.
FIPS 202 section 6.1 defines
$\operatorname{SHA3-224}(M)=\operatorname{KECCAK}[448](M\mathbin\|\mathtt{01},224)$
so what is computed is $\operatorname{KECCAK}[448](\mathtt{11001}\mathbin\|\mathtt{01},224)$, that is $\operatorname{KECCAK}[448](\mathtt{1100101},224)$.
FIPS 202 section 5.2 defines
$\operatorname{KECCAK}[c]=\operatorname{SPONGE}[\operatorname{KECCAK-}p[1600,24],\operatorname{pad10*1},1600-c]$
so what is computed is $\operatorname{SPONGE}[\operatorname{KECCAK-}p[1600,24],\operatorname{pad10*1},1152](\mathtt{1100101},224)$.
FIPS 202 section 4 algorithm 8 defines computing $\operatorname{SPONGE}[f,\operatorname{pad},r](N,d)$ to be (with parameter $b$ equal to the bit size of the input and output of $f$)
1. Let $P=N\mathbin\|\operatorname{pad}(r,\operatorname{len}(N))$.
2. Let $n=\operatorname{len}(P)/r$.
3. Let $c=b-r$.
4. Let $P_0,\ldots,P_{n-1}$ be the unique sequence of strings of length $r$ such that $P=P_0\mathbin\|\ldots\mathbin\|P_{n-1}$.
5. Let $S=\mathtt0^b$.
6. For $i$ from $0$ to $n-1$, let $S=f(S\oplus(P_i\mathbin\|\mathtt0^c))$.
7. Let $Z$ be the empty string.
8. Let $Z=Z\mathbin\|\operatorname{Trunc}_r(S)$.
9. If $d\le|Z|$, then return $\operatorname{Trunc}_d(Z)$; else continue.
10. Let $S=f(S)$, and continue with Step 8.
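Steps 2 to 10 of Algorithm 8 can be sketched in C as follows. This is only an illustration under stated assumptions, not the code discussed later in this answer: it stores one bit per `uint8_t` for readability, it takes the already padded string $P$ (the result of step 1) as input, and the names `sponge_bits`, `perm_fn` and `toy_rotate` are made up for this example. `toy_rotate` is a stand-in that merely exercises the control flow; it has none of the properties of $\operatorname{KECCAK-}p$.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* f: in-place transformation of a b-bit state, one bit (0 or 1) per uint8_t. */
typedef void (*perm_fn)(uint8_t *state, size_t b);

/* Steps 2-10 of Algorithm 8 on an already padded bit string P of len_p bits.
   Writes d output bits to out; returns 0 if OK, 1 on parameter or
   allocation error. (Hypothetical helper, named for this sketch only.) */
int sponge_bits(perm_fn f, size_t b, size_t r,
                const uint8_t *P, size_t len_p, uint8_t *out, size_t d)
{
    assert(f != NULL);
    if (r == 0 || r >= b || len_p == 0 || len_p % r != 0) return 1;
    uint8_t *S = calloc(b, 1);                 /* step 5: S = 0^b */
    if (S == NULL) return 1;
    size_t n = len_p / r;                      /* step 2 */
    for (size_t i = 0; i < n; i++) {           /* step 6: S = f(S xor (P_i || 0^c)) */
        for (size_t k = 0; k < r; k++) S[k] ^= P[i * r + k];
        f(S, b);
    }
    size_t got = 0;                            /* step 7: Z is empty */
    for (;;) {
        for (size_t k = 0; k < r && got < d; k++)
            out[got++] = S[k];                 /* step 8, cut at d bits (step 9) */
        if (got >= d) break;
        f(S, b);                               /* step 10 */
    }
    free(S);
    return 0;
}

/* Toy stand-in for f (NOT Keccak-p): rotate the state left by one bit. */
void toy_rotate(uint8_t *state, size_t b)
{
    uint8_t first = state[0];
    memmove(state, state + 1, b - 1);
    state[b - 1] = first;
}
```

With $b=1600$, $r=1152$, $d=224$ and the real permutation in place of `toy_rotate`, the absorbing loop runs once for the message considered here, and the squeezing loop never needs a second call to $f$.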
thus step 1 computes $P=\mathtt{1100101}\mathbin\|\operatorname{pad10*1}(1152,7)$.
FIPS 202 section 5.1 algorithm 9 defines computing $\operatorname{pad10*1}(x,m)$ to be
- Let $j=(-m-2)\bmod x$
- Return $P=\mathtt{1}\mathbin\|\mathtt{0}^j\mathbin\|\mathtt{1}$
thus $P=\mathtt{1100101}\mathbin\|\mathtt{1}\mathbin\|\mathtt{0}^j\mathbin\|\mathtt{1}$ with $j=(-7-2)\bmod1152=1152-7-2=1143$, that is $P=\mathtt{11001011\underbrace{000000\ldots000}_{1143\;\mathrm{ bits}}1}$.
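The padding rule is small enough to check mechanically. Below is a minimal C sketch of Algorithm 9, again with one bit per `uint8_t`; the function names `pad10star1_zeros` and `pad10star1` are chosen for this example and do not come from FIPS 202 or from any library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Number of zero bits j in pad10*1(x, m): j = (-m - 2) mod x, computed
   without negative intermediates. */
size_t pad10star1_zeros(size_t x, size_t m)
{
    assert(x > 0);
    return (x - (m + 2) % x) % x;
}

/* Writes pad10*1(x, m) = 1 || 0^j || 1 to out, one bit per uint8_t, and
   returns its length j + 2. Pass out = NULL to get only the length. */
size_t pad10star1(size_t x, size_t m, uint8_t *out)
{
    size_t j = pad10star1_zeros(x, m);
    if (out != NULL) {
        out[0] = 1;
        for (size_t k = 1; k <= j; k++) out[k] = 0;
        out[j + 1] = 1;
    }
    return j + 2;
}
```

For $x=1152$ and $m=7$ this gives $j=1143$ and a padding of $1145$ bits, so that $7+1145=1152$ is exactly one block.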
Continuing with Algorithm 8, we have $n=1$, $c=1600-1152=448$, $P_0=P$, and $S=\mathtt0^{1600}$ at step 5. Thus at the first (and only) evaluation of $f$ at step 6, the input is $\mathtt{11001011\underbrace{000000\ldots000}_{1143\;\mathrm{ bits}}1\underbrace{000000\ldots000}_{448\;\mathrm{ bits}}}$ or, if we make byte boundaries explicit,
$$\mathtt{11001011\,\underbrace{00000000\,00000000\ldots00000000}_{142\;\mathrm{ bytes}}\,00000001\,\underbrace{00000000\,00000000\ldots00000000}_{56\;\mathrm{ bytes}}}$$
At the first (and only) execution of step 8, $Z$ is the first $1152$ bits of the output of $f$, and at step 9 the result is the first $224$ bits of that.
If we assemble the bits into bytes per the little-endian convention and express each byte in big-endian hexadecimal (per FIPS 202 appendix B.1 Algorithm 11), we get that $\operatorname{SHA3-224}(M)$ for $M$ the 5-bit bitstring $\mathtt{11001}$ is the first $28$ bytes of the ($200$-byte) output of $\operatorname{KECCAK-}p[1600,24]$ for input the ($200$-byte) bytestring $\mathtt{D3\,\underbrace{00\,00\ldots00}_{142\;\mathrm{ bytes}}\,80\,\underbrace{00\,00\ldots00}_{56\;\mathrm{ bytes}}}$. This input matches the "complete padded msg" in the question and its reference.
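The bit-to-byte conversion can be illustrated with a short C sketch that builds the 1600-bit input bit by bit and packs it with the first bit of each group of eight as the least significant bit of the byte. The helper names `bits_to_bytes` and `example_block` are assumptions made for this example only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Pack nbits bits (one per uint8_t, nbits a multiple of 8) into bytes,
   the first bit of each group of 8 becoming the least significant bit. */
void bits_to_bytes(const uint8_t *bits, size_t nbits, uint8_t *bytes)
{
    assert(nbits % 8 == 0);
    memset(bytes, 0, nbits / 8);
    for (size_t k = 0; k < nbits; k++)
        bytes[k / 8] |= (uint8_t)((bits[k] & 1u) << (k % 8));
}

/* Build the 1600-bit input of f for SHA3-224 of the 5-bit message 11001. */
void example_block(uint8_t block[200])
{
    static const uint8_t m_and_suffix[7] = {1, 1, 0, 0, 1, 0, 1}; /* M || 01 */
    uint8_t bits[1600] = {0};
    memcpy(bits, m_and_suffix, 7);
    bits[7] = 1;          /* first 1 of pad10*1 */
    bits[1152 - 1] = 1;   /* last 1 of pad10*1: last bit of the rate */
    bits_to_bytes(bits, 1600, block);   /* the 448 capacity bits stay 0 */
}
```

The first eight bits $\mathtt{11001011}$ pack to $1+2+16+64+128=211=\mathtt{D3}$, and bit $1151=8\cdot143+7$ lands in the top bit of byte $143$, giving $\mathtt{80}$.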
Computing $\operatorname{KECCAK-}p[1600,24]$ is just as in standard $\operatorname{SHA3}$. However, implementations of $\operatorname{SHA3}$ and $\operatorname{SHAKE}$ usually do not give access to this function, much less support bit-sized messages. Adding that capability requires modifying the internals of a library. I don't see how that could be done cleanly on top of OpenSSL libraries, as attempted by the question's code.
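For readers who want to evaluate the permutation directly on the 200-byte input above, here is a compact C sketch of $\operatorname{KECCAK-}p[1600,24]$ on a byte array. It is written for brevity, not speed, and is not the code described below: the rotation offsets and round constants are generated on the fly (in the style of the Keccak team's compact reference code) rather than read from tables, and the names `keccak_p1600` and `sha3_224_of_11001` are assumptions of this example.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

static uint64_t rol64(uint64_t v, unsigned n)
{
    assert(n < 64);
    return n ? (v << n) | (v >> (64 - n)) : v;
}

/* One step of the 8-bit LFSR generating the round-constant bits
   (cf. FIPS 202 Algorithm 5); returns the output bit. */
static int lfsr_step(uint8_t *lfsr)
{
    int bit = *lfsr & 1;
    *lfsr = (uint8_t)((*lfsr & 0x80) ? ((*lfsr << 1) ^ 0x71) : (*lfsr << 1));
    return bit;
}

/* KECCAK-p[1600,24] in place on a 200-byte state; lane (x,y) is the
   little-endian 64-bit word at byte offset 8*(x+5*y). */
void keccak_p1600(uint8_t state[200])
{
    uint64_t A[25], C[5], D, cur, tmp;
    uint8_t lfsr = 0x01;
    unsigned x, y, t, j;

    for (x = 0; x < 25; x++) {                 /* load lanes, little-endian */
        A[x] = 0;
        for (j = 0; j < 8; j++) A[x] |= (uint64_t)state[8 * x + j] << (8 * j);
    }
    for (unsigned round = 0; round < 24; round++) {
        /* theta */
        for (x = 0; x < 5; x++)
            C[x] = A[x] ^ A[x + 5] ^ A[x + 10] ^ A[x + 15] ^ A[x + 20];
        for (x = 0; x < 5; x++) {
            D = C[(x + 4) % 5] ^ rol64(C[(x + 1) % 5], 1);
            for (y = 0; y < 25; y += 5) A[x + y] ^= D;
        }
        /* rho and pi, following the lane cycle that starts at (1,0) */
        x = 1; y = 0; cur = A[1];
        for (t = 0; t < 24; t++) {
            unsigned rot = ((t + 1) * (t + 2) / 2) % 64;
            unsigned ny = (2 * x + 3 * y) % 5;
            x = y; y = ny;
            tmp = A[x + 5 * y];
            A[x + 5 * y] = rol64(cur, rot);
            cur = tmp;
        }
        /* chi */
        for (y = 0; y < 25; y += 5) {
            for (x = 0; x < 5; x++) C[x] = A[x + y];
            for (x = 0; x < 5; x++)
                A[x + y] = C[x] ^ (~C[(x + 1) % 5] & C[(x + 2) % 5]);
        }
        /* iota: round-constant bits go to positions 2^j - 1 of lane (0,0) */
        for (j = 0; j < 7; j++)
            if (lfsr_step(&lfsr)) A[0] ^= (uint64_t)1 << ((1u << j) - 1);
    }
    for (x = 0; x < 25; x++)                   /* store lanes, little-endian */
        for (j = 0; j < 8; j++) state[8 * x + j] = (uint8_t)(A[x] >> (8 * j));
}

/* SHA3-224 of the 5-bit message 11001, using the 200-byte input above. */
void sha3_224_of_11001(uint8_t digest[28])
{
    uint8_t state[200] = {0};
    state[0] = 0xD3;      /* bits 11001 || 01 || 1 */
    state[143] = 0x80;    /* final 1 of pad10*1, last bit of the 1152-bit rate */
    keccak_p1600(state);
    memcpy(digest, state, 28);   /* Trunc_224 of the output */
}
```

A simple sanity check for such a sketch is the empty message: with `state[0] = 0x06` and `state[143] = 0x80` (or `state[135] = 0x80` for the 1088-bit rate of SHA3-256), the first bytes of the output should equal what any standard SHA-3 library returns for an empty input.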
I wrote code (source that can be run online) for SHA3-224, SHA3-256, SHA3-384 and SHA3-512, with bit-sized message support. Here is the interface:
    // Compute SHA3-<ol>, return 0 iff OK, 1 iff parameter error
    // Supports message size in bits (if il%8!=0, only the low il%8 bits of the last byte are hashed)
    int sha3(
        uint8_t* op, int ol,           // output, and its length in bits among 224, 256, 384, 512
        const uint8_t* ip, int64_t il  // input, and its length in bits
    );
The single function is ~90 lines of portable C code, half of which is the Keccak permutation. It's optimized for concision, except that a permutation round is unrolled to keep it fast. There's test code for 7 example messages for each of the 4 hashes, including the messages linked in the question, similar messages by NIST, and the message in a comment in the question's code. Results match the 25 Known Answer Tests in these NIST references and the question.