I am programming an embedded chip that has a hardware 16-bit CRC module. I have to protect some data bytes $d_0,d_1,...,d_{n-1}$ against corruption caused by sudden loss of power; a 32-bit CRC would provide the level of protection that I need, but the chip doesn't have that capability.
Unfortunately the CRC module uses a hard-coded polynomial that can't be changed. And I need the computation to be as fast as possible. Perhaps a table-based CRC algorithm would be fast enough, but the table would occupy 1 kilobyte of ROM, which I want to avoid if I can.
So my question is this: Can I somehow use this hardware CRC-16 module to reach the level of protection that I need?
My current idea is to compute $CRC_{\mathrm{forward}}$, the CRC-16 of the data bytes $d_0,d_1,...,d_{n-1}$, and $CRC_{\mathrm{backward}}$, the CRC-16 of the data bytes $d_{n-1},d_{n-2},...,d_0$, and simply concatenate these two 16-bit CRCs to create a 32-bit EDC. Is this sufficient? Or are these two CRCs in some sense dependent?
More precisely: Suppose $D$ is the data, and $E$ is a corrupted version with $E \ne D$. Then I want the event $CRC_{\mathrm{forward}}(E) = CRC_{\mathrm{forward}}(D)$ to be independent of the event $CRC_{\mathrm{backward}}(E) = CRC_{\mathrm{backward}}(D)$, so that the probability of both checks missing the corruption is the product of the two individual probabilities. Is this the case here, or have I missed something?
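To make the idea concrete, here is a minimal sketch of the forward/backward scheme. The software routine `crc16` only stands in for the hardware module; its polynomial (`0x1021`) and initial value (`0xFFFF`) are assumptions chosen for illustration, since the chip's actual polynomial isn't stated here. The names `crc16` and `edc32` are likewise hypothetical.

```python
def crc16(data: bytes, poly: int = 0x1021, init: int = 0xFFFF) -> int:
    """Bitwise CRC-16, MSB first, no reflection, no final XOR.

    Stand-in for the hardware CRC module; poly and init are assumed values.
    """
    crc = init
    for byte in data:
        crc ^= byte << 8              # feed the next byte into the top of the register
        for _ in range(8):
            if crc & 0x8000:          # top bit set: shift and subtract the polynomial
                crc = ((crc << 1) ^ poly) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def edc32(data: bytes) -> int:
    """Concatenate CRC-16 of the data and CRC-16 of the byte-reversed data."""
    forward = crc16(data)             # d_0, d_1, ..., d_{n-1}
    backward = crc16(data[::-1])      # d_{n-1}, d_{n-2}, ..., d_0
    return (forward << 16) | backward


if __name__ == "__main__":
    print(hex(edc32(b"some data bytes")))
```

On the real chip, both calls to `crc16` would be replaced by feeding the bytes to the hardware module in forward and in reverse order.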
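Edited to add: A CRC is linear over GF(2), so whether two equal-length messages have the same CRC depends only on their difference $E \oplus D$. If the discrepancy $E \oplus D$ is palindromic (it reads the same with its bytes in reverse order), and $CRC_{\mathrm{forward}}(E) = CRC_{\mathrm{forward}}(D)$, then $CRC_{\mathrm{backward}}(E) = CRC_{\mathrm{backward}}(D)$ as well. So this idea doesn't work. Does anybody have a better idea?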
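The palindrome problem can be demonstrated with a short sketch. It again uses a software `crc16` with an assumed polynomial (`0x1021`) and initial value (`0xFFFF`) in place of the hardware module, and a hypothetical helper `find_palindromic_delta` that uses the linearity of the CRC to find a nonzero byte-palindromic difference whose CRC (with zero initial value) is zero. Any 6-byte message XORed with that difference then collides in both the forward and the backward CRC.

```python
def crc16(data: bytes, poly: int = 0x1021, init: int = 0xFFFF) -> int:
    """Bitwise CRC-16, MSB first; poly and init are assumed values."""
    crc = init
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ poly) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def find_palindromic_delta(half_len: int = 3):
    """Find a nonzero byte palindrome of 2*half_len bytes with zero CRC (init=0).

    Each bit of the first half defines one basis palindrome. Their 16-bit CRCs
    are reduced against an XOR basis; as soon as one reduces to zero, the
    recorded combination of basis palindromes has CRC zero.
    """
    basis = {}  # leading bit position -> (crc value, bitmask of combined basis vectors)
    for i in range(8 * half_len):
        half = (1 << i).to_bytes(half_len, "big")
        syn, combo = crc16(half + half[::-1], init=0), 1 << i
        while syn:
            top = syn.bit_length() - 1
            if top not in basis:
                basis[top] = (syn, combo)   # new independent CRC value
                break
            bsyn, bcombo = basis[top]
            syn ^= bsyn
            combo ^= bcombo
        else:
            # syn reached zero: this combination is an undetected difference
            half = combo.to_bytes(half_len, "big")
            return half + half[::-1]
    return None


if __name__ == "__main__":
    delta = find_palindromic_delta()
    D = bytes([0x12, 0x34, 0x56, 0x78, 0x9A, 0xBC])
    E = bytes(x ^ y for x, y in zip(D, delta))
    print("delta:", delta.hex())
    print("forward collide: ", crc16(E) == crc16(D))
    print("backward collide:", crc16(E[::-1]) == crc16(D[::-1]))
```

With `half_len=3` there are 24 basis palindromes but only 16 CRC bits, so a dependency must turn up after at most 17 of them, whatever the polynomial is.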