Why is this code uniquely decodable?

Question

Source alphabet: $\{a, b, c, d, e, f\}$

Code alphabet: $\{0, 1\}$

$a\colon 0101$
$b\colon 1001$
$c\colon 10$
$d\colon 000$
$e\colon 11$
$f\colon 100$

I thought that for a code to be uniquely decodable, it had to be prefix-free. But in this code the codeword for $c$ is a prefix of the codeword for $f$, for example, so it is not prefix-free. However, my textbook tells me that its reverse is prefix-free (I don't understand this), and therefore it is uniquely decodable. Can someone explain what this means, or why it is uniquely decodable? I know it satisfies Kraft's inequality, but that is only a necessary condition, not a sufficient condition.

Prefix-free implies uniquely decodable, but it is not an "if and only if" statement. See, for example, here. — dkaeae, Mar 03 '19 at 13:46
Okay, I see, but my textbook says this: "Code A is uniquely decodable since its reverse is prefix-free, so uniquely decodable."
Do you understand what they mean by its reverse? — 2000mroliver, Mar 03 '19 at 13:47
Probably simply the code obtained by reversing all codewords. — dkaeae, Mar 03 '19 at 13:47
You can decode it by running the normal decoding algorithm, but going backwards through the string. — RemcoGerlich, Mar 04 '19 at 09:04
c may be a prefix of b and f, but the suffixes that are left over don't exist in the code. When you reverse the code, suffixes become prefixes, and then it becomes prefix-free. — Barmar, Mar 04 '19 at 17:12

Yuval Filmus · Accepted Answer · 2019-03-03T14:09:28.913

Your code has the property that if you reverse all codewords, then you get a prefix code. This implies that your code is uniquely decodable.

Indeed, consider any code $C = x_1,\ldots,x_n$ whose reverse $C^R := x_1^R,\ldots,x_n^R$ is uniquely decodable. I claim that $C$ is also uniquely decodable. This is because $$ w = x_{i_1} \ldots x_{i_m} \text{ if and only if } w^R = x_{i_m}^R \ldots x_{i_1}^R. $$ In words, decompositions of $w$ into codewords of $C$ are in one-to-one correspondence with decompositions of $w^R$ into codewords of $C^R$. Since the latter are unique, so are the former.

Since prefix codes are uniquely decodable, it follows that the reverse of a prefix code is also uniquely decodable. This is the case in your example.
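To make the claim concrete, here is a minimal Python sketch that checks the prefix-free property for the code in the question and for its reverse. The names `is_prefix_free`, `code` and `reversed_code` are my own choices for illustration, not anything from the textbook.

```python
def is_prefix_free(codewords):
    """Return True if no codeword is a prefix of a different codeword."""
    words = list(codewords)
    return not any(
        u != v and v.startswith(u) for u in words for v in words
    )

# The code from the question.
code = {"a": "0101", "b": "1001", "c": "10", "d": "000", "e": "11", "f": "100"}

# Reverse every codeword individually.
reversed_code = {symbol: word[::-1] for symbol, word in code.items()}

print(is_prefix_free(code.values()))           # False: "10" is a prefix of "100"
print(is_prefix_free(reversed_code.values()))  # True
```

The quadratic loop is deliberately naive; for six codewords there is no reason to do anything smarter.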

The McMillan inequality states that if $C$ is uniquely decodable then $$ \sum_{i=1}^n 2^{-|x_i|} \leq 1. $$ In other words, a uniquely decodable code satisfies Kraft's inequality. Conversely, any codeword lengths satisfying Kraft's inequality are realized by some prefix code. Therefore, if all you're interested in is minimizing the expected codeword length, there is no reason to look beyond prefix codes.
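As a quick sanity check, the sum in the inequality can be evaluated exactly for the code in the question. This is only a sketch; `kraft_sum` is a name I made up, and it assumes a binary code alphabet as in the question.

```python
from fractions import Fraction

def kraft_sum(codewords):
    """Sum of 2^(-length) over all codewords, as an exact fraction."""
    return sum(Fraction(1, 2 ** len(word)) for word in codewords)

# Codewords for a, b, c, d, e, f from the question.
codewords = ["0101", "1001", "10", "000", "11", "100"]

print(kraft_sum(codewords))       # 7/8
print(kraft_sum(codewords) <= 1)  # True
```

Passing this check does not by itself prove unique decodability; it is only the necessary condition.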

Sam Roweis gives in his slides a nice example of a uniquely decodable code which is neither a prefix code nor the reverse of a prefix code: $$ 0,01,110. $$ In order to show that this code is uniquely decodable, it suffices to show how to decode the first codeword of a word. If the word starts with a $1$, then the first codeword is $110$. If it is of the form $01^*$, then it must be either $0$ or $01$. Otherwise, there must be a prefix of the form $01^*0$. We now distinguish several cases:

$$ \begin{array}{c|cccc} \text{prefix} & 00 & 010 & 0110 & 01110 \\\hline \text{codeword} & 0 & 01 & 0 & 01 \end{array} $$ Longer runs of $1$ cannot be decoded at all.
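If you want to convince yourself experimentally, the following Python sketch counts the number of ways a word splits into codewords and tries every binary word up to a chosen length. This is a brute-force check of my own, not the case analysis above, and a bounded search like this is evidence rather than a proof. The names `count_parsings` and `max_parsings` are hypothetical.

```python
from itertools import product

def count_parsings(word, codewords):
    """Count the ways to write `word` as a concatenation of codewords."""
    # ways[i] = number of ways to split word[:i] into codewords
    ways = [1] + [0] * len(word)
    for i in range(1, len(word) + 1):
        for c in codewords:
            if i >= len(c) and word[i - len(c):i] == c:
                ways[i] += ways[i - len(c)]
    return ways[len(word)]

def max_parsings(codewords, max_len):
    """Largest number of parsings over all binary words of length 1..max_len."""
    return max(
        count_parsings("".join(bits), codewords)
        for n in range(1, max_len + 1)
        for bits in product("01", repeat=n)
    )

roweis = ["0", "01", "110"]
print(max_parsings(roweis, 12))            # 1: no word has two parsings
print(count_parsings("011110", roweis))    # 0: a run of four 1s cannot be decoded
print(count_parsings("010", ["0", "01", "10"]))  # 2: this other code is ambiguous
```

The last line uses a different code, $0, 01, 10$, purely as a contrast: there $010$ splits both as $0 \cdot 10$ and as $01 \cdot 0$.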

It seems that in the OP's example we cannot decode the first codeword after a fixed number of digits; there are infinitely many cases: 1001010101010101… can be either fcccccc… or caaa…, and we might need to wait until the end of the input to decide. — Bergi, Mar 03 '19 at 21:58
@Bergi It is always decodable for any finite number of digits. There is always only one way to decode the encoding without any remainders. Any other attempt will end up with spare 1s or 0s. This is because the code is uniquely decodable if we read it tail first. In theory, if something is uniquely decodable in one direction, it makes no sense that there can be more than one solution in the other direction. — slebetman, Mar 04 '19 at 00:22
@slebetman I was referring to a finite prefix (with possible remainders). Yes, if we take the whole input it always is decodable. — Bergi, Mar 04 '19 at 11:40

score 5 · Answer 2 · answered Mar 04 '19 at 01:09

If I give you any message that you are supposed to decode, then you can do the following:

1. Reverse the message, starting with the last bit instead of the first bit.
2. Reverse the code words.
3. Decode the message.
4. Reverse the decoded string.

You can do that because after reversing the six code words, you get a prefix-free code: 1010, 1001, 01, 000, 11, 001 is prefix free.
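Here is a rough Python sketch of that procedure for the code in the question. The function name `decode` and the error handling are my own assumptions; the greedy matching is only valid because the reversed code is prefix-free.

```python
CODE = {"a": "0101", "b": "1001", "c": "10", "d": "000", "e": "11", "f": "100"}

def decode(bits, code=CODE):
    """Decode `bits` by reading it backwards against the reversed codewords."""
    # Map each reversed codeword back to its source symbol.
    reversed_table = {word[::-1]: symbol for symbol, word in code.items()}
    symbols = []
    buffer = ""
    for bit in reversed(bits):
        buffer += bit
        # Prefix-freeness of the reversed code makes the first match the only one.
        if buffer in reversed_table:
            symbols.append(reversed_table[buffer])
            buffer = ""
    if buffer:
        raise ValueError("input is not a concatenation of codewords")
    # Symbols were found last-to-first, so reverse them at the end.
    return "".join(reversed(symbols))

print(decode("1001010"))     # fcc
print(decode("1001010101"))  # caa
```

The two calls share the same first seven bits but decode to different first symbols, which is why decoding has to start from the end of the message.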

score 0 · Answer 3 · answered Mar 03 '19 at 23:50

If prefix-free means what I think, the reverse of ‘a’ starts with 1, or 10, or 101, none of which is any other whole valid code.

Therefore, if a message ends with 0101, it can only be an ‘a’ and you can apply similar logic to the preceding bit(s).

However, what if there is no end to start from? Well, if the first bit is 1, you know it isn’t ‘a’ or ‘d’. The second bit will eliminate ‘e’ or {‘b’,’c’,’f’}. The third bit might bring it down to one choice, but if not, it is unique by the fourth bit.

As soon as you get to a unique sequence, you restart the algorithm on the next bit.
