Given one has a cyclic code, how would you deduplicate the other orientations of the codeword in a systematic way?

Question

I've been recently working with Reed-Solomon codes and wanted to make use of their cyclic properties to uniquely identify something regardless of where the reading of the code started symbol-wise. Is there a way of mapping the unique necklace of symbols to a number such that the alternate Reed-Solomon codewords that correspond to that necklace all collapse to that same output number? What would this be called if there is existing research on it?

Sorry, I'm primarily a software guy so my description of the problem might not be the best.

Here's a worked example of what I'm doing if it helps. I have a BCH view Reed-Solomon code constructed over GF(16) with fcr=2, primitive polynomial 10011, and systematic encoding. This code is cyclic with 15 symbols. I'll call this 15-phase, although I don't know if this is a proper term for this. Since 15 has factors 3 and 5, I can easily define variants that are 3-phase and 5-phase with less than the full number of symbols. For my use-case I'm focusing on the 3-phase ones.

With this information, I now have a cyclic code constructed of 3 symbols with 2 data symbols and 1 check symbol, which is effectively those same symbols at evenly spaced intervals with padding between them when viewed as a Reed-Solomon codeword.

If I encode the symbols 0, 1 (by inserting padding and running them through a Reed-Solomon encoder), the check symbol appended is 6. If I encode the symbols 1, 6, the check symbol appended is 0, and likewise for 6, 0 the result is 1. Is there then a way to map down from the codewords 0,1,6; 1,6,0; and 6,0,1 to a number unique to the necklace? Failing that, is there a way to algorithmically partition the codewords such that no partition contains any of the alternate rotations of its constituent codewords, with the exception of ones whose alternate rotations are the same codeword?

Please check out whether I interpreted your question correctly. For computer testing you may want to look at the next to last paragraph of my answer alone :-) — Jyrki Lahtonen, Nov 05 '23 at 08:51
Near as I can tell, yes you did though my own purely self-taught understanding of the math behind it isn't quite enough to fully understand your answer. For example I didn't realize that the third root of unity had any special properties. The last paragraph checks out at any rate and lines up with my observations. Guess I'll have to study up some more and maybe post a few more questions. Thanks! — Curtis, Nov 05 '23 at 17:00
I understand now why the third root of unity was important and now I'm properly able to encode and decode all 80 orientable necklaces and get back the orientation data with it. Your help was just what I needed. Thank you so much! — Curtis, Nov 06 '23 at 03:19

Jyrki Lahtonen · Accepted Answer · 2023-11-05T14:54:06.000

I think I managed to work out what the question is. I first need to translate it into the language of algebra. IMHO this is necessary for otherwise I cannot formulate an answer. In the end I make an attempt to remove as much of the algebra as possible.

The OP is using the variant of the field $GF(16)$ gotten from the prime field $GF(2)$ by adjoining a root $\gamma$ of the primitive polynomial $f(x)=x^4+x+1$. In other words, the arithmetic is determined by usual algebraic rules and the equations $1+1=0$ and $\gamma^4=\gamma+1$. The elements of the field $GF(16)$ should then be viewed as polynomials of $\gamma$ of the form $$z=b_0+b_1\gamma+b_2\gamma^2+b_3\gamma^3$$ with the coefficients $b_0,b_1,b_2,b_3$ independently ranging over $GF(2)=\{0,1\}$. In this old answer of mine I describe how all the non-zero elements of $\Bbb{F}_{16}=GF(16)$ can be written as powers of $\gamma$, a manifestation of the fact that $f(x)$ is a primitive polynomial.

In many computer implementations of the arithmetic of $GF(16)$ the element $z$ above is stored as a string of four bits like $$ z=b_3b_2b_1b_0. $$ So the multiplicative identity $1$ is stored as $0001$, the element $\gamma$ as $0010$, $\gamma^2$ as $0100$ et cetera. This has the huge practical benefit that addition in the field $GF(16)$ amounts to bitwise XOR of those strings of bits. Because a string of four bits is also used to represent integers in the range $[0,15]$, many programmers are wont to refer to the elements of $GF(16)$ as integers in this range also. So for them $\gamma=0010=2$, $\gamma^2=0100=4$ etc.

In what follows I occasionally use the table of powers of $\gamma$ in the linked answer for quick arithmetic of $GF(16)$ — think of that table as a 2-way look-up-table (that's how I used to write all my code around smallish fields of characteristic two).
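For readers who prefer code, here is a minimal sketch of such a 2-way look-up table in Python. The names `EXP`, `LOG`, `gf_add` and `gf_mul` are my own choices for illustration, not taken from any particular library.

```python
# Build exp/log tables for GF(16) with gamma^4 = gamma + 1 (polynomial 0b10011).
PRIM_POLY = 0b10011

EXP = [0] * 15   # EXP[j] = gamma^j stored as the 4-bit integer b3 b2 b1 b0
LOG = {}         # LOG[z] = j such that gamma^j = z, for z != 0
z = 1
for j in range(15):
    EXP[j] = z
    LOG[z] = j
    z <<= 1              # multiply by gamma
    if z & 0b10000:      # reduce with gamma^4 = gamma + 1
        z ^= PRIM_POLY

def gf_add(x, y):
    """Addition in GF(16) is bitwise XOR."""
    return x ^ y

def gf_mul(x, y):
    """Multiply by adding discrete logarithms modulo 15."""
    if x == 0 or y == 0:
        return 0
    return EXP[(LOG[x] + LOG[y]) % 15]

print(EXP)  # [1, 2, 4, 8, 3, 6, 12, 11, 5, 10, 7, 14, 15, 13, 9]
```

The table lists every non-zero element exactly once, which is what primitivity of $f(x)$ means in practice.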

I next specify the element $\omega$: $$\omega:=\gamma^5=\gamma\cdot\gamma^4=\gamma(\gamma+1)=\gamma^2+\gamma=0110=6$$ that the definition of OP's code will absolutely need. Some key properties of $\omega$ are consequences of the fact that $$ \omega^3=(\gamma^5)^3=\gamma^{15}=1, $$ in other words, $\omega$ is a third root of unity in $GF(16)$. Because $\omega^3-1=(\omega-1)(\omega^2+\omega+1)$ and $\omega\neq1$, it satisfies the equation $\omega^2+\omega+1=0$ shared by all non-trivial third roots of unity. Consequently $$\omega^2=-\omega-1=\omega+1=0110+0001=0111=7.$$
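These identities are easy to confirm numerically. The following is a small sketch using a shift-and-XOR multiplication; the helper name `gf16_mul` is an assumption of mine, not a library function.

```python
def gf16_mul(x, y, poly=0b10011):
    """Shift-and-XOR multiplication in GF(16), reducing with gamma^4 = gamma + 1."""
    r = 0
    while y:
        if y & 1:
            r ^= x           # add the current multiple of x
        y >>= 1
        x <<= 1              # multiply x by gamma
        if x & 0b10000:
            x ^= poly        # reduce modulo the primitive polynomial
    return r

gamma = 0b0010
omega = 1
for _ in range(5):           # omega = gamma^5
    omega = gf16_mul(omega, gamma)
omega2 = gf16_mul(omega, omega)
omega3 = gf16_mul(omega2, omega)

print(omega, omega2, omega3)   # 6 7 1
print(omega2 ^ omega ^ 1)      # 0, i.e. omega^2 + omega + 1 = 0
```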

Using a third root of unity we can construct cyclic codes of length three, which is what the OP seems to be using. They seem to be using the 2-dimensional RS-code $$ \mathcal{C}=\{(c_0,c_1,c_2)\in GF(16)^3\mid c_0+\omega c_1+ \omega^2c_2=0\}. $$ That is a code defined by the check matrix $H=(1,\omega,\omega^2)=(1,6,7)$. The Reed-Solomon codes have several equivalent definitions. I will use the one where we produce the codewords by evaluating low degree polynomials at the elements of a cyclic subgroup of $GF(16)^*$. We get the code $\mathcal{C}$ by evaluating at most linear polynomials $m(x)=ax+b$ at the elements of the subgroup $\mu_3:=\{1,\omega,\omega^2\}\le GF(16)^*,$ so $$ \mathcal{C}=\{(m(1),m(\omega),m(\omega^2))\mid m(x)=ax+b, a,b\in GF(16)\}. $$ It is easy to see that the two descriptions of the code $\mathcal{C}$ agree. Because $1+\omega+\omega^2=0$ it follows that with $m(x)=ax+b$ $$ m(1)+\omega m(\omega)+\omega^2 m(\omega^2)=b(1+\omega+\omega^2)+a(1+\omega^2+\omega^4)=0+0=0, $$ giving one inclusion. The reverse inclusion follows from the fact that both definitions describe a 2-dimensional subspace of $GF(16)^3$.

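For example, the polynomial $m(x)=\omega^2+\omega x$ yields the codeword $$ \begin{aligned} (m(1),m(\omega),m(\omega^2))&=(\omega^2+\omega,\omega^2+\omega^2,\omega^2+\omega^3)\\ &=(1,0,\omega^2+1)\\ &=(1,0,\omega)\\ &=(1,0,6), \end{aligned} $$ which is the OP's example codeword $6,0,1$ read in reverse order. The OP's symbol order corresponds to swapping the roles of $\omega$ and $\omega^2$, which does not affect the argument below.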
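The agreement of the two descriptions of $\mathcal{C}$, and the example codeword, can also be checked by brute force. This is only a sketch: `gf16_mul` and `encode` are hypothetical helper names, and the integers 6 and 7 stand for $\omega$ and $\omega^2$ as above.

```python
def gf16_mul(x, y, poly=0b10011):
    """Shift-and-XOR multiplication in GF(16)."""
    r = 0
    while y:
        if y & 1:
            r ^= x
        y >>= 1
        x <<= 1
        if x & 0b10000:
            x ^= poly
    return r

OMEGA, OMEGA2 = 6, 7

def encode(a, b):
    """Evaluate m(x) = a*x + b at 1, omega, omega^2."""
    return (a ^ b, gf16_mul(a, OMEGA) ^ b, gf16_mul(a, OMEGA2) ^ b)

# Description 1: evaluations of all polynomials a*x + b.
by_evaluation = {encode(a, b) for a in range(16) for b in range(16)}

# Description 2: all triples killed by the check matrix H = (1, omega, omega^2).
by_check = {
    (c0, c1, c2)
    for c0 in range(16) for c1 in range(16) for c2 in range(16)
    if c0 ^ gf16_mul(OMEGA, c1) ^ gf16_mul(OMEGA2, c2) == 0
}

print(by_evaluation == by_check, len(by_check))  # True 256
print(encode(OMEGA, OMEGA2))                     # (1, 0, 6)
```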

The reason I brought up this alternative description is that it makes it easy to trace back the effect of cyclic shifts. If the polynomial $m(x)$ yields the codeword $(m(1),m(\omega),m(\omega^2))=(c_0,c_1,c_2)$, we see that the other linear polynomial $m(\omega^{-1}x)$ yields the codeword $$ (m(\omega^{-1}), m(1),m(\omega))=(m(\omega^2),m(1),m(\omega))=(c_2,c_0,c_1). $$ That is, the cyclic shift of the codeword we started with. All because $\omega^3=1$.
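The same fact can be confirmed exhaustively. In this sketch (helper names `gf16_mul` and `encode` are again my own), replacing $a$ by $a\omega^{-1}=a\omega^2$ is checked to shift every codeword one step to the right.

```python
def gf16_mul(x, y, poly=0b10011):
    """Shift-and-XOR multiplication in GF(16)."""
    r = 0
    while y:
        if y & 1:
            r ^= x
        y >>= 1
        x <<= 1
        if x & 0b10000:
            x ^= poly
    return r

OMEGA, OMEGA2 = 6, 7   # omega and omega^2 = omega^(-1)

def encode(a, b):
    """Codeword (m(1), m(omega), m(omega^2)) for m(x) = a*x + b."""
    return (a ^ b, gf16_mul(a, OMEGA) ^ b, gf16_mul(a, OMEGA2) ^ b)

def shift_right(c):
    """(c0, c1, c2) -> (c2, c0, c1)."""
    return (c[2], c[0], c[1])

# m(omega^(-1) x) has linear coefficient a * omega^2 and the same constant term b.
ok = all(
    encode(gf16_mul(a, OMEGA2), b) == shift_right(encode(a, b))
    for a in range(16) for b in range(16)
)
print(ok)  # True
```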

So in terms of associating codewords with polynomials we see that the cyclic shifts of $ax+b$ correspond to $a\omega^{-1}x+b=a\omega^2x+b$ and $a\omega x+b$. This, at long last, allows us to answer the OP's question of how to avoid cyclically shifted versions of the codewords. We see that all the cyclic shifts of $ax+b$ share the same constant term $b$, so we can choose that freely. But we need to constrain the coefficient $a$ of the linear term in such a way that if $a$ is allowed, then both $a\omega$ and $a\omega^2$ need to be disallowed. We make an exception to this by allowing $a=0$, because the corresponding codewords simply repeat the same symbol, and are immune to cyclic shifts. A way to achieve this is to select $a$ from the subset $$S=\{0,1,\gamma,\gamma^2,\gamma^3,\gamma^4=\gamma+1\}.$$ With $a\in S\setminus\{0\}$ the coefficient $a\omega$ covers the range $\gamma^j$, $5\le j<10$ and the coefficient $a\omega^2$ the range $\gamma^j$, $10\le j<15$, and between them they cover all the choices.

All of the above translates into the following simple description of representatives of the words of this code. We can use the usual generator matrix $$ G=\left(\begin{array}{ccc}1&1&1\\ 1&\omega&\omega^2\end{array}\right), $$ but while we can choose the multiplier of the first row freely from $GF(16)$, the multiplier of the latter row must be constrained to come from the set $$ S=\{0,1,\gamma,\gamma^2,\gamma^3,\gamma^4\}=\{0,1,2,4,8,3\}. $$ In the "integer" representation we needed $\gamma^4=\gamma+1=0011=3$.
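To turn this into the number the OP asked for, one can recover $a$ and $b$ from a codeword and split the discrete logarithm of $a$ into a part modulo 5 (the necklace) and a quotient (the rotation). The sketch below works in the symbol order of the code $\mathcal{C}$ as defined here, with $H=(1,\omega,\omega^2)$. The numbering scheme and the names `necklace_id` and `from_id` are my own assumptions, one of many possible choices. It uses $b=c_0+c_1+c_2$ and $a=c_1+c_2$, both of which follow from $1+\omega+\omega^2=0$.

```python
# Exp/log tables for GF(16) with gamma^4 = gamma + 1.
EXP, LOG = [], {}
z = 1
for j in range(15):
    EXP.append(z)
    LOG[z] = j
    z <<= 1
    if z & 0b10000:
        z ^= 0b10011

def gf16_mul(x, y):
    return 0 if 0 in (x, y) else EXP[(LOG[x] + LOG[y]) % 15]

def encode(a, b):
    """Codeword (m(1), m(omega), m(omega^2)) with omega = 6, omega^2 = 7."""
    return (a ^ b, gf16_mul(a, 6) ^ b, gf16_mul(a, 7) ^ b)

def necklace_id(c):
    """Return (id, rotation); id is in 0..95 and shared by all rotations of c."""
    c0, c1, c2 = c
    if c0 ^ gf16_mul(6, c1) ^ gf16_mul(7, c2):
        raise ValueError("not a codeword")
    b = c0 ^ c1 ^ c2           # constant term of m(x)
    a = c1 ^ c2                # linear coefficient of m(x)
    if a == 0:
        return b, 0            # constant word, immune to rotation
    k = LOG[a]                 # a = gamma^(k % 5) * omega^(k // 5)
    return 16 + 16 * (k % 5) + b, k // 5

def from_id(n, rotation=0):
    """Inverse of necklace_id: rebuild the codeword from id and rotation."""
    if n < 16:
        return (n, n, n)
    j, b = divmod(n - 16, 16)
    return encode(EXP[j + 5 * rotation], b)

for word in [(6, 1, 0), (1, 0, 6), (0, 6, 1)]:
    print(word, necklace_id(word))
# (6, 1, 0) (23, 0)
# (1, 0, 6) (23, 1)
# (0, 6, 1) (23, 2)
```

Here the rotation counts how many steps the word is rotated to the left relative to the representative whose coefficient $a$ lies in $S$.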

As a final reality check let's carry out a census. The code has $16^2=256$ codewords. Sixteen of them simply repeat the same symbol thrice (corresponding to the choice $0\in S$). The remaining $256-16=240$ are divided into $80=240/3$ groups as they have three cyclic shifts each. The $80$ representatives of those groups come from choosing the multiplier of the first row of $G$ in any of sixteen ways, while the multiplier of the second row has to be one of the five non-zero elements of $S$. $80=16\cdot5$, so this checks out.
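The census can be repeated by computer. This is a brute-force sketch under the same conventions as above ($\omega=6$, $\omega^2=7$); the helper names are mine.

```python
def gf16_mul(x, y, poly=0b10011):
    """Shift-and-XOR multiplication in GF(16)."""
    r = 0
    while y:
        if y & 1:
            r ^= x
        y >>= 1
        x <<= 1
        if x & 0b10000:
            x ^= poly
    return r

def encode(a, b):
    """Codeword (m(1), m(omega), m(omega^2)) for m(x) = a*x + b."""
    return (a ^ b, gf16_mul(a, 6) ^ b, gf16_mul(a, 7) ^ b)

def rotations(c):
    """All cyclic shifts of a length-3 word, as a frozenset."""
    return frozenset({c, (c[1], c[2], c[0]), (c[2], c[0], c[1])})

S_NONZERO = [1, 2, 4, 8, 3]   # gamma^0 .. gamma^4 in integer form

code = {encode(a, b) for a in range(16) for b in range(16)}
classes = {rotations(c) for c in code}
sizes = [len(k) for k in classes]

# One representative per non-constant necklace: a in S \ {0}, b free.
rep_classes = {rotations(encode(a, b)) for a in S_NONZERO for b in range(16)}

print(len(code), len(classes), sizes.count(1), sizes.count(3), len(rep_classes))
# 256 96 16 80 80
```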

Hi there. I have one unanswered question. Would you like to see this question: https://math.stackexchange.com/questions/2531008/subextension-of-a-field-with-galois-series-of-subextensions-of-prime-degree Thank you very much for your help! — Hermi, Nov 15 '23 at 03:34
