If "bit-serial and bitslice(d) are equated", the question is about what I'd call bytesliced AES, by analogy with bitsliced AES. That carries out $k$ simultaneous AES operations on a machine with (I'll assume, exactly) $8k$-bit words, and uses steps that compute on $k$ bytes in parallel. Another, slightly different, possibility is that the question is about a SIMD implementation of AES on hardware with $k$ byte-wide ALUs.
The $k$ input blocks of 16 bytes are split into 16 words, each concatenating the bytes of a given rank in the input blocks. As in bitsliced AES, ShiftRows thus reduces to selecting the appropriate word for the next step. AddRoundKey reduces to XOR with a word consisting of the same byte repeated $k$ times (assuming all $k$ instances use the same key). More generally, whenever AES prescribes an addition of bytes in $\mathbb F_{2^8}$, we can perform it for all $k$ AES instances with a single word XOR.
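The layout just described can be sketched in C as follows. This is a minimal illustration, assuming $k=8$ (64-bit words), the same round key for every block, and the usual AES convention that byte $i$ of a block sits at row $i \bmod 4$, column $\lfloor i/4 \rfloor$ of the state. The names `K`, `byteslice_pack`, `byteslice_unpack`, `shift_rows` and `add_round_key` are made up for this example.

```c
#include <stdint.h>

#define K 8  /* assumption: k = 8 blocks, packed into 64-bit words */

/* Slice K blocks into 16 words: byte lane j of s[i] holds byte i of block j. */
static void byteslice_pack(uint64_t s[16], const uint8_t blocks[K][16]) {
    for (int i = 0; i < 16; i++) {
        uint64_t w = 0;
        for (int j = 0; j < K; j++)
            w |= (uint64_t)blocks[j][i] << (8 * j);
        s[i] = w;
    }
}

/* Inverse of byteslice_pack. */
static void byteslice_unpack(uint8_t blocks[K][16], const uint64_t s[16]) {
    for (int i = 0; i < 16; i++)
        for (int j = 0; j < K; j++)
            blocks[j][i] = (uint8_t)(s[i] >> (8 * j));
}

/* ShiftRows is a pure selection of words: row r is rotated left by r
   columns, so the output at (r, c) is the input at (r, (c + r) mod 4). */
static void shift_rows(uint64_t out[16], const uint64_t in[16]) {
    for (int c = 0; c < 4; c++)
        for (int r = 0; r < 4; r++)
            out[4 * c + r] = in[4 * ((c + r) % 4) + r];
}

/* AddRoundKey: broadcast each round-key byte to all K lanes, then XOR.
   Multiplying by 0x0101...01 copies the byte into every lane. */
static void add_round_key(uint64_t s[16], const uint8_t rk[16]) {
    for (int i = 0; i < 16; i++)
        s[i] ^= 0x0101010101010101ULL * rk[i];
}
```

In an unrolled implementation `shift_rows` would not copy anything: the permutation would simply be folded into which word the next step reads.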
In MixColumns, the same multiplicative coefficient in $\{1,2,3\}$ is applied to all bytes of a given word, which eases implementation. Ideally there would be hardware support for parallel byte-wide arithmetic in $\mathbb F_{2^8}$; lacking that, it's still possible to be fairly efficient in a high-level language. E.g. for $k=8$ (64-bit words), multiplication in $\mathbb F_{2^8}$ of the bytes in w by $2$ could, I think (not tested), go:
    w = ((0x8080808080808080 - (w>>7 & 0x0101010101010101)) & 0x1B1B1B1B1B1B1B1B)
        ^ (w<<1 & 0xFEFEFEFEFEFEFEFE);
Note: a SIMD implementation can just use the usual
    b = ((-(b>>7)) & 0x1B) ^ (b<<1);
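To make the two formulas above easier to check against each other, here is a sketch that wraps them in functions and builds one MixColumns column on top of the packed doubling. It assumes $k=8$; the names `xtime8`, `xtime1` and `mix_column` are hypothetical. The column routine uses the common rewriting $2a_0 \oplus 3a_1 \oplus a_2 \oplus a_3 = a_0 \oplus t \oplus 2(a_0 \oplus a_1)$ with $t = a_0 \oplus a_1 \oplus a_2 \oplus a_3$, which is a choice made for this example rather than something prescribed above.

```c
#include <stdint.h>

/* Packed doubling in GF(2^8) on 8 byte lanes (the 64-bit formula above).
   0x80 - {0,1} never borrows across lanes, so each lane yields 0x80 or
   0x7F, and masking with 0x1B gives 0x00 or 0x1B respectively. */
static uint64_t xtime8(uint64_t w) {
    uint64_t hi  = (w >> 7) & 0x0101010101010101ULL;
    uint64_t red = (0x8080808080808080ULL - hi) & 0x1B1B1B1B1B1B1B1BULL;
    return red ^ ((w << 1) & 0xFEFEFEFEFEFEFEFEULL);
}

/* Scalar reference: the usual single-byte doubling. */
static uint8_t xtime1(uint8_t b) {
    return (uint8_t)(((-(b >> 7)) & 0x1B) ^ (b << 1));
}

/* MixColumns on one column: a[r] is the sliced word for row r. */
static void mix_column(uint64_t a[4]) {
    uint64_t a0 = a[0];
    uint64_t t  = a[0] ^ a[1] ^ a[2] ^ a[3];
    a[0] ^= t ^ xtime8(a[0] ^ a[1]);
    a[1] ^= t ^ xtime8(a[1] ^ a[2]);
    a[2] ^= t ^ xtime8(a[2] ^ a[3]);
    a[3] ^= t ^ xtime8(a[3] ^ a0);   /* a0: original a[0], saved above */
}
```

Comparing `xtime8` on a word with the same byte in every lane against `xtime1` on that byte, for all 256 byte values, is a cheap way to exercise the packed formula.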
The one difficult step is SubBytes, if there's no hardware support for it. I suspect some of the techniques there allow a slight improvement over going fully bitwise, but I have nothing canned to propose. What's optimal surely depends on the available hardware.
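For reference only, a straightforward and unoptimized baseline for SubBytes in this layout can be sketched directly from the S-box definition: inversion in $\mathbb F_{2^8}$ computed as $x^{254}$, followed by the affine map. This is not one of the improved techniques alluded to above and makes no claim to efficiency (it costs 13 packed multiplications per word); it assumes $k=8$, and all function names are hypothetical.

```c
#include <stdint.h>

#define LANES_01 0x0101010101010101ULL  /* the byte 0x01 in every lane */

/* Packed doubling in GF(2^8), 8 lanes. */
static uint64_t xtime8(uint64_t w) {
    uint64_t hi = (w >> 7) & LANES_01;
    return ((0x8080808080808080ULL - hi) & 0x1B1B1B1B1B1B1B1BULL)
         ^ ((w << 1) & 0xFEFEFEFEFEFEFEFEULL);
}

/* Lane-wise product in GF(2^8) by shift-and-add over the bits of b. */
static uint64_t gmul8(uint64_t a, uint64_t b) {
    uint64_t p = 0;
    for (int i = 0; i < 8; i++) {
        /* m has 0xFF in each lane where bit i of b is set, else 0x00 */
        uint64_t m = ((b >> i) & LANES_01) * 0xFF;
        p ^= a & m;
        a = xtime8(a);
    }
    return p;
}

/* Lane-wise inverse as x^254 = x^2 * x^4 * ... * x^128 (0 maps to 0). */
static uint64_t ginv8(uint64_t x) {
    uint64_t r = LANES_01;
    for (int i = 0; i < 7; i++) {
        x = gmul8(x, x);
        r = gmul8(r, x);
    }
    return r;
}

/* Rotate every byte lane left by n bits, 1 <= n <= 7. */
static uint64_t rotl8(uint64_t w, int n) {
    uint64_t keep = LANES_01 * (0xFFu >> n);        /* bits that stay in lane */
    uint64_t wrap = LANES_01 * ((1u << n) - 1u);    /* bits that wrap around  */
    return ((w & keep) << n) | ((w >> (8 - n)) & wrap);
}

/* SubBytes on 8 lanes: inversion, then the AES affine transformation. */
static uint64_t sub_bytes8(uint64_t w) {
    uint64_t x = ginv8(w);
    return x ^ rotl8(x, 1) ^ rotl8(x, 2) ^ rotl8(x, 3) ^ rotl8(x, 4)
         ^ (LANES_01 * 0x63);
}
```

The control flow does not depend on the data, which is one reason to prefer this over a table lookup per byte even as a baseline.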
As I understand it, normal AES is word-parallel, which splits an input into 16 bytes. Byte-serial uses 16 different inputs and bit-slice uses 128 different inputs.
– ChopaChupChup Aug 01 '22 at 12:30
"As many encryption algorithms evaluate boolean functions during their execution, bit-serial (sometimes called bit-slice) computing in SIMD..."
Eitschberger and Keller compared two implementations of AES, a bit-serial one and a word-parallel one, and made it possible to transform between the two. My task is now to implement the byte-serial method, and that is where my questions above arise.
– ChopaChupChup Aug 01 '22 at 13:06