Please look only at the for loop in the source code below, especially SP1, SP2, SP3, and so on. Is it similar to what an S-box does? (Ignore the rest.)

int fval, work, right, leftt;
int round;
int keysi = 0;

leftt = inInts[0];
right = inInts[1];

work   = ((leftt >>  4) ^ right) & 0x0f0f0f0f;
right ^= work;
leftt ^= (work << 4);

work   = ((leftt >> 16) ^ right) & 0x0000ffff;
right ^= work;
leftt ^= (work << 16);

work   = ((right >>  2) ^ leftt) & 0x33333333;
leftt ^= work;
right ^= (work << 2);

work   = ((right >>  8) ^ leftt) & 0x00ff00ff;
leftt ^= work;
right ^= (work << 8);
right  = (right << 1) | ((right >> 31) & 1);

work   = (leftt ^ right) & 0xaaaaaaaa;
leftt ^= work;
right ^= work;
leftt  = (leftt << 1) | ((leftt >> 31) & 1);

for ( round = 0; round < 8; ++round )
    {
    work   = (right << 28) | (right >> 4);
    work  ^= keys[keysi++];
    fval   = SP7[ work        & 0x0000003f ];
    fval  |= SP5[(work >>  8) & 0x0000003f ];
    fval  |= SP3[(work >> 16) & 0x0000003f ];
    fval  |= SP1[(work >> 24) & 0x0000003f ];
    work   = right ^ keys[keysi++];
    fval  |= SP8[ work         & 0x0000003f ];
    fval  |= SP6[(work >>  8) & 0x0000003f ];
    fval  |= SP4[(work >> 16) & 0x0000003f ];
    fval  |= SP2[(work >> 24) & 0x0000003f ];
    leftt ^= fval;
    work   = (leftt << 28) | (leftt >>> 4);
    work  ^= keys[keysi++];
    fval   = SP7[ work         & 0x0000003f ];
    fval  |= SP5[(work >>  8) & 0x0000003f ];
    fval  |= SP3[(work >> 16) & 0x0000003f ];
    fval  |= SP1[(work >> 24) & 0x0000003f ];
    work   = leftt ^ keys[keysi++];
    fval  |= SP8[ work         & 0x0000003f ];
    fval  |= SP6[(work >>  8) & 0x0000003f ];
    fval  |= SP4[(work >> 16) & 0x0000003f ];
    fval  |= SP2[(work >> 24) & 0x0000003f ];
    right ^= fval;
    }

right  = (right << 31) | (right >> 1);
work   = (leftt ^ right) & 0xaaaaaaaa;
leftt ^= work;
right ^= work;
leftt  = (leftt << 31) | (leftt >> 1);
work   = ((leftt >>  8) ^ right) & 0x00ff00ff;
right ^= work;
leftt ^= (work << 8);
work   = ((leftt >>  2) ^ right) & 0x33333333;
right ^= work;
leftt ^= (work << 2);
work   = ((right >> 16) ^ leftt) & 0x0000ffff;
leftt ^= work;
right ^= (work << 16);
work   = ((right >>  4) ^ leftt) & 0x0f0f0f0f;
leftt ^= work;
right ^= (work << 4);
outInts[0] = right;
outInts[1] = leftt;
yogoyogo

1 Answer


Other than the oddity/bug discussed in the final note, this seems to be a classical, speed-optimized implementation of standard DES (as far as we can see; there could be variations in the S-table content, the exact bit twiddling in permutation P, or the key scheduling).

The code has:

  1. Initial permutation IP and final permutation IP⁻¹ per Richard Outerbridge's method.
  2. Pre-rotation by 1 bit before the round loop, undone afterwards, so that SP8 needs no shift (rather than a 1-bit rotation) and most shift counts are multiples of 8 (which excellent compilers for 8-bit CPUs love).
  3. Two DES rounds per iteration of the partially unrolled round loop, so that the 8 iterations perform 16 rounds.
  4. Pre-computed 6-bit subkeys, with each round's eight subkeys grouped as two 32-bit words (with 8 unused bits in each), merged in parallel for 4 S-boxes at a time. Correspondingly, the subkeys are applied on the input of expansion E, that is, before the 6-bit fields are extracted.
  5. Expansion E reduced to shifting or rotating the appropriate part of right or left into the low-order 6 bits, and masking to keep these bits before indexing the S tables. This exploits the high regularity of E, and combines with the 1-bit pre-rotation.
  6. Permutation P merged with the S-boxes, which is the game-changing optimization of DES in software. SPj is DES's S-box j with permutation P (modified to account for the 1-bit pre-rotation) applied on the output; that is, indexing SPj[x] yields P(Sj(x)), give or take the 1-bit pre-rotation. A table's output width thus grows from 4 bits to 32 bits. There's a data size penalty: the S-boxes now eat 16384 bits of table rather than 2048.
    Note: the code does not use the size optimization that keeps only the bitwise-OR of the eight 32-bit tables and extracts the bits with eight 32-bit masks, which used to be classical on heavily size-constrained CPUs.
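
To make item 6 concrete, here is a minimal Java sketch of how such a merged table could be built. It assumes the standard S-box 1 and permutation P of DES, and that the table absorbs a 1-bit left rotation. The question does not show its SP tables, so whether they hold these values cannot be checked from the excerpt. The names SpTableSketch, permuteP and buildSP are hypothetical.

```java
// Sketch under stated assumptions: standard DES S1 and P, 1-bit left pre-rotation.
class SpTableSketch {
    // Standard DES S-box 1, as 4 rows of 16 entries.
    static final int[][] S1 = {
        {14, 4, 13, 1, 2, 15, 11, 8, 3, 10, 6, 12, 5, 9, 0, 7},
        {0, 15, 7, 4, 14, 2, 13, 1, 10, 6, 12, 11, 9, 5, 3, 8},
        {4, 1, 14, 8, 13, 6, 2, 11, 15, 12, 9, 7, 3, 10, 5, 0},
        {15, 12, 8, 2, 4, 9, 1, 7, 5, 11, 3, 14, 10, 0, 6, 13}
    };
    // Standard DES permutation P: output bit i+1 is input bit P[i] (bit 1 = leftmost).
    static final int[] P = {
        16, 7, 20, 21, 29, 12, 28, 17, 1, 15, 23, 26, 5, 18, 31, 10,
        2, 8, 24, 14, 32, 27, 3, 9, 19, 13, 30, 6, 22, 11, 4, 25
    };

    // Apply P to a 32-bit word, DES bit 1 being the most significant bit.
    static int permuteP(int in) {
        int out = 0;
        for (int i = 0; i < 32; i++) {
            int bit = (in >>> (32 - P[i])) & 1;  // input bit number P[i]
            out |= bit << (31 - i);              // becomes output bit number i+1
        }
        return out;
    }

    // Build the merged table for S-box number box (1..8) given in 4x16 form.
    static int[] buildSP(int[][] sbox, int box) {
        int[] sp = new int[64];
        for (int x = 0; x < 64; x++) {
            int row = ((x >>> 4) & 2) | (x & 1);   // outer two bits of the 6-bit input
            int col = (x >>> 1) & 0xf;             // inner four bits
            int word = sbox[row][col] << (32 - 4 * box); // 4-bit output in its slot, before P
            sp[x] = Integer.rotateLeft(permuteP(word), 1); // absorb the 1-bit pre-rotation
        }
        return sp;
    }

    public static void main(String[] args) {
        int[] sp1 = buildSP(S1, 1);
        for (int x = 0; x < 4; x++) {
            System.out.printf("SP1[%d] = 0x%08x%n", x, sp1[x]);
        }
        // Table size in bits, merged versus plain S-boxes.
        System.out.println((8 * 64 * 32) + " bits versus " + (8 * 64 * 4) + " bits");
    }
}
```

Under these assumptions the first four entries come out as 0x01010400, 0x00000000, 0x00010000 and 0x01010404. Each entry has at most 4 bits set, and the eight tables use disjoint bit positions, which is why their results can be combined with either OR or XOR.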

Note: the variant below does without fval, which is clearer to my taste. I sneaked in some comments.

for ( round = 0; round < 8; ++round )
    {
    // apply E, subkeys, S and P for odd  S-boxes, rounds changing leftt
    work   = keys[keysi++] ^ ((right << 28) | (right >> 4)); 
    leftt ^= SP7[ work        & 0x3f ];
    leftt ^= SP5[(work >>  8) & 0x3f ];
    leftt ^= SP3[(work >> 16) & 0x3f ];
    leftt ^= SP1[(work >> 24) & 0x3f ];
    // apply E, subkeys, S and P for even S-boxes, rounds changing leftt
    work   = keys[keysi++] ^ right;
    leftt ^= SP8[ work        & 0x3f ];
    leftt ^= SP6[(work >>  8) & 0x3f ];
    leftt ^= SP4[(work >> 16) & 0x3f ];
    leftt ^= SP2[(work >> 24) & 0x3f ];
    // apply E, subkeys, S and P for odd  S-boxes, rounds changing right
    work   = keys[keysi++] ^ ((leftt << 28) | (leftt >> 4));
    right ^= SP7[ work        & 0x3f ];
    right ^= SP5[(work >>  8) & 0x3f ];
    right ^= SP3[(work >> 16) & 0x3f ];
    right ^= SP1[(work >> 24) & 0x3f ];
    // apply E, subkeys, S and P for even S-boxes, rounds changing right
    work   = keys[keysi++] ^ leftt;
    right ^= SP8[ work        & 0x3f ];
    right ^= SP6[(work >>  8) & 0x3f ];
    right ^= SP4[(work >> 16) & 0x3f ];
    right ^= SP2[(work >> 24) & 0x3f ];
    }
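
The "apply E" steps in the loop above can be checked against the textbook definition of expansion E. The following Java sketch compares the two on a 32-bit half, leaving the subkey XOR out. It assumes the standard regular structure of E (group j is bits 4j-4 to 4j+1 of the half, wrapping around, with bit 1 the leftmost) and a 1-bit left pre-rotation. The names ExpansionSketch, referenceGroup, fastGroup and agrees are hypothetical.

```java
// Sketch: 6-bit S-box inputs via textbook E versus rotate/shift/mask.
class ExpansionSketch {
    // Reference: 6-bit input of S-box j (1..8) per expansion E, without subkey.
    static int referenceGroup(int r, int j) {
        int g = 0;
        for (int i = 0; i < 6; i++) {
            int desBit = ((4 * (j - 1) + i - 1 + 32) % 32) + 1; // DES bit number, 1..32
            g = (g << 1) | ((r >>> (32 - desBit)) & 1);
        }
        return g;
    }

    // Same 6 bits, extracted as in the round loop from the pre-rotated half.
    static int fastGroup(int rPre, int j) {
        // Odd-numbered boxes read the word rotated right by 4, even ones read it as is.
        int work = (j % 2 == 1) ? ((rPre << 28) | (rPre >>> 4)) : rPre;
        int shift = 8 * ((8 - j) / 2); // 0 for boxes 7,8; 8 for 5,6; 16 for 3,4; 24 for 1,2
        return (work >>> shift) & 0x3f;
    }

    // True when all eight groups match for the half r.
    static boolean agrees(int r) {
        int rPre = Integer.rotateLeft(r, 1); // the 1-bit pre-rotation
        for (int j = 1; j <= 8; j++) {
            if (referenceGroup(r, j) != fastGroup(rPre, j)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        java.util.Random rng = new java.util.Random(1);
        boolean ok = true;
        for (int n = 0; n < 1000; n++) {
            ok &= agrees(rng.nextInt());
        }
        System.out.println(ok ? "all groups agree" : "mismatch");
    }
}
```

The sketch uses >>> throughout, so it does not depend on how the sign bit is handled.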

Off-topic note: the question's code contains the idiom >>>, which indicates that it is Java (where this means logical shift right).
If so, some >> need to be changed to >>>, at least in (right << 28) | (right >> 4) and similar, because in Java >> is specified to keep the high-order bit of a (32-bit) int and copy it into the vacated positions, which is not the intention.
On the other hand, if this is C, all int should be changed to unsigned, or better uint32_t, because the result of >> on a negative signed value is implementation-defined in C, so it depends on the compiler's mood.
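
Assuming the code is indeed Java, the difference is easy to see on a value with the high-order bit set. This small sketch compares the rotation as written in the question with the logical-shift version and with Integer.rotateRight from the standard library. The class and method names are hypothetical.

```java
// Sketch: 4-bit right rotation with arithmetic versus logical shift in Java.
class RotateSketch {
    // As in the question: >> copies the sign bit into the top 4 positions.
    static int rotr4Arithmetic(int x) {
        return (x << 28) | (x >> 4);
    }

    // With the logical shift: the top 4 positions come only from x << 28.
    static int rotr4Logical(int x) {
        return (x << 28) | (x >>> 4);
    }

    // Index that the round loop would use for the leftmost table (shift by 24, mask 6 bits).
    static int topIndex(int work) {
        return (work >>> 24) & 0x3f;
    }

    public static void main(String[] args) {
        int x = 0x80000000;
        System.out.printf("arithmetic: %08x%n", rotr4Arithmetic(x));
        System.out.printf("logical:    %08x%n", rotr4Logical(x));
        System.out.printf("library:    %08x%n", Integer.rotateRight(x, 4));
    }
}
```

For 0x80000000 the arithmetic version gives 0xf8000000 where a true rotation gives 0x08000000, so the 6-bit index taken from the top byte changes from 0x08 to 0x38. For non-negative inputs the two versions agree, which is why such a bug can survive casual testing.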

fgrieu