SM4 block cipher (Chinese Standard for WAPI)

Introduction

SM4 (formerly SMS4) is a 128-bit block cipher with a 128-bit user key and 32 rounds. It’s used in the WLAN Authentication and Privacy Infrastructure (WAPI), a Chinese WLAN national standard.

It was published, or rather declassified, in 2006 and standardized in 2012.

I’m not clear what the SM stands for. There are other specifications, such as SM2, which describes elliptic curve algorithms for digital signatures and key exchange, and SM3, which is a cryptographic hash algorithm. SM4 is a symmetric block cipher, which I’ll implement here in C and x86 assembly optimized for size.

The only other specification to mention is SM9, which documents a set of identity-based cryptographic schemes from pairings.

English translations of the specifications for SM2 and SM3 have been provided by Sean Shen and Xiaodong Lee at China Internet Network Information Center (CNNIC), a non-profit organization based in China.

But the only English translation for SM4 was by Whitfield Diffie and Dr. George Ledin.

About the C code

If you want high-performance implementations of SM4 in C and x86/x86-64 assembly, please look at GmSSL, which appears to be a fork of OpenSSL but includes SM2, SM3, SM4 and SM9.

Mixer-substitution T

T is a substitution that generates 32 bits of output from 32 bits of input. The source code includes the full 256-byte Sbox array, which increases the size of the code considerably.

  • Non-Linear function

(b_0, b_1, b_2, b_3) = \tau (A) = (Sbox(a_0),Sbox(a_1),Sbox(a_2),Sbox(a_3))

The Sbox is a 256-byte array and there’s no description of how these elements were chosen. If anyone knows, please leave a comment.

[Figure: sbox]

  • Linear function (for key setup)

L'(B) = B \oplus (B \lll 13)\oplus (B \lll 23)

  • Linear function (for encryption)

C=L(B)=B\oplus(B\lll 2)\oplus (B\lll 10)\oplus (B\lll 18)\oplus(B\lll 24)
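To make the three pieces above concrete, here is a minimal C sketch of T built from the formulas. It is not the code from the repository: the function names are my own, and the Sbox is passed in as a parameter rather than embedded, so the real 256-byte table from the specification has to be supplied by the caller.

```c
#include <assert.h>
#include <stdint.h>

/* Rotate a 32-bit word left by n bits (0 < n < 32). */
static uint32_t rotl32(uint32_t x, unsigned n) {
    return (x << n) | (x >> (32 - n));
}

/* tau: substitute each of the four bytes of x through the Sbox.
   The table is a parameter here (assumption); SM4 defines the real one. */
static uint32_t sm4_tau(uint32_t x, const uint8_t sbox[256]) {
    assert(sbox != 0);
    return ((uint32_t)sbox[(x >> 24) & 0xff] << 24) |
           ((uint32_t)sbox[(x >> 16) & 0xff] << 16) |
           ((uint32_t)sbox[(x >>  8) & 0xff] <<  8) |
            (uint32_t)sbox[x & 0xff];
}

/* L: linear function for encryption. */
static uint32_t sm4_L(uint32_t b) {
    return b ^ rotl32(b, 2) ^ rotl32(b, 10) ^ rotl32(b, 18) ^ rotl32(b, 24);
}

/* L': linear function for key setup. */
static uint32_t sm4_L_key(uint32_t b) {
    return b ^ rotl32(b, 13) ^ rotl32(b, 23);
}

/* T: enc != 0 selects L (encryption), enc == 0 selects L' (key setup). */
static uint32_t sm4_T(uint32_t x, const uint8_t sbox[256], int enc) {
    uint32_t b = sm4_tau(x, sbox);
    return enc ? sm4_L(b) : sm4_L_key(b);
}
```

With an identity table standing in for the Sbox, sm4_T(1, table, 1) reduces to L(1), which is a quick way to check the rotation amounts in isolation.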

[Figure: t_function]

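t_l0:
    pop    ebx               ; ebx = address of sbox (pushed by call)
    xor    eax, x1           ; eax ^= x1 ^ x2 ^ x3
    xor    eax, x2
    xor    eax, x3
    ; apply non-linear substitution
    mov    cl, 4             ; 4 bytes (assumes ecx < 256 on entry)
t_l1:    
    xlatb                    ; al = sbox[al]
    ror    eax, 8            ; move to next byte
    loop   t_l1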
    mov    ebx, eax
    mov    ecx, eax
    mov    edx, eax
    mov    ebp, eax
    ; apply linear substitution
    popfd
    jc     t_l2
    ; for key setup
    rol    ebx, 13
    rol    ecx, 23
    xor    eax, ebx
    xor    eax, ecx
    jmp    t_l3
t_l2:    
    ; for encryption
    rol    ebx, 2
    rol    ecx, 10
    rol    edx, 18
    rol    ebp, 24
    
    xor    eax, ebx
    xor    eax, ecx
    xor    eax, edx
    xor    eax, ebp
t_l3:
    mov    [esp+_eax], eax    
    popad
    ret
    
; in:  eax
; out: eax  
T_function:
    pushad
    pushfd
    call   t_l0  ; pushes address of sbox on stack
    ; sbox for SM4 goes here

The round function F

F(X_0,X_1,X_2,X_3,rk)=X_0\oplus T(X_1\oplus X_2\oplus X_3\oplus rk)

In the C code, a value of 1 in the last parameter to the T function tells it to use the linear function for encryption. In x86 assembly, I use the Carry Flag (CF) instead, setting or clearing it with the STC and CLC instructions.
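As a rough illustration of the formula, here is how F might look in C with the mode flag as the last parameter of T. This is a sketch under my own assumptions: the names F and T and the Sbox-as-parameter signature are hypothetical, not taken from the repository.

```c
#include <assert.h>
#include <stdint.h>

static uint32_t rotl32(uint32_t x, unsigned n) {
    return (x << n) | (x >> (32 - n));
}

/* T with the mode flag last: enc = 1 uses L, enc = 0 uses L'.
   The Sbox is passed in (assumption); SM4 defines the real table. */
static uint32_t T(uint32_t x, const uint8_t sbox[256], int enc) {
    uint32_t b;
    assert(sbox != 0);
    b = ((uint32_t)sbox[(x >> 24) & 0xff] << 24) |
        ((uint32_t)sbox[(x >> 16) & 0xff] << 16) |
        ((uint32_t)sbox[(x >>  8) & 0xff] <<  8) |
         (uint32_t)sbox[x & 0xff];
    if (enc) {
        return b ^ rotl32(b, 2) ^ rotl32(b, 10) ^ rotl32(b, 18) ^ rotl32(b, 24);
    }
    return b ^ rotl32(b, 13) ^ rotl32(b, 23);
}

/* F(X0, X1, X2, X3, rk) = X0 ^ T(X1 ^ X2 ^ X3 ^ rk), always in encryption mode. */
static uint32_t F(uint32_t x0, uint32_t x1, uint32_t x2, uint32_t x3,
                  uint32_t rk, const uint8_t sbox[256]) {
    return x0 ^ T(x1 ^ x2 ^ x3 ^ rk, sbox, 1);
}
```

The assembly version carries the same one bit of information in CF, which saves passing an extra argument.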

[Figure: f_code]

The constant parameter CK

(ck_{i,0},ck_{i,1},ck_{i,2},ck_{i,3}) \in \Big(Z_2^8\Big)^4,\quad ck_{i,j}=(4i+j)\times 7 \pmod{256}

You can include the precomputed array.

[Figure: ck_constants]

Or generate at runtime using some code.

[Figure: ck_code]

; expects ecx to hold index
; returns constant in eax
CK:
    pushad
    xor    eax, eax          ; ck = 0
    cdq                      ; j  = 0
ck_l0: 
    shl    eax, 8            ; ck <<= 8
    lea    ebx, [ecx*4+edx]  ; ebx = (i*4) + j
    imul   ebx, ebx, 7       ; ebx *= 7
    or     al, bl            ; ck |= (ebx % 256)
    inc    edx               ; j++
    cmp    edx, 4            ; j<4
    jnz    ck_l0
    mov    [esp+_eax], eax   ; return ck
    popad
    ret
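
For comparison with the assembly routine above, a C sketch of the same runtime generation could look like this. The function name sm4_ck is my own choice for illustration; the loop follows the CK formula directly, with the j = 0 byte ending up in the most significant position.

```c
#include <assert.h>
#include <stdint.h>

/* Build the 32-bit constant CK_i from four bytes, each (4i + j) * 7 mod 256. */
static uint32_t sm4_ck(unsigned i) {
    uint32_t ck = 0;
    assert(i < 32);                      /* SM4 uses 32 constants, one per round */
    for (unsigned j = 0; j < 4; j++) {
        ck = (ck << 8) | (((4 * i + j) * 7) & 0xff);
    }
    return ck;
}
```

For i = 0 the four bytes are 0x00, 0x07, 0x0e and 0x15, so sm4_ck(0) is 0x00070e15.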

Key setup

[Figure: setkey]

sm4_setkeyx:
_sm4_setkeyx:
    pushad
    mov    edi, [esp+32+4]  ; edi = ctx
    mov    esi, [esp+32+8]  ; esi = 128-bit key
    ; load the key
    lodsd
    bswap  eax
    xchg   eax, rk0
    lodsd
    bswap  eax
    xchg   eax, rk1
    lodsd
    bswap  eax
    xchg   eax, rk2
    lodsd
    bswap  eax
    xchg   eax, rk3
    
    ; xor FK values
    xor    rk0, 0xa3b1bac6    
    xor    rk1, 0x56aa3350    
    xor    rk2, 0x677d9197    
    xor    rk3, 0xb27022dc
    xor    ecx, ecx
sk_l1:    
    call   CK
    clc
    call   T_function 
    xor    rk0, eax
    mov    eax, rk0
    stosd                ; rk[i] = rk0
    xchg   rk0, rk1
    xchg   rk1, rk2
    xchg   rk2, rk3
    inc    ecx           ; i++
    cmp    ecx, 32
    jnz    sk_l1       
    popad
    ret
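
The key setup above can be sketched in C roughly as follows. This is an illustration under stated assumptions, not the repository code: the function names are hypothetical, the Sbox is passed in by the caller instead of being embedded, and the round keys go into a plain array rather than a context structure.

```c
#include <assert.h>
#include <stdint.h>

static uint32_t rotl32(uint32_t x, unsigned n) {
    return (x << n) | (x >> (32 - n));
}

/* Read 4 bytes as a big-endian word (what LODSD + BSWAP does in the assembly). */
static uint32_t load_be32(const uint8_t *p) {
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* T in key-setup mode: Sbox on each byte, then L'. */
static uint32_t t_key(uint32_t x, const uint8_t sbox[256]) {
    uint32_t b = ((uint32_t)sbox[(x >> 24) & 0xff] << 24) |
                 ((uint32_t)sbox[(x >> 16) & 0xff] << 16) |
                 ((uint32_t)sbox[(x >>  8) & 0xff] <<  8) |
                  (uint32_t)sbox[x & 0xff];
    return b ^ rotl32(b, 13) ^ rotl32(b, 23);
}

/* CK_i generated at runtime: bytes are (4i + j) * 7 mod 256. */
static uint32_t ck(unsigned i) {
    uint32_t c = 0;
    for (unsigned j = 0; j < 4; j++) c = (c << 8) | (((4 * i + j) * 7) & 0xff);
    return c;
}

/* Expand a 16-byte key into 32 round keys. */
static void sm4_setkey_sketch(uint32_t rk[32], const uint8_t key[16],
                              const uint8_t sbox[256]) {
    static const uint32_t fk[4] = {
        0xa3b1bac6u, 0x56aa3350u, 0x677d9197u, 0xb27022dcu
    };
    uint32_t k[4];
    assert(rk != 0 && key != 0 && sbox != 0);
    for (unsigned i = 0; i < 4; i++) k[i] = load_be32(key + 4 * i) ^ fk[i];
    for (unsigned i = 0; i < 32; i++) {
        uint32_t t = k[0] ^ t_key(k[1] ^ k[2] ^ k[3] ^ ck(i), sbox);
        rk[i] = t;                                   /* rk[i] = new word    */
        k[0] = k[1]; k[1] = k[2]; k[2] = k[3]; k[3] = t; /* rotate the state */
    }
}
```

The three XCHG instructions in the assembly do the same rotation of the four state words that the last line of the loop does here.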

Encryption

[Figure: encrypt]

sm4_encryptx:
_sm4_encryptx:
    pushad
    mov    edi, [esp+32+4] ; edi = ctx
    mov    esi, [esp+32+8] ; esi = buf
    push   esi ; save buffer for later
    
    ; load data
    lodsd
    bswap  eax
    xchg   eax, x0
    lodsd
    bswap  eax    
    xchg   eax, x1
    lodsd
    bswap  eax    
    xchg   eax, x2
    lodsd
    bswap  eax    
    xchg   eax, x3
    
    ; do 32 rounds
    push   32
    pop    ecx
e_l0:
    ; apply F round
    mov    eax, [edi] ; rk[i]
    scasd
    stc
    call   T_function     
    xor    x0, eax
    xchg   x0, x1
    xchg   x1, x2
    xchg   x2, x3
    loop   e_l0
    
    ; save data
    pop    edi
    xchg   eax, x3
    bswap  eax    
    stosd
    xchg   eax, x2
    bswap  eax    
    stosd
    xchg   eax, x1
    bswap  eax    
    stosd
    xchg   eax, x0
    bswap  eax    
    stosd    
    popad
    ret
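
A C sketch of the same encryption routine might look like the following. As before, the names are hypothetical and the Sbox is a parameter supplied by the caller, so treat it as an outline of the structure rather than the actual implementation. Note the reversed word order on output, which matches the assembly storing x3 first.

```c
#include <assert.h>
#include <stdint.h>

static uint32_t rotl32(uint32_t x, unsigned n) {
    return (x << n) | (x >> (32 - n));
}

static uint32_t load_be32(const uint8_t *p) {
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

static void store_be32(uint8_t *p, uint32_t v) {
    p[0] = (uint8_t)(v >> 24); p[1] = (uint8_t)(v >> 16);
    p[2] = (uint8_t)(v >> 8);  p[3] = (uint8_t)v;
}

/* T in encryption mode: Sbox on each byte, then L. */
static uint32_t t_enc(uint32_t x, const uint8_t sbox[256]) {
    uint32_t b = ((uint32_t)sbox[(x >> 24) & 0xff] << 24) |
                 ((uint32_t)sbox[(x >> 16) & 0xff] << 16) |
                 ((uint32_t)sbox[(x >>  8) & 0xff] <<  8) |
                  (uint32_t)sbox[x & 0xff];
    return b ^ rotl32(b, 2) ^ rotl32(b, 10) ^ rotl32(b, 18) ^ rotl32(b, 24);
}

/* Encrypt one 16-byte block in place using 32 round keys. */
static void sm4_encrypt_sketch(const uint32_t rk[32], uint8_t buf[16],
                               const uint8_t sbox[256]) {
    uint32_t x[4];
    assert(rk != 0 && buf != 0 && sbox != 0);
    for (unsigned i = 0; i < 4; i++) x[i] = load_be32(buf + 4 * i);
    for (unsigned i = 0; i < 32; i++) {
        /* F round: new word = x0 ^ T(x1 ^ x2 ^ x3 ^ rk[i]) */
        uint32_t t = x[0] ^ t_enc(x[1] ^ x[2] ^ x[3] ^ rk[i], sbox);
        x[0] = x[1]; x[1] = x[2]; x[2] = x[3]; x[3] = t;
    }
    /* Output the four words in reverse order. */
    for (unsigned i = 0; i < 4; i++) store_be32(buf + 4 * i, x[3 - i]);
}
```

Because of the reversed output, running the same function again with the round keys in reverse order undoes the encryption, so no separate decryption routine is needed.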

Summary

If there were a way to generate the sbox on the fly, this could be a good cipher for resource-constrained devices. Compiling the C code with the /O2 /Os switches resulted in 690 bytes. The assembly for just encryption is approximately 500 bytes.

If you want to contribute or just access full source code, see here

Thanks to 0x4d_ for \LaTeX formulas.
