SM3 Cryptographic Hash Algorithm (Chinese Standard)


SM3 is 256-bit cryptographic hash algorithm derived from SHA-2 designed by the NSA. It was designed by Xiaoyun Wang who is responsible for discovering attacks against many cryptographic hash functions, most notably MD5 and SHA-1. At the CRYPTO 2004 conference, she and co-authors demonstrated collision attacks against MD5 and SHA-0. Attacks against SHA-1 were published later in 2005. SM3 was published in December 2007 by the Chinese National Cryptographic Administration Bureau as part of the Trusted Computing framework in China. SM3 uses a Merkle-Damgard construction like MD4, MD5, SHA-1 and SHA-2. SM means commercial cipher. The full list of cryptographic standards for commercial purposes are as follows.

  • SM2 – Elliptic curve algorithms for digital signatures and key exchange.
  • SM3 – A cryptographic hash.
  • SM4 – A block cipher.
  • SM9 – A set of identity-based cryptographic schemes from pairings.

Macros and data types

#define R(v,n)(((v)<<(n))|((v)>>(32-(n))))
#define F(n)for(i=0;i<n;i++)

#define rev32(x) __builtin_bswap32(x)
#define rev64(x) __builtin_bswap64(x)

typedef unsigned long long Q;
typedef unsigned int W;
typedef unsigned char B;

typedef struct _sm3_ctx {
    W s[8];
    union {
      B b[64];
      W w[16];
      Q q[8];
    Q len;


In the original SHA-2 specification, the 8 initialization values are fractional parts of square roots for the first 8 primes 2..19). See here for more detailed description on how to create these.

H_0^{(0)}\: =\: 6a09e667
H_1^{(0)}\: =\: bb67ae85
H_2^{(0)}\: =\: 3c6ef372
H_3^{(0)}\: =\: a54ff53a
H_4^{(0)}\: =\: 510e527f
H_5^{(0)}\: =\: 9b05688c
H_6^{(0)}\: =\: 1f83d9ab
H_7^{(0)}\: =\: 5be0cd19

In SM3, the following 8 values are used:

H_0^{(0)}\: =\: 7380166f
H_1^{(0)}\: =\: 4914b2b9
H_2^{(0)}\: =\: 172442d7
H_3^{(0)}\: =\: da8a0600
H_4^{(0)}\: =\: a96f30bc
H_5^{(0)}\: =\: 163138aa
H_6^{(0)}\: =\: e38dee4d
H_7^{(0)}\: =\: b0fb0e4e

Unfortunately, there’s no explanation for how, or why these values were selected.

void sm3_init(sm3_ctx*c) {    
    c->len =0;

Updating context

Updating the buffer and state is exactly the same as SHA-2 that is based on the original design for MD4 by Ron Rivest. Once the buffer has 64-bytes of data, it’s processed using sm3_compress

void sm3_update(sm3_ctx*c,const void*in,W len) {
    B *p=(B*)in;
    W i, idx;
    idx = c->len & 63;
    c->len += len;
    for (i=0;i<len;i++) {
      c->x.b[idx]=p[i]; idx++;
      if(idx==64) {


This step is also the exact same as SHA-2.

void sm3_final(void*h,sm3_ctx*c) {
    W i,len,*p=h;
    i = len = c->len & 63;
    while(i<64) c->x.b[i++]=0;
    if(len>=56) {


The main difference between SHA-2 and SM3 is in the compression function. In SHA-2, the 512-bit message is expanded into 64 32-bit words. For SM3, it’s expanded into 68 32-bit words. The following two linear functions are defined:

P_1 for expansion, P_0 for compression

P_0(X) = X\oplus (X\lll 9)\oplus (X\lll 17)
P_1(X) = X\oplus (X\lll 15)\oplus (X\lll 23)

#define P0(x)x^R(x,9)^R(x,17)
#define P1(x)x^R(x,15)^R(x,23)

The first part of expansion is similar to SHA-2, but uses different offsets.

\text{for}\: i=16\: \text{to}\: 67\\  \hphantom{---}W_i\leftarrow P_1(W_{i-16}\oplus W_{i-9}\oplus(W_{i-3}\lll 15))\oplus (W_{i-13}\lll 7)\oplus   W_{i-6}\\  \text{endfor}

\text{for}\: i = 0\: \text{to}\: 63\\  \hphantom{---}W_i'=W_i\oplus W_{i+4}\\  \text{endfor}

The following code performs the first loop of the expansion. The second loop is inlined with the main loop.

      w[i]=P1(w[i-16]^w[i-9]^R(w[i-3],15))^R(w[i-13],7)^w[i- 6];

The additional 4 words in SM3 are required for a separate expansion that I’ve inlined with the compression function to try minimize code usage.

W_i'=W_i\oplus W_{i+4}\: 0\leq i < 64.

You can see in main compression loop while creating TT1 and TT2.


For SHA-2, 64 words are used during compression which are described here in detail.

Both use the same sequence of sixty-four constant 32-bit words, K0{256}, K1{256}, …,K63{256} These words represent the first thirty-two bits of the fractional parts of the cube roots of the first sixty-four prime numbers. In hex, these constant words are (from left to right)

For SM3, only 2 constants are used. One is selected depending on the position of loop counter, then a circular shift by loop counter is applied before adding into the state. This certainly helps make SM3 more compact than SHA-2 although I can’t say if it strengthens in some way.

T_i =  \begin{cases}  79cc4519 & 0\leq i \leq 15 \\[1ex]  7a879d8a & 16\leq i \leq 63  \end{cases}

t = (i < 16) ? 0x79cc4519 : 0x7a879d8a;

There are 3 boolean operations used during the compression which are similar to bitwise majority MAJ and bitwise choice CH operations in SHA-2.

FF_i(X,Y,Z) =  \begin{cases}  X\oplus Y\oplus Z  & 0\leq i \leq 15 \\[1ex]  (X\wedge Y)\vee (X\wedge Z)\vee (Y\wedge Z) & 16\leq i \leq 63  \end{cases}

GG_i(X,Y,Z) =  \begin{cases}  X\oplus Y\oplus Z  & 0\leq i \leq 15 \\[1ex]  (X\wedge Y) \vee (\neg X \wedge Z) & 16\leq i \leq 63  \end{cases}

#define F1(x,y,z)(((x)^(y)^(z)))
#define FF(x,y,z)(((x)&(y))^((x)&(z))^((y)&(z))) 
#define GG(x,y,z)(((x)&(y))^(~(x)&(z)))

ABCDEFGH\leftarrow V^{(i)}
\begin{aligned}  \text{for}\; & i= 0\; \text{to}\: 63\\  & SS_1\leftarrow ((A\lll 12)+E+(T_i\lll i))\lll 7\\  & SS_2\leftarrow SS_1\oplus(A\lll 12)\\  & TT_1\leftarrow FF_i(A,B,C)+D+SS_2+W_i'\\  & TT_2\leftarrow GG_i(E,F,G)+H+SS_1+W_i\\  & D\leftarrow C\\  & C\leftarrow B\lll 9\\  & B\leftarrow A\\  & A\leftarrow TT_1\\  & H\leftarrow G\\  & G\leftarrow F\lll 19\\  & F\leftarrow E\\  & E\leftarrow P_0(TT_2)\\  \text{endfor}&\;  \end{aligned}

Updating state

At the end of compression function in SHA-2, the input hash is updated by adding the output from compression.


In SM3, it is instead XOR’d which follows a Davies-Meyer idea for compression functions.

H_0^{(i)}=a\oplus H_0^{(i-1)}
H_1^{(i)}=b\oplus H_1^{(i-1)}
H_2^{(i)}=c\oplus H_2^{(i-1)}
H_3^{(i)}=d\oplus H_3^{(i-1)}
H_4^{(i)}=e\oplus H_4^{(i-1)}
H_5^{(i)}=f\oplus H_5^{(i-1)}
H_6^{(i)}=g\oplus H_6^{(i-1)}
H_7^{(i)}=h\oplus H_7^{(i-1)}

The output state from compression is XORed with the previous hash value to produce the next hash value. In the first round when there is no previous hash value it uses a constant pre-specified initial values.

H_i=E_{m_i}(H_{i-1})\oplus H_{i-1}.

Here’s the full compression function.

void sm3_compress(sm3_ctx*c) {
    W t1,t2,i,j,t,s1,s2,x[8],w[68];

    // load data
    // expand
      w[i]=P1(w[i-16]^w[i-9]^R(w[i-3],15))^R(w[i-13],7)^w[i- 6];

    // load internal state
    // compress data
    F(64) {
      if(i<16) {
      } else {
    // update internal state

Sources here. Thanks to 0x4d_ for submitting \LaTeX equations.

This entry was posted in assembly, cryptography, programming, security and tagged , , , , . Bookmark the permalink.

Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s