A Message Authentication Code or MAC is a cryptographic primitive that enables two parties using a shared secret key to authenticate messages exchanged between them.
MACs secure online communications every day and are likely to become a critical component of embedded systems with wireless functionality, particularly passive, low-power and IoT devices.
Some MACs are constructed from cryptographic hash algorithms. For example, the Hash-based Message Authentication Code (HMAC), published in 1996, recommended using MD5 or SHA-1. However, because of the resources these hash functions require, they were never well suited to microcontrollers.
Other MACs are constructed from block ciphers. This approach suits developers better where ROM and RAM are in short supply, since a number of lightweight block ciphers are already available.
There’s no shortage of cryptographic solutions for desktop computers and mobile devices. However, some of the current standardized algorithms remain a significant problem to implement on resource-constrained software or hardware devices.
A draft Report on Lightweight Cryptography, published by NIST in August 2016, states that none of the NIST-approved hash functions are suitable for resource-constrained environments, based on the findings in A Study on RAM Requirements of Various SHA-3 Candidates on Low-cost 8-bit CPUs.
Engineers, cryptographers and organizations are currently working to design, evaluate and standardize lightweight cryptographic primitives for the embedded industry.
Whether or not we agree on the need to embed wireless devices in everyday objects, many of them will have networking or wireless functionality in future, and they will require robust encryption solutions.
All code and details of the algorithm shown here are derived from the reference implementations provided by Atul on GitHub.
Developers who need a reference implementation should use that source code instead of what’s shown here.
Instead of the PRESENT block cipher, I’ll use SPECK64/128, whose ARX design is much easier to implement on the x86 architecture.
Block Cipher Parameters
LightMAC is a mode of operation and depends directly on an external block cipher.
The only other block ciphers that come close to Speck in code size would be XTEA or Chaskey. The following is a list of parameters recommended for use with Speck.
The authors define the following parameters in the reference material, all of which are derived from the block cipher:
COUNTER_LENGTH: Length of the protected counter sum in bytes. Not greater than N/2.
TAG_LENGTH: Length of the tag in bytes. Should be at least 8 bytes (64 bits) but not greater than N.
BLOCK_LENGTH: Length of a block in bytes. Same as N.
BC_KEY_LENGTH: Length of the block cipher key in bytes. The MAC key is twice this length.
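The constraints above can be checked at compile time. The following is a minimal sketch in C, assuming the SPECK64/128 values used in the assembly listing further down; `LIGHTMAC_MSG_BYTES` is a hypothetical helper name introduced here for illustration and is not part of the reference code.

```c
/* Parameters for SPECK64/128, all in bytes. The names mirror the
   %define lines of the assembly listing in this post. */
#define BLOCK_LENGTH        8                    /* N: cipher block size     */
#define BC_KEY_LENGTH       16                   /* block cipher key size    */
#define COUNTER_LENGTH      4                    /* S: must not exceed N/2   */
#define TAG_LENGTH          8                    /* T: at least 8, at most N */
#define LIGHTMAC_KEY_LENGTH (2 * BC_KEY_LENGTH)  /* K1 followed by K2        */

/* Reject parameter sets that break the stated limits (C11). */
_Static_assert(COUNTER_LENGTH <= BLOCK_LENGTH / 2, "counter longer than N/2");
_Static_assert(TAG_LENGTH >= 8 && TAG_LENGTH <= BLOCK_LENGTH, "tag length out of range");

/* Hypothetical helper: message bytes carried by each counter-prefixed block. */
enum { LIGHTMAC_MSG_BYTES = BLOCK_LENGTH - COUNTER_LENGTH };
```

With these values each encrypted block carries 4 bytes of counter and 4 bytes of message, and the MAC key is 32 bytes long.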
The following table shows some example parameters using existing lightweight block ciphers.
| Cipher (E) | Block Length (N) | Cipher Key Length | MAC Key Length (K) | Counter Length (S) | Tag Length (T) |
In the specification, V denotes an intermediate (local) variable of cipher block size N. It is initialized to zero and, after every encryption, updated by XORing it with the ciphertext. The result is returned in T, truncated if required.
In my own implementation, however, I assume T is of cipher block size N, initialize it to zero and update T directly, which is why I prefer readers use the reference implementation instead of what’s shown here. 🙂
My reason for not allocating space for V and using T directly is simply to reduce the size of the assembly code.
The update process is very similar to what you see in cryptographic hash algorithms. I was going to include a more detailed description here, but I think the comments should be clear enough.
An end bit (0x80) is appended to the M buffer after any remaining data, or on its own if the input length was a multiple of the message block length (BLOCK_LENGTH - COUNTER_LENGTH).
This is then XORed with the previous cipher block state, and the result is encrypted with the second key before returning.
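The padding step described above can be sketched in C as follows. This is only an illustration under the SPECK64/128 parameters; `lightmac_pad_final` is a hypothetical name, and the final encryption under the second key is left to the caller.

```c
#include <stddef.h>
#include <stdint.h>

#define BLOCK_LENGTH   8
#define COUNTER_LENGTH 4

/* XOR the last, partial message chunk and the 0x80 end marker into the
   running state t. rem_len is expected to be smaller than
   BLOCK_LENGTH - COUNTER_LENGTH, because every full chunk has already
   been absorbed together with its counter. The caller then encrypts t
   with the second key to obtain the tag. */
static void lightmac_pad_final(uint8_t t[BLOCK_LENGTH],
                               const uint8_t *rem, size_t rem_len)
{
    size_t i;

    for (i = 0; i < rem_len; i++)
        t[i] ^= rem[i];

    /* end bit; the bytes after it are zero, so XORing them is a no-op */
    t[rem_len] ^= 0x80;
}
```

Because the padded block is mostly zeros, only the first rem_len + 1 bytes of the state need to be touched, which is the same shortcut the assembly takes.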
x86 Assembly code
First, here’s the SPECK block cipher using a 64-bit block and a 128-bit key.
    %define SPECK_RNDS 27
    ; *****************************************
    ; Light MAC parameters based on SPECK64-128
    ;
    ; N = 64-bits
    ; K = 128-bits
    ;
    %define COUNTER_LENGTH 4  ; should be <= N/2
    %define BLOCK_LENGTH   8  ; equal to N
    %define TAG_LENGTH     8  ; >= 64 && <= N
    %define BC_KEY_LENGTH 16  ; K

    %define ENCRYPT speck64_encryptx
    %define LIGHTMAC_KEY_LENGTH BC_KEY_LENGTH*2 ; K*2

    %define k0 edi
    %define k1 ebp
    %define k2 ecx
    %define k3 esi

    %define x0 ebx
    %define x1 edx

    ; esi = input
    ; ebp = key
    speck64_encryptx:
        pushad
        push   esi            ; save M
        lodsd                 ; x0 = x->w[0]
        xchg   eax, x0
        lodsd                 ; x1 = x->w[1]
        xchg   eax, x1
        mov    esi, ebp       ; esi = key
        lodsd
        xchg   eax, k0        ; k0 = key[0]
        lodsd
        xchg   eax, k1        ; k1 = key[1]
        lodsd
        xchg   eax, k2        ; k2 = key[2]
        lodsd
        xchg   eax, k3        ; k3 = key[3]
        xor    eax, eax       ; i = 0
    spk_el:
        ; x0 = (ROTR32(x0, 8) + x1) ^ k0;
        ror    x0, 8
        add    x0, x1
        xor    x0, k0
        ; x1 = ROTL32(x1, 3) ^ x0;
        rol    x1, 3
        xor    x1, x0
        ; k1 = (ROTR32(k1, 8) + k0) ^ i;
        ror    k1, 8
        add    k1, k0
        xor    k1, eax
        ; k0 = ROTL32(k0, 3) ^ k1;
        rol    k0, 3
        xor    k0, k1
        xchg   k3, k2
        xchg   k3, k1
        ; i++
        inc    eax
        cmp    al, SPECK_RNDS
        jnz    spk_el
        pop    edi
        xchg   eax, x0        ; x->w[0] = x0
        stosd
        xchg   eax, x1        ; x->w[1] = x1
        stosd
        popad
        ret
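For readers who find the register shuffling hard to follow, here is a minimal C sketch of the same round structure. It is not the reference code: `speck64_encrypt` is a name chosen here, and the word order (x[0] rotated right by 8, key[0] used as the first round key) is an assumption taken from the comments in the assembly.

```c
#include <stdint.h>

#define SPECK_RNDS 27
#define ROTR32(v, n) (((v) >> (n)) | ((v) << (32 - (n))))
#define ROTL32(v, n) (((v) << (n)) | ((v) >> (32 - (n))))

/* Encrypt one 64-bit block x in place with a 128-bit key. */
static void speck64_encrypt(const uint32_t key[4], uint32_t x[2])
{
    uint32_t k0 = key[0], k1 = key[1], k2 = key[2], k3 = key[3];
    uint32_t x0 = x[0], x1 = x[1], t, i;

    for (i = 0; i < SPECK_RNDS; i++) {
        /* round function */
        x0 = (ROTR32(x0, 8) + x1) ^ k0;
        x1 = ROTL32(x1, 3) ^ x0;

        /* key schedule, computed on the fly */
        k1 = (ROTR32(k1, 8) + k0) ^ i;
        k0 = ROTL32(k0, 3) ^ k1;

        /* rotate (k1, k2, k3) by one word, as the two XCHGs do */
        t = k1; k1 = k2; k2 = k3; k3 = t;
    }
    x[0] = x0;
    x[1] = x1;
}
```

Computing the round keys inside the loop avoids storing an expanded key schedule, which is the same trade-off the assembly makes. A quick sanity check is to run the function against the Speck64/128 test vector from the designers' paper.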
You might notice that the ctr and idx variables are both initialized to zero by a single CDQ instruction, which works as long as msglen in EAX is below 2^31. PUSHAD then saves the zeroed EDX on the stack, and that saved copy is used as the protected counter sum S while EDX itself serves as idx.
Although we convert the counter to big endian format before saving it in the block buffer, skipping this wouldn’t affect the security. I’ve retained it for compatibility with the reference implementation but might remove it later.
    ; void lightmac_tag(const void *msg, uint32_t msglen,
    ;                   void *tag, void* mkey)
    ;
    ; note: _edx is the offset of the saved EDX within a PUSHAD frame (20);
    ; its definition is not part of this listing
    lightmac_tagx:
    _lightmac_tagx:
        pushad
        lea    esi, [esp+32+4]   ; esi = argv
        lodsd
        xchg   eax, ebx          ; ebx = msg
        lodsd
        cdq                      ; ctr = 0, idx = 0
        xchg   eax, ecx          ; ecx = msglen
        lodsd
        xchg   eax, edi          ; edi = tag
        lodsd
        xchg   eax, ebp          ; ebp = mkey
        pushad                   ; allocate N-bytes for M
        ; zero initialize T
        mov    [edi+0], edx      ; t->w[0] = 0;
        mov    [edi+4], edx      ; t->w[1] = 0;
        ; while we have msg data
    lmx_l0:
        mov    esi, esp          ; esi = M
        jecxz  lmx_l2            ; exit loop if msglen == 0
    lmx_l1:
        ; add byte to M
        mov    al, [ebx]         ; al = *data++
        inc    ebx
        mov    [esi+edx+COUNTER_LENGTH], al
        inc    edx               ; idx++
        ; M filled?
        cmp    dl, BLOCK_LENGTH - COUNTER_LENGTH
        ; --msglen
        loopne lmx_l1
        jne    lmx_l2
        ; add S counter in big endian format
        inc    dword[esp+_edx]   ; ctr++
        mov    eax, [esp+_edx]
        ; reset index
        cdq                      ; idx = 0
        bswap  eax               ; m.ctr = SWAP32(ctr)
        mov    [esi], eax
        ; encrypt M with E using K1
        call   ENCRYPT
        ; update T
        lodsd                    ; t->w[0] ^= m.w[0];
        xor    [edi+0], eax
        lodsd                    ; t->w[1] ^= m.w[1];
        xor    [edi+4], eax
        jmp    lmx_l0            ; keep going
    lmx_l2:
        ; add the end bit
        mov    byte[esi+edx+COUNTER_LENGTH], 0x80
        xchg   esi, edi          ; swap T and M
    lmx_l3:
        ; update T with any msg data remaining
        mov    al, [edi+edx+COUNTER_LENGTH]
        xor    [esi+edx], al
        dec    edx
        jns    lmx_l3
        ; advance key to K2
        add    ebp, BC_KEY_LENGTH
        ; encrypt T with E using K2
        call   ENCRYPT
        popad                    ; release memory for M
        popad                    ; restore registers
        ret
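The control flow of the routine above may be easier to follow in C. The sketch below is a rough equivalent written for this explanation, not the reference implementation: it assumes a little-endian host (so that words are loaded the way LODSD loads them), a full 8-byte tag, and a hypothetical helper `speck64_encrypt_block` that works on byte buffers.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define COUNTER_LENGTH 4
#define BLOCK_LENGTH   8
#define BC_KEY_LENGTH  16
#define MSG_BYTES      (BLOCK_LENGTH - COUNTER_LENGTH)

#define ROTR32(v, n) (((v) >> (n)) | ((v) << (32 - (n))))
#define ROTL32(v, n) (((v) << (n)) | ((v) >> (32 - (n))))

/* SPECK64/128 on an 8-byte buffer; words are read in host byte order. */
static void speck64_encrypt_block(const uint8_t key[BC_KEY_LENGTH],
                                  uint8_t blk[BLOCK_LENGTH])
{
    uint32_t k[4], x[2], t, i;

    memcpy(k, key, sizeof k);
    memcpy(x, blk, sizeof x);
    for (i = 0; i < 27; i++) {
        x[0] = (ROTR32(x[0], 8) + x[1]) ^ k[0];
        x[1] = ROTL32(x[1], 3) ^ x[0];
        k[1] = (ROTR32(k[1], 8) + k[0]) ^ i;
        k[0] = ROTL32(k[0], 3) ^ k[1];
        t = k[1]; k[1] = k[2]; k[2] = k[3]; k[3] = t;
    }
    memcpy(blk, x, sizeof x);
}

/* tag receives BLOCK_LENGTH bytes; mkey holds K1 followed by K2. */
void lightmac_tag(const void *msg, uint32_t msglen, void *tag, void *mkey)
{
    const uint8_t *p = msg, *key = mkey;
    uint8_t *t = tag, m[BLOCK_LENGTH];
    uint32_t ctr = 0, i;

    memset(t, 0, BLOCK_LENGTH);            /* T doubles as the state V */

    /* absorb every full chunk as E_K1(counter || chunk) */
    while (msglen >= MSG_BYTES) {
        ctr++;
        m[0] = (uint8_t)(ctr >> 24);       /* counter stored big endian */
        m[1] = (uint8_t)(ctr >> 16);
        m[2] = (uint8_t)(ctr >> 8);
        m[3] = (uint8_t)ctr;
        memcpy(m + COUNTER_LENGTH, p, MSG_BYTES);
        speck64_encrypt_block(key, m);
        for (i = 0; i < BLOCK_LENGTH; i++)
            t[i] ^= m[i];
        p += MSG_BYTES;
        msglen -= MSG_BYTES;
    }

    /* remaining bytes, then the 0x80 end marker */
    for (i = 0; i < msglen; i++)
        t[i] ^= p[i];
    t[msglen] ^= 0x80;

    /* final encryption under K2 */
    speck64_encrypt_block(key + BC_KEY_LENGTH, t);
}
```

A caller would pass a 32-byte key and an 8-byte output buffer, for example `lightmac_tag(data, len, tag, key)`. For an empty message the tag reduces to the encryption of the block 80 00 00 00 00 00 00 00 under K2, which makes a convenient first test.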
The x86 assembly generated by MSVC using /O2 /Os is 238 bytes. The x86 assembly written by hand is 152 bytes.
In order for developers to benefit from LightMAC on microcontrollers, they should choose a lightweight block cipher but not necessarily SPECK. It’s only used here for illustration.
See lmx32.asm and lightmac.c for any future updates.