Parameters:


Our current parameters are as follows (note that they may change):


We select and implement four parameter sets. For NIST security level 1, we select two parameter sets, MAYO_one and MAYO_two: MAYO_one has smaller public keys but larger signatures, whereas MAYO_two has smaller signatures but larger public keys. For NIST security levels 3 and 5, we select one parameter set each, which we refer to as MAYO_three and MAYO_five, respectively. The parameter sets and the corresponding key and signature sizes are displayed below. All sizes are reported in bytes (B).


Parameter set     MAYO_one   MAYO_two   MAYO_three   MAYO_five
security level    1          1          3            5
secret key size   24 B       24 B       32 B         40 B
public key size   1420 B     4912 B     2986 B       5554 B
signature size    454 B      186 B      681 B        964 B
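
The trade-off between the two level 1 parameter sets can be made concrete by adding up the bytes sent when a public key accompanies a signature. The sketch below uses only the sizes from the table above; the dictionary layout and the helper name `pk_plus_sig` are illustrative assumptions and not part of the library.

```python
# Sizes in bytes, copied from the parameter table above.
# The dictionary layout and function name are illustrative assumptions.
SIZES = {
    "MAYO_one":   {"level": 1, "sk": 24, "pk": 1420, "sig": 454},
    "MAYO_two":   {"level": 1, "sk": 24, "pk": 4912, "sig": 186},
    "MAYO_three": {"level": 3, "sk": 32, "pk": 2986, "sig": 681},
    "MAYO_five":  {"level": 5, "sk": 40, "pk": 5554, "sig": 964},
}


def pk_plus_sig(name):
    """Bytes needed to send one public key together with one signature."""
    entry = SIZES[name]
    return entry["pk"] + entry["sig"]


if __name__ == "__main__":
    for name in SIZES:
        print(f"{name}: pk + sig = {pk_plus_sig(name)} B")
```

At level 1, MAYO_one needs 1874 B for a key plus a signature against 5098 B for MAYO_two, whereas MAYO_two sends only 186 B when the verifier already holds the public key.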

Cycle counts for our AVX2 optimized implementation:


On the 2.0 GHz Ice Lake platform, the fastest results with MAYO_one are 55 μs for KeyGen, 246 μs for signing (including ExpandSK), and 77 μs for verification (including ExpandPK). Batch signing (without ExpandSK) is fastest with MAYO_two at 126 μs. Batch verification (without ExpandPK) is fastest with MAYO_two at 30 μs.


All builds use -O3 compiler optimization level and -march=native build architecture. Turbo Boost was deactivated to achieve consistent timings. We report the CPU cycles using AES-NI. More results can be found in our specification.


On Intel Xeon Gold 6338 CPU (Ice Lake) at 2.0 GHz for the AVX2 optimized implementation:


Scheme       KeyGen      ExpandSK + Sign   ExpandPK + Verify
MAYO_one     118,704     471,028           153,266
MAYO_two     96,288      286,028           56,374
MAYO_three   282,446     1,017,216         347,972
MAYO_five    766,682     2,387,350         853,920

The library was compiled on Ubuntu with clang version 18.1.8. Results are the median of 1000 benchmark runs.
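
Cycle counts can be turned into wall-clock times by dividing by the clock frequency. The sketch below assumes the core runs at its nominal 2.0 GHz (plausible with Turbo Boost disabled, but an assumption); the function name `cycles_to_microseconds` is hypothetical.

```python
def cycles_to_microseconds(cycles, clock_ghz):
    """Convert a cycle count to microseconds at a fixed clock frequency.

    Assumes the core runs at exactly clock_ghz for the whole measurement.
    """
    # 1 GHz corresponds to 1000 cycles per microsecond.
    return cycles / (clock_ghz * 1000.0)


if __name__ == "__main__":
    # ExpandPK + Verify for MAYO_one on the 2.0 GHz Ice Lake platform.
    print(round(cycles_to_microseconds(153_266, 2.0), 1))
```

For the MAYO_one ExpandPK + Verify entry this gives about 76.6 μs, in line with the 77 μs quoted above.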



On Intel Xeon E3-1225 v3 CPU (Haswell) at 3.20 GHz for the AVX2 optimized implementation:


Scheme       KeyGen      ExpandSK + Sign   ExpandPK + Verify
MAYO_one     246,458     702,261           290,041
MAYO_two     153,420     375,493           96,606
MAYO_three   574,472     1,476,585         664,631
MAYO_five    1,338,275   3,475,547         1,488,808

The library was compiled on Ubuntu with clang version 18.1.3. Results are the median of 1000 benchmark runs.


On Intel Xeon E3-1260L v5 CPU (Skylake) at 2.90 GHz for the AVX2 optimized implementation:


Scheme       KeyGen      ExpandSK + Sign   ExpandPK + Verify
MAYO_one     202,026     574,530           247,169
MAYO_two     157,929     327,303           96,266
MAYO_three   473,424     1,260,260         589,375
MAYO_five    1,197,019   3,037,778         1,336,685

The library was compiled on Ubuntu with clang version 14.0.0-1ubuntu1 20.04.5. Results are the median of 1000 benchmark runs.
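
Reporting the median over many runs makes a benchmark robust against occasional outliers such as interrupts or cache misses. The sketch below illustrates the idea only: it measures wall-clock nanoseconds rather than CPU cycles, it is not the benchmark harness used for the tables above, and the name `median_runtime_ns` and the placeholder workload are hypothetical.

```python
import statistics
import time


def median_runtime_ns(func, runs=1000):
    """Return the median wall-clock time of func() over `runs` calls, in ns."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter_ns()
        func()
        samples.append(time.perf_counter_ns() - start)
    return statistics.median(samples)


if __name__ == "__main__":
    # Placeholder workload; a real benchmark would call KeyGen, Sign or Verify.
    print(median_runtime_ns(lambda: sum(range(1000))))
```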



Cycle counts for our optimized implementation:


All builds use -O3 compiler optimization level and -march=native build architecture. Turbo Boost was deactivated to achieve consistent timings. We report the CPU cycles using AES-NI. More results can be found in our specification.


On Intel Xeon Gold 6338 CPU (Ice Lake) at 2.0 GHz for the optimized implementation:


Scheme       KeyGen      ExpandSK + Sign   ExpandPK + Verify
MAYO_one     878,654     2,724,374         353,682
MAYO_two     775,382     1,541,636         104,290
MAYO_three   3,485,616   8,009,012         877,112
MAYO_five    7,063,608   20,790,296        1,775,574

The library was compiled on Ubuntu with clang version 18.1.8. Results are the median of 1000 benchmark runs.
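
To compare this build with the AVX2 build on the same Ice Lake machine, one can divide the cycle counts operation by operation. The sketch below copies both Ice Lake tables; the dictionary names and the helper `speedups` are illustrative assumptions.

```python
# Cycle counts (KeyGen, ExpandSK + Sign, ExpandPK + Verify) on Ice Lake,
# copied from the AVX2 and optimized tables in this document.
AVX2 = {
    "MAYO_one":   (118_704, 471_028, 153_266),
    "MAYO_two":   (96_288, 286_028, 56_374),
    "MAYO_three": (282_446, 1_017_216, 347_972),
    "MAYO_five":  (766_682, 2_387_350, 853_920),
}
OPT = {
    "MAYO_one":   (878_654, 2_724_374, 353_682),
    "MAYO_two":   (775_382, 1_541_636, 104_290),
    "MAYO_three": (3_485_616, 8_009_012, 877_112),
    "MAYO_five":  (7_063_608, 20_790_296, 1_775_574),
}


def speedups(name):
    """Ratio of optimized to AVX2 cycles for each operation of one scheme."""
    return tuple(opt / avx2 for opt, avx2 in zip(OPT[name], AVX2[name]))


if __name__ == "__main__":
    for name in AVX2:
        keygen, sign, verify = speedups(name)
        print(f"{name}: KeyGen {keygen:.1f}x, Sign {sign:.1f}x, Verify {verify:.1f}x")
```

For MAYO_one the AVX2 build is roughly 7.4 times faster for KeyGen, 5.8 times faster for signing, and 2.3 times faster for verification.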



On Intel Xeon E3-1225 v3 CPU (Haswell) at 3.20 GHz for the optimized implementation:


Scheme       KeyGen      ExpandSK + Sign   ExpandPK + Verify
MAYO_one     1,052,858   3,171,356         530,101
MAYO_two     1,275,893   2,422,209         148,387
MAYO_three   3,230,269   10,725,576        1,313,842
MAYO_five    8,041,769   21,918,131        2,400,637

The library was compiled on Ubuntu with clang version 18.1.3. Results are the median of 1000 benchmark runs.


On Intel Xeon E3-1260L v5 CPU (Skylake) at 2.90 GHz for the optimized implementation:


Scheme       KeyGen      ExpandSK + Sign   ExpandPK + Verify
MAYO_one     899,214     2,797,221         441,935
MAYO_two     1,103,437   1,868,567         142,224
MAYO_three   2,998,299   8,615,942         1,119,046
MAYO_five    8,372,676   23,777,500        2,106,202

The library was compiled on Ubuntu with clang version 14.0.0-1ubuntu1 20.04.5. Results are the median of 1000 benchmark runs.


Arm Cortex-M4 implementation:


We are working on an Arm Cortex-M4 implementation. Results are shown below.

We use the ST NUCLEO-L4R5ZI development board, which comes with an STM32L4R5ZI Cortex-M4 CPU with 2 MB of flash memory and 640 KB of SRAM.


All builds use -O3 compiler optimization level using the Arm GNU toolchain.


Scheme       KeyGen       ExpandSK + Sign   ExpandPK + Verify
MAYO_one     9,572,510    16,911,188        11,242,036
MAYO_two     9,574,993    11,003,859        6,027,831
MAYO_three   27,170,014   45,428,633        28,446,049

Arm Neon implementation:


The NEON implementation is built with the CMake option -DMAYO_BUILD_TYPE=neon. AES acceleration is used by default, if available. We have tested the implementation on Apple M1, M2, and M3 processors. We report the M3 results here.

Scheme       KeyGen      ExpandSK + Sign   ExpandPK + Verify
MAYO_one     131,133     429,283           169,516
MAYO_two     125,437     279,380           72,482
MAYO_three   386,820     1,039,799         454,799
MAYO_five    895,085     2,217,257         948,399