djbsort

On an Intel Haswell CPU core, djbsort sorts an array of 1024 signed 32-bit integers in L1 cache in 7328 cycles. The time is independent of the contents of the array. Previous software (see the separate list of references) takes much more time, even without the constant-time requirement.

djbsort's int32-speed tool outputs cycle counts for sorting arrays of various sizes. Each array size is sorted 127 times so that median cycle counts filter out occasional interrupts. The current output format is as follows:

The output format is subject to change.

Output of int32-speed on titan0, a 3.5GHz Intel Xeon E3-1275 v3 (Haswell), running clang 3.8.0-2ubuntu4 as supplied with Ubuntu 16.04.4:

int32 implementation int32/avx2
int32 version 20180729
int32 compiler clang -fPIC -Wall -fomit-frame-pointer -fwrapv -Qunused-arguments -mavx2 -O3
overhead 0 24 24 28
int32 1 28 28 32 7.000000 7.000000 8.000000
int32 2 32 32 32 4.000000 4.000000 4.000000
int32 4 40 40 40 2.500000 2.500000 2.500000
int32 8 108 124 140 3.375000 3.875000 4.375000
int32 16 92 100 112 1.437500 1.562500 1.750000
int32 32 240 252 260 1.875000 1.968750 2.031250
int32 64 300 304 404 1.171875 1.187500 1.578125
int32 128 672 688 1180 1.312500 1.343750 2.304688
int32 256 1468 1552 1576 1.433594 1.515625 1.539062
int32 512 3388 3424 3428 1.654297 1.671875 1.673828
int32 1024 7304 7328 7340 1.783203 1.789062 1.791992
int32 2048 17720 17776 17804 2.163086 2.169922 2.173340
int32 4096 39392 39552 39600 2.404297 2.414062 2.416992
int32 8192 88988 89140 89356 2.715698 2.720337 2.726929
int32 16384 298248 298748 299088 4.550903 4.558533 4.563721
int32 32768 667068 667448 667928 5.089325 5.092224 5.095886
int32 65536 1693692 1695844 1698280 6.460922 6.469131 6.478424
int32 131072 4599276 4602416 4606500 8.772423 8.778412 8.786201
int32 262144 10212520 10217052 10220260 9.739418 9.743740 9.746799
int32 524288 23274752 23281004 23287680 11.098267 11.101248 11.104431
int32 1048576 50765496 50777572 50784376 12.103437 12.106317 12.107939
int32 3 48 56 64 4.000000 4.666667 5.333333
int32 6 76 96 104 3.166667 4.000000 4.333333
int32 12 132 132 136 2.750000 2.750000 2.833333
int32 24 248 264 308 2.583333 2.750000 3.208333
int32 48 356 364 364 1.854167 1.895833 1.895833
int32 96 728 732 732 1.895833 1.906250 1.906250
int32 192 1628 1652 1664 2.119792 2.151042 2.166667
int32 384 2924 2960 2984 1.903646 1.927083 1.942708
int32 768 6512 6512 6532 2.119792 2.119792 2.126302
int32 1536 13712 13748 13792 2.231771 2.237630 2.244792
int32 3072 30376 30392 30424 2.472005 2.473307 2.475911
int32 6144 68368 68424 68472 2.781901 2.784180 2.786133
int32 12288 157336 157536 157864 3.201009 3.205078 3.211751
int32 24576 451056 451436 451916 4.588379 4.592244 4.597127
int32 49152 1115144 1115888 1116588 5.671916 5.675700 5.679260
int32 98304 2705992 2707984 2710312 6.881694 6.886759 6.892680
int32 196608 7058496 7062536 7065668 8.975342 8.980479 8.984461
int32 393216 16489248 16492960 16496652 10.483582 10.485942 10.488289
int32 786432 36862084 36867432 36875624 11.718141 11.719841 11.722445

Version: This is version 2018.07.29 of the "Speed" web page.