Published: fre 22 april 2022
By Stefan Israelsson Tampe
In computers .
We did some series of benchmarking between the fast assoc taking advantage of SIMD, the fast hash taking advantage of SIMD, normal assoc and normal guile hash. The keys used was integers and we performd tests for different sizes (N) with a differnt sizes subsample (M).
Here is the table for uniform uniform subsample respresenting a uniform hash distribution e.g. we will be penalized for having large data structures due to low cache reuse.
\begin{array}{l|l|l|l|l|l}
N & M & FASTLIST & ASSOC & FASTHASH & HASH \\
\hline
16 & 4 & 1.74 & 2.37 & 2.13 & 2.78\\
16 & 8 & 1.73 & 2.49 & 2.12 & 3.02\\
16 & 16 & 1.86 & 2.55 & 2.11 & 3.17\\
32 & 4 & 1.78 & 3.12 & 2.10 & 2.76\\
32 & 8 & 1.75 & 3.61 & 2.11 & 3.04\\
32 & 16 & 1.76 & 3.84 & 2.12 & 3.17\\
32 & 32 & 1.90 & 3.99 & 2.12 & 3.14\\
64 & 4 & 1.81 & 5.39 & 2.12 & 3.29\\
64 & 8 & 1.84 & 5.96 & 2.12 & 3.30\\
64 & 16 & 1.83 & 6.25 & 2.12 & 3.16\\
64 & 32 & 1.83 & 6.30 & 2.12 & 3.16\\
64 & 64 & 2.00 & 6.35 & 2.12 & 3.09\\
128 & 4 & 1.93 & 8.61 & 2.12 & 2.76\\
128 & 8 & 1.96 & 9.73 & 2.13 & 3.02\\
128 & 16 & 1.96 & 10.08 & 2.12 & 3.21\\
128 & 32 & 2.02 & 10.39 & 2.12 & 3.29\\
128 & 64 & 2.24 & 10.39 & 2.11 & 3.13\\
128 & 128 & 2.12 & 10.43 & 2.11 & 3.19\\
256 & 4 & 2.09 & 14.80 & 2.12 & 3.25\\
256 & 8 & 2.24 & 16.72 & 2.10 & 3.06\\
256 & 16 & 2.38 & 17.68 & 2.11 & 2.89\\
256 & 32 & 2.35 & 18.10 & 2.10 & 3.15\\
256 & 64 & 2.25 & 18.36 & 2.12 & 3.12\\
256 & 128 & 2.41 & 18.47 & 2.10 & 3.32\\
256 & 256 & 2.32 & 18.53 & 2.10 & 3.29\\
\end{array}
we continue with even higher sizes, but then only for the hash and fast hash.
\begin{array}{l|l|l|l}
N & M & FASTHASH & HASH \\
\hline
2048 & 4 & 2.11 & 2.85\\
2048 & 8 & 2.12 & 3.04\\
2048 & 16 & 2.10 & 3.44\\
2048 & 32 & 2.17 & 3.55\\
2048 & 64 & 2.12 & 3.48\\
2048 & 128 & 2.12 & 3.43\\
2048 & 256 & 2.13 & 3.43\\
2048 & 512 & 2.13 & 3.54\\
2048 & 1024 & 2.14 & 3.45\\
2048 & 2048 & 2.13 & 3.40\\
4096 & 4 & 2.12 & 2.74\\
4096 & 8 & 2.12 & 3.00\\
4096 & 16 & 2.18 & 3.18\\
4096 & 32 & 2.15 & 3.27\\
4096 & 64 & 2.18 & 3.30\\
4096 & 128 & 2.16 & 3.47\\
4096 & 256 & 2.17 & 3.56\\
4096 & 512 & 2.17 & 3.54\\
4096 & 1024 & 2.15 & 3.41\\
4096 & 2048 & 2.15 & 3.41\\
4096 & 4096 & 2.14 & 3.41\\
8192 & 4 & 2.11 & 3.22\\
8192 & 8 & 2.11 & 3.35\\
8192 & 16 & 2.15 & 3.26\\
8192 & 32 & 2.19 & 3.44\\
8192 & 64 & 2.28 & 3.61\\
8192 & 128 & 2.21 & 3.52\\
8192 & 256 & 2.17 & 3.49\\
8192 & 512 & 2.15 & 3.66\\
8192 & 1024 & 2.15 & 3.55\\
8192 & 2048 & 2.17 & 3.54\\
8192 & 4096 & 2.19 & 3.51\\
8192 & 8192 & 2.17 & 3.49\\
16384 & 4 & 2.11 & 2.85\\
16384 & 8 & 2.11 & 3.25\\
16384 & 16 & 2.17 & 3.21\\
16384 & 32 & 2.20 & 3.59\\
16384 & 64 & 2.35 & 3.66\\
16384 & 128 & 2.26 & 3.52\\
16384 & 256 & 2.30 & 3.65\\
16384 & 512 & 2.34 & 3.88\\
16384 & 1024 & 2.23 & 3.75\\
16384 & 2048 & 2.25 & 3.71\\
16384 & 4096 & 2.29 & 3.95\\
16384 & 8192 & 2.28 & 3.97\\
16384 & 16384 & 2.19 & 4.01\\
32768 & 4 & 2.22 & 3.19\\
32768 & 8 & 2.20 & 3.24\\
32768 & 16 & 2.17 & 3.23\\
32768 & 32 & 2.18 & 3.42\\
32768 & 64 & 2.47 & 3.63\\
32768 & 128 & 2.68 & 3.85\\
32768 & 256 & 2.50 & 4.11\\
32768 & 512 & 2.61 & 4.25\\
32768 & 1024 & 2.57 & 4.46\\
32768 & 2048 & 2.45 & 4.34\\
32768 & 4096 & 2.37 & 4.36\\
32768 & 8192 & 2.36 & 4.98\\
32768 & 16384 & 2.27 & 4.55\\
32768 & 32768 & 2.23 & 4.76\\
65536 & 4 & 2.12 & 2.88\\
65536 & 8 & 2.12 & 3.06\\
65536 & 16 & 2.18 & 3.33\\
65536 & 32 & 2.19 & 3.58\\
65536 & 64 & 2.34 & 3.71\\
65536 & 128 & 2.71 & 3.71\\
65536 & 256 & 2.58 & 4.24\\
65536 & 512 & 3.17 & 4.80\\
65536 & 1024 & 2.87 & 5.02\\
65536 & 2048 & 2.71 & 4.82\\
65536 & 4096 & 2.55 & 5.49\\
65536 & 8192 & 2.62 & 6.15\\
65536 & 16384 & 2.70 & 6.33\\
65536 & 32768 & 2.59 & 6.32\\
65536 & 65536 & 2.36 & 6.49\\
131072 & 4 & 2.12 & 2.75\\
131072 & 8 & 2.16 & 3.16\\
131072 & 16 & 2.14 & 2.95\\
131072 & 32 & 2.19 & 3.65\\
131072 & 64 & 2.57 & 3.57\\
131072 & 128 & 2.59 & 3.67\\
131072 & 256 & 2.87 & 4.08\\
131072 & 512 & 3.23 & 4.70\\
131072 & 1024 & 3.66 & 5.56\\
131072 & 2048 & 4.00 & 6.54\\
131072 & 4096 & 4.13 & 6.35\\
131072 & 8192 & 4.61 & 8.62\\
131072 & 16384 & 4.17 & 9.25\\
131072 & 32768 & 4.13 & 11.54\\
131072 & 65536 & 3.28 & 9.65\\
131072 & 131072 & 2.57 & 9.57\\
\end{array}
The next we simulate a hash function (used in python) where the actual working set is optimally located e.g. a dense subset of the hashtable.
First the small sizes,
\begin{array}{l|l|l|l|l|l}
N & M & FASTLIST & ASSOC & FASTHASH & HASH \\
\hline
16 & 4 & 1.57 & 1.62 & 1.26 & 2.61\\
16 & 8 & 1.57 & 1.84 & 1.27 & 3.09\\
16 & 16 & 1.70 & 2.34 & 1.79 & 2.94\\
32 & 4 & 1.57 & 1.61 & 1.27 & 2.56\\
32 & 8 & 1.56 & 2.00 & 1.27 & 2.81\\
32 & 16 & 1.70 & 2.34 & 1.27 & 2.80\\
32 & 32 & 1.77 & 3.76 & 1.78 & 2.88\\
64 & 4 & 1.55 & 1.61 & 1.26 & 2.52\\
64 & 8 & 1.56 & 1.83 & 1.27 & 3.05\\
64 & 16 & 1.69 & 2.32 & 1.26 & 3.03\\
64 & 32 & 1.77 & 3.74 & 1.26 & 2.98\\
64 & 64 & 1.81 & 6.05 & 1.77 & 2.84\\
128 & 4 & 1.56 & 1.60 & 1.26 & 3.02\\
128 & 8 & 1.56 & 1.83 & 1.26 & 2.79\\
128 & 16 & 1.69 & 2.33 & 1.26 & 3.12\\
128 & 32 & 1.76 & 3.73 & 1.27 & 3.14\\
128 & 64 & 1.82 & 6.05 & 1.26 & 3.01\\
128 & 128 & 1.95 & 10.18 & 1.77 & 2.97\\
256 & 4 & 1.56 & 1.60 & 1.26 & 3.00\\
256 & 8 & 1.56 & 1.83 & 1.26 & 3.00\\
256 & 16 & 1.69 & 2.33 & 1.26 & 2.99\\
256 & 32 & 1.74 & 3.74 & 1.27 & 3.03\\
256 & 64 & 1.82 & 6.06 & 1.26 & 3.05\\
256 & 128 & 1.95 & 10.17 & 1.26 & 3.07\\
256 & 256 & 2.17 & 18.27 & 1.77 & 3.02\\
\end{array}
Then for bigger hashtables (no lists)
\begin{array}{l|l|l|l}
N & M & FASTHASH & HASH \\
\hline
2048 & 4 & 1.26 & 2.63\\
2048 & 8 & 1.26 & 2.80\\
2048 & 16 & 1.26 & 2.84\\
2048 & 32 & 1.27 & 2.79\\
2048 & 64 & 1.26 & 2.98\\
2048 & 128 & 1.26 & 3.22\\
2048 & 256 & 1.26 & 3.34\\
2048 & 512 & 1.26 & 3.43\\
2048 & 1024 & 1.26 & 3.36\\
2048 & 2048 & 1.78 & 3.18\\
4096 & 4 & 1.26 & 2.99\\
4096 & 8 & 1.26 & 2.72\\
4096 & 16 & 1.26 & 2.66\\
4096 & 32 & 1.26 & 3.09\\
4096 & 64 & 1.26 & 3.21\\
4096 & 128 & 1.26 & 3.20\\
4096 & 256 & 1.26 & 3.31\\
4096 & 512 & 1.26 & 3.37\\
4096 & 1024 & 1.26 & 3.30\\
4096 & 2048 & 1.26 & 3.34\\
4096 & 4096 & 1.79 & 3.20\\
8192 & 4 & 1.26 & 2.60\\
8192 & 8 & 1.26 & 2.75\\
8192 & 16 & 1.26 & 3.08\\
8192 & 32 & 1.26 & 3.09\\
8192 & 64 & 1.26 & 3.24\\
8192 & 128 & 1.26 & 3.30\\
8192 & 256 & 1.26 & 3.46\\
8192 & 512 & 1.26 & 3.42\\
8192 & 1024 & 1.26 & 3.49\\
8192 & 2048 & 1.26 & 3.47\\
8192 & 4096 & 1.26 & 3.46\\
8192 & 8192 & 1.79 & 3.28\\
16384 & 4 & 1.26 & 3.04\\
16384 & 8 & 1.26 & 3.04\\
16384 & 16 & 1.26 & 3.19\\
16384 & 32 & 1.27 & 3.13\\
16384 & 64 & 1.26 & 3.08\\
16384 & 128 & 1.26 & 3.28\\
16384 & 256 & 1.26 & 3.39\\
16384 & 512 & 1.26 & 3.46\\
16384 & 1024 & 1.26 & 3.43\\
16384 & 2048 & 1.25 & 3.48\\
16384 & 4096 & 1.26 & 3.66\\
16384 & 8192 & 1.26 & 3.73\\
16384 & 16384 & 1.82 & 3.59\\
32768 & 4 & 1.26 & 2.52\\
32768 & 8 & 1.25 & 3.03\\
32768 & 16 & 1.26 & 2.98\\
32768 & 32 & 1.25 & 2.98\\
32768 & 64 & 1.26 & 2.94\\
32768 & 128 & 1.26 & 3.07\\
32768 & 256 & 1.25 & 3.29\\
32768 & 512 & 1.26 & 3.51\\
32768 & 1024 & 1.26 & 3.51\\
32768 & 2048 & 1.26 & 3.57\\
32768 & 4096 & 1.27 & 3.78\\
32768 & 8192 & 1.26 & 4.09\\
32768 & 16384 & 1.26 & 4.25\\
32768 & 32768 & 1.84 & 4.03\\
65536 & 4 & 1.26 & 2.93\\
65536 & 8 & 1.26 & 2.94\\
65536 & 16 & 1.26 & 2.83\\
65536 & 32 & 1.26 & 3.02\\
65536 & 64 & 1.26 & 3.39\\
65536 & 128 & 1.26 & 3.46\\
65536 & 256 & 1.26 & 3.57\\
65536 & 512 & 1.26 & 3.61\\
65536 & 1024 & 1.26 & 3.57\\
65536 & 2048 & 1.26 & 3.68\\
65536 & 4096 & 1.26 & 4.20\\
65536 & 8192 & 1.26 & 4.83\\
65536 & 16384 & 1.28 & 4.91\\
65536 & 32768 & 1.27 & 5.02\\
65536 & 65536 & 1.96 & 4.86\\
131072 & 4 & 1.27 & 2.55\\
131072 & 8 & 1.26 & 2.73\\
131072 & 16 & 1.26 & 2.80\\
131072 & 32 & 1.26 & 3.13\\
131072 & 64 & 1.26 & 3.61\\
131072 & 128 & 1.26 & 3.40\\
131072 & 256 & 1.26 & 3.55\\
131072 & 512 & 1.26 & 3.62\\
131072 & 1024 & 1.26 & 3.56\\
131072 & 2048 & 1.26 & 3.60\\
131072 & 4096 & 1.26 & 4.29\\
131072 & 8192 & 1.27 & 4.65\\
131072 & 16384 & 1.28 & 5.00\\
131072 & 32768 & 1.27 & 5.90\\
131072 & 65536 & 1.30 & 7.27\\
131072 & 131072 & 2.14 & 6.67\\
\end{array}
The final example is showcasing the new hash and assoc's when we use plain old hashq. We do use VM operations for hashq hashing numbers and the core search function.
\begin{array}{l|l|l|l|l|l}
N & M & FASTLIST & ASSOC & FASTHASH & HASH \\
\hline
16 & 4 & 1.94 & 1.60 & 2.14 & 2.87\\
16 & 8 & 1.94 & 1.82 & 2.47 & 3.32\\
16 & 16 & 2.11 & 2.32 & 2.44 & 3.15\\
32 & 4 & 1.93 & 1.60 & 1.95 & 2.78\\
32 & 8 & 1.96 & 1.83 & 2.07 & 3.04\\
32 & 16 & 2.10 & 2.33 & 2.27 & 3.02\\
32 & 32 & 2.12 & 3.79 & 2.29 & 3.15\\
64 & 4 & 1.93 & 1.60 & 1.96 & 2.78\\
64 & 8 & 1.97 & 1.83 & 1.97 & 3.29\\
64 & 16 & 2.10 & 2.32 & 2.15 & 3.28\\
64 & 32 & 2.09 & 3.74 & 2.45 & 3.20\\
64 & 64 & 2.18 & 6.04 & 2.29 & 3.06\\
128 & 4 & 1.92 & 1.61 & 1.96 & 3.27\\
128 & 8 & 1.96 & 1.83 & 1.97 & 3.02\\
128 & 16 & 2.11 & 2.32 & 2.15 & 3.37\\
128 & 32 & 2.12 & 3.73 & 2.19 & 3.41\\
128 & 64 & 2.18 & 6.05 & 2.78 & 3.27\\
128 & 128 & 2.40 & 10.17 & 2.31 & 3.10\\
256 & 4 & 1.93 & 1.60 & 1.95 & 3.26\\
256 & 8 & 1.96 & 1.83 & 1.93 & 3.29\\
256 & 16 & 2.11 & 2.32 & 2.08 & 3.20\\
256 & 32 & 2.09 & 3.73 & 2.09 & 3.31\\
256 & 64 & 2.18 & 6.32 & 2.38 & 3.23\\
256 & 128 & 2.40 & 10.17 & 2.64 & 3.24\\
256 & 256 & 2.97 & 18.26 & 2.35 & 3.35\\
\end{array}
Then for bigger hashtables (no lists)
\begin{array}{l|l|l|l}
N & M & FASTHASH & HASH \\
\hline
2048 & 4 & 1.85 & 2.92\\
2048 & 8 & 1.86 & 3.10\\
2048 & 16 & 1.86 & 3.10\\
2048 & 32 & 1.86 & 3.05\\
2048 & 64 & 1.93 & 3.22\\
2048 & 128 & 2.06 & 3.44\\
2048 & 256 & 2.29 & 3.55\\
2048 & 512 & 2.71 & 3.57\\
2048 & 1024 & 2.90 & 3.58\\
2048 & 2048 & 2.57 & 3.39\\
4096 & 4 & 1.86 & 3.22\\
4096 & 8 & 1.85 & 3.02\\
4096 & 16 & 1.86 & 2.93\\
4096 & 32 & 1.86 & 3.37\\
4096 & 64 & 1.95 & 3.43\\
4096 & 128 & 2.22 & 3.44\\
4096 & 256 & 2.28 & 3.48\\
4096 & 512 & 2.44 & 3.60\\
4096 & 1024 & 2.87 & 3.57\\
4096 & 2048 & 2.85 & 3.55\\
4096 & 4096 & 2.58 & 3.42\\
8192 & 4 & 1.79 & 2.89\\
8192 & 8 & 1.80 & 2.95\\
8192 & 16 & 1.79 & 3.20\\
8192 & 32 & 1.77 & 3.29\\
8192 & 64 & 1.81 & 3.35\\
8192 & 128 & 1.83 & 3.52\\
8192 & 256 & 1.94 & 3.59\\
8192 & 512 & 2.05 & 3.55\\
8192 & 1024 & 2.26 & 3.66\\
8192 & 2048 & 2.67 & 3.63\\
8192 & 4096 & 2.97 & 3.53\\
8192 & 8192 & 2.60 & 3.47\\
16384 & 4 & 1.80 & 3.19\\
16384 & 8 & 1.80 & 3.24\\
16384 & 16 & 1.81 & 3.39\\
16384 & 32 & 1.79 & 3.28\\
16384 & 64 & 1.79 & 3.22\\
16384 & 128 & 1.84 & 3.42\\
16384 & 256 & 1.87 & 3.58\\
16384 & 512 & 1.94 & 3.61\\
16384 & 1024 & 2.06 & 3.52\\
16384 & 2048 & 2.27 & 3.59\\
16384 & 4096 & 2.74 & 3.71\\
16384 & 8192 & 3.14 & 4.00\\
16384 & 16384 & 2.95 & 3.83\\
32768 & 4 & 1.79 & 2.77\\
32768 & 8 & 1.80 & 3.14\\
32768 & 16 & 1.82 & 3.12\\
32768 & 32 & 1.79 & 3.11\\
32768 & 64 & 1.82 & 3.05\\
32768 & 128 & 1.80 & 3.15\\
32768 & 256 & 1.81 & 3.43\\
32768 & 512 & 1.89 & 3.61\\
32768 & 1024 & 1.96 & 3.59\\
32768 & 2048 & 2.08 & 3.62\\
32768 & 4096 & 2.38 & 4.04\\
32768 & 8192 & 2.99 & 4.64\\
32768 & 16384 & 3.58 & 4.48\\
32768 & 32768 & 3.46 & 4.21\\
65536 & 4 & 1.78 & 3.12\\
65536 & 8 & 1.79 & 3.11\\
65536 & 16 & 1.87 & 3.06\\
65536 & 32 & 1.87 & 3.19\\
65536 & 64 & 1.88 & 3.46\\
65536 & 128 & 1.88 & 3.55\\
65536 & 256 & 1.88 & 3.55\\
65536 & 512 & 1.90 & 3.73\\
65536 & 1024 & 1.97 & 3.67\\
65536 & 2048 & 2.04 & 3.79\\
65536 & 4096 & 2.19 & 4.20\\
65536 & 8192 & 2.47 & 4.41\\
65536 & 16384 & 3.03 & 4.64\\
65536 & 32768 & 3.89 & 5.09\\
65536 & 65536 & 4.34 & 4.58\\
131072 & 4 & 1.80 & 2.61\\
131072 & 8 & 1.80 & 3.00\\
131072 & 16 & 1.79 & 2.93\\
131072 & 32 & 1.81 & 3.30\\
131072 & 64 & 1.83 & 3.78\\
131072 & 128 & 1.84 & 3.55\\
131072 & 256 & 1.84 & 3.69\\
131072 & 512 & 1.86 & 3.72\\
131072 & 1024 & 1.88 & 3.72\\
131072 & 2048 & 1.90 & 3.65\\
131072 & 4096 & 2.03 & 4.71\\
131072 & 8192 & 2.62 & 6.23\\
131072 & 16384 & 3.73 & 7.88\\
131072 & 32768 & 5.62 & 9.10\\
131072 & 65536 & 7.82 & 8.61\\
131072 & 131072 & 13.24 & 9.40\\
\end{array}