Lowering switch statements with hashing
Jasper Neumann
jn at sirrida.de
Thu Jan 16 14:07:23 PST 2014
Hello Anton, hello all!
> Will you please provide RFC outlining the algorithm itself and
> possible some benchmarks as the .txt definitely does not contain
> enough details...
Well, we (a friend of mine and I) wrote a paper, mentioned in
hash_llvm.txt and downloadable at
http://programming.sirrida.de/hashsuper.pdf, which describes the simple*
variants. These should produce code like this:
imull magic, %edi, %ecx
shrl $27, %ecx
cmpl ValTable(,%rcx,4), %edi
jne default
jmpq JumpTable(,%rcx,8)
The first two lines in this example are the hash function; a value
comparison and an indirect jump follow.
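To make the sequence above concrete, here is a minimal C sketch of the same idea: multiply by a magic constant, keep the top 5 bits, compare against a value table, then dispatch. All names (HashSwitch, build_hash_switch, dispatch) are hypothetical, and the brute-force search for the magic constant is only an assumption for illustration; it is not necessarily how the patch or the paper chooses the multiplier.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define HASH_BITS 5                       /* 32 slots, matching "shrl $27" */
#define NUM_SLOTS (1u << HASH_BITS)
#define DEFAULT_CASE (-1)

/* Hypothetical container for the generated tables. */
typedef struct {
    uint32_t magic;                       /* multiplier found by the search */
    uint32_t val_table[NUM_SLOTS];        /* label stored in each slot */
    int      target[NUM_SLOTS];           /* case index, or DEFAULT_CASE */
} HashSwitch;

/* Corresponds to: imull magic, x ; shrl $27 */
static uint32_t hash_slot(uint32_t x, uint32_t magic) {
    return (uint32_t)(x * magic) >> (32 - HASH_BITS);
}

/* Assumed search strategy: try pseudo-random odd multipliers until all
 * labels land in distinct slots. Returns 1 on success, 0 on failure. */
static int build_hash_switch(HashSwitch *hs, const uint32_t *labels, size_t n) {
    assert(hs != NULL && labels != NULL);
    if (n > NUM_SLOTS) return 0;
    uint32_t magic = 1;
    for (int attempt = 0; attempt < 100000; ++attempt) {
        magic = (magic * 1664525u + 1013904223u) | 1u;
        for (uint32_t s = 0; s < NUM_SLOTS; ++s) {
            hs->val_table[s] = 0;
            hs->target[s] = DEFAULT_CASE;
        }
        size_t i;
        for (i = 0; i < n; ++i) {
            uint32_t s = hash_slot(labels[i], magic);
            if (hs->target[s] != DEFAULT_CASE) break;   /* collision */
            hs->val_table[s] = labels[i];
            hs->target[s] = (int)i;
        }
        if (i == n) {
            hs->magic = magic;
            return 1;
        }
    }
    return 0;
}

/* Returns the index of the matching label, or DEFAULT_CASE. */
static int dispatch(const HashSwitch *hs, uint32_t x) {
    uint32_t s = hash_slot(x, hs->magic);
    if (hs->val_table[s] != x) return DEFAULT_CASE;  /* cmpl ; jne default */
    return hs->target[s];                            /* jmpq JumpTable(...) */
}
```

Empty slots map to DEFAULT_CASE in the target table, so whatever filler value sits in val_table for such a slot cannot cause a wrong dispatch.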
The Jenkins methods, which will usually be used for bigger label sets
(about 24 or more labels), produce two hash values a and b, and the final
hash function is evaluated as h(a,b) = a ^ BTable[b]. For very large
BTables an additional scramble table is applied to save some space; the
threshold can be adapted.
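As a rough illustration of h(a,b) = a ^ BTable[b], the following C sketch builds a small BTable greedily and then looks values up. Everything beyond the formula itself is an assumption: the two multiplicative mixers for a and b, the greedy per-bucket search, the table sizes, and all names (PerfectHash, build_perfect_hash, lookup) are hypothetical and are not taken from the patch or from Jenkins' code. The scramble table for very large BTables is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define A_BITS 5                          /* a in [0, 32): also the slot count */
#define B_BITS 3                          /* b in [0, 8): index into BTable */
#define SLOTS  (1u << A_BITS)
#define B_SIZE (1u << B_BITS)
#define DEFAULT_CASE (-1)

typedef struct {
    uint32_t seed;                        /* odd multiplier chosen by the search */
    uint8_t  btable[B_SIZE];              /* BTable: one XOR mask per b value */
    uint32_t key_at[SLOTS];               /* label stored in each final slot */
    int      target[SLOTS];               /* case index, or DEFAULT_CASE */
} PerfectHash;

/* Assumed mixers: two different multiplicative hashes of the same key. */
static uint32_t hash_a(uint32_t x, uint32_t seed) {
    return (uint32_t)(x * seed) >> (32 - A_BITS);
}
static uint32_t hash_b(uint32_t x, uint32_t seed) {
    return (uint32_t)(x * (seed * 0x9E3779B1u)) >> (32 - B_BITS);
}

/* Find a mask t so that every key of this bucket lands in a free slot. */
static int place_bucket(PerfectHash *ph, const uint32_t *keys, size_t n,
                        uint32_t bucket, uint8_t *used) {
    for (uint32_t t = 0; t < SLOTS; ++t) {
        int ok = 1;
        for (size_t i = 0; i < n && ok; ++i)
            if (hash_b(keys[i], ph->seed) == bucket &&
                used[hash_a(keys[i], ph->seed) ^ t])
                ok = 0;
        if (!ok) continue;
        for (size_t i = 0; i < n; ++i) {
            if (hash_b(keys[i], ph->seed) != bucket) continue;
            uint32_t slot = hash_a(keys[i], ph->seed) ^ t;
            if (used[slot]) return 0;     /* two keys share the same (a, b) */
            used[slot] = 1;
            ph->key_at[slot] = keys[i];
            ph->target[slot] = (int)i;
        }
        ph->btable[bucket] = (uint8_t)t;
        return 1;
    }
    return 0;
}

/* Retry with new seeds until every bucket can be placed. */
static int build_perfect_hash(PerfectHash *ph, const uint32_t *keys, size_t n) {
    assert(ph != NULL && keys != NULL);
    if (n > SLOTS) return 0;
    uint32_t seed = 1;
    for (int attempt = 0; attempt < 100000; ++attempt) {
        seed = (seed * 1664525u + 1013904223u) | 1u;
        uint8_t used[SLOTS] = {0};
        memset(ph, 0, sizeof *ph);
        for (uint32_t s = 0; s < SLOTS; ++s) ph->target[s] = DEFAULT_CASE;
        ph->seed = seed;
        uint32_t b = 0;
        while (b < B_SIZE && place_bucket(ph, keys, n, b, used)) ++b;
        if (b == B_SIZE) return 1;
    }
    return 0;
}

/* h(a,b) = a ^ BTable[b], followed by the usual value comparison. */
static int lookup(const PerfectHash *ph, uint32_t x) {
    uint32_t h = hash_a(x, ph->seed) ^ ph->btable[hash_b(x, ph->seed)];
    if (ph->key_at[h] != x) return DEFAULT_CASE;
    return ph->target[h];
}
```

Note that the lookup reads BTable first and then the value table at an index derived from it, which is the pair of dependent loads raised in the quoted concern below.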
> I'm a bit concerned about perfect hashing since it usually (classic
> implementations by Jenkins or MPH) involves two loads with the second
> load from the location computed by a first load and thus this in many
> cases yields two cache misses in a row.
You will find some artificial benchmarks in the mentioned paper.
For periodic patterns the decision tree might give better times than a
jump table approach; however, I don't know how to simulate this.
A decision tree also produces a lot of cache misses because of the large
code involved.
At least a jump table uses far fewer branch prediction table entries,
which can then be used for other branches.
Real-world tests can be done now since a real implementation is available
with my patch. Also, I have provided some parameters which influence the
code generation, such as the selection of the hash algorithms used; if
needed, I can easily provide others as well.
Best regards
Jasper