[LLVMdev] RFC: Using hashing for switch statements
anton at korobeynikov.info
Fri Jan 24 12:41:54 PST 2014
As it was stated before - all these are pretty fine theoretical
arguments. However, without real numbers of benchmarking it's hard to
say anything definite about whether this approach is preferable.
There are plenty of switches in LLVM and clang itself. Can't you
provide some statistics how the stuff was lowered before and after?
On Fri, Jan 24, 2014 at 11:51 PM, Jasper Neumann <jn at sirrida.de> wrote:
> Hi folks,
> here is a short RFC regarding hashing for the implementation of switch
> statements and my preliminary patch.
> I posted this patch on 2014-01-16 21:58 at llvm-commits at cs.uiuc.edu. You can
> find a copy e.g. on
> Best regards
> Preliminary: Special identifiers
> n number of given switch labels
> N size of the (intermediate) hash range, usually n rounded up to the next
> power of 2
> x the value to be switch upon
> Why (perfect) hashing?
> Hashing enables a jump table like approach even for sparse label sets.
> Let's concentrate on perfect hashing in order to keep the code smaller and
> Hashing is faster than a decision tree provided there are enough labels,
> i.e. O(1) vs. O(log n). For x86 the break-even point is about 6..8 labels
> assuming random values of x.
> The memory consumption of the hashing tables is comparable to the memory
> footage of the code of a decision tree.
> A decision tree consumes a lot of branch prediction resources.
> For minimal perfect hashing the jump table does not need to contain dummy
> labels and therefore has n instead of (almost) N elements.
> "Switch-to-lookup tables": If all jump targets are sufficiently similar
> hashing opens the possibility to use table which contain the differences.
> Thereby the indirect jump is eliminated and we get a huge performance gain.
> Why not hashing?
> A decision tree or even an else-if chain is faster than hashing for very few
> The branch prediction logic usually does not work as satisfactorily as for
> decision trees if fed with cyclical access patterns; i.e. the break-even
> point in this scenario is higher.
> Hashing can not deal with large label ranges since very single label is
> treated. A possible compromise/solution might be to first catch all single
> labels and sufficiently short ranges by hashing and then treat the remaining
> ranges in a decision tree.
> Why reversible hashing?
> Reversible hashing does not need a value table and a simple range check of
> the calculated hash value is sufficient.
> Reversible hashing is very simple to implement.
> The usual jump table approach is a special case of reversible hashing.
> Why simple hashing?
> A comparison against a value table is sufficient.
> Simple hashing does not need an extra table.
> Why not simple hashing?
> Simple hashing needs a lot of hash function candidates to be tested until a
> suitable is found; this costs time in the compiler.
> For large label sets the chance to find such a function is very small; the
> applicability is therefore reduced to quite small sets.
> Minimal perfect hashing is usually not achievable.
> Why Jenkins' hashing?
> Jenkins' hashing produces very fast and compact code and needs one extra
> table (BTable) with usually less elements than n; the elements itself are
> unsigned integers which can become as high N-1.
> The code first generates two hash values h1=f1(x) and h2=f2(x) where f1 and
> f2 limit their range by shifting or masking, then the final hash as h = h1 ^
> Minimal perfect hashing is achievable, albeit often at the cost of larger
> For sufficiently high N the table memory can optionally be reduced by
> introducing an extra scrambling table (STable); BTable then consists of
> bytes whereas STable contains 256 words or DWords. Obviously this reduction
> of memory costs another indirection in the resulting code.
> As far as I tested Jenkins' hashing can be used for label sets of several
> thousand elements; often even one million labels can be treated.
> The application in a compiler is to lower switch statements. Switch
> statements with much more than 1000 labels are almost never present in
> Jenkins' hashing usually is sufficiently fast.
> By the way: You will find the following remark in lib/Target/README.txt:
> Investigate lowering of sparse switch statements into perfect hash tables:
> Why not Jenkins' hashing?
> The search time of Jenkins' hashing might become unacceptably long for
> extraordinary large label sets.
> In practice this should never happen, however in this case the label set
> will be split in two parts are the process of lowering switches will be
> An alternative (for very large label sets) could be a different perfect hash
> algorithm such as CHM.
> Why CHM?
> CHM is extraordinarily fast at generating the extra table with a runtime of
> CHM can be used to hash almost arbitrarily many labels.
> CHM produces an ordered minimal perfect hashing function.
> Why not CHM?
> CHM needs extra table space of 2.09 * n elements; this is larger compared to
> Jenkins' method.
> CHM (in the implementation of cmph-2.0) is much more complicated than the
> code of Jenkins' hashing:
> h1 = f1(x) % N;
> h2 = f2(x) % N;
> if (h1 == h2 && ++h2 >= N) h2 = 0;
> h = (Table[h1] + Table[h2]) % n;
> By enlarging N to a power of 2 and other tweaking we might replace the 3
> modulo operations by shifting or masking at the cost of more table space. We
> might also eliminate the conditional statement if in this case we choose
> different hash functions and retry the calculation.
> There is at least one more table access compared to Jenkins' method.
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu
With best regards, Anton Korobeynikov
Faculty of Mathematics and Mechanics, Saint Petersburg State University
More information about the llvm-dev