Lowering switch statements with hashing

Joerg Sonnenberger joerg at britannica.bec.de
Thu Jan 16 14:24:03 PST 2014


On Thu, Jan 16, 2014 at 09:58:41PM +0100, Jasper Neumann wrote:
> To test the hashing library hashlib there is hashtest.cpp.

Having worked on the topic quite a bit for NetBSD, I am not sure your
choice of algorithm is optimal. As I see it, there are two sane
choices:

(1) Just apply a randomised hash function to reduce the expected chain
lengths to O(n/m), where n is the number of case values and m is the
desired number of buckets. This is not perfect hashing, but avoids most
of the search. It has the huge advantage of not needing side storage
besides the jump table.

(2) Use the CHM (Czech, Havas, Majewski) algorithm to produce an
order-preserving hash function. If the backend supports two relocations
on the same address (one positive, one negative) OR the jump table can
be expressed as relative expressions, this has the nice advantage of
requiring only one additional memory access.
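
To make choice (2) concrete, here is a minimal sketch of a CHM-style
construction for 32-bit case values. It is an illustration, not code from
NetBSD or LLVM: the names ChmTable, assign_g, build_chm and find_case are
hypothetical, and the vertex count (a bit over 2n), the retry limit and
the additive term b in the hash are assumptions. Each key becomes an edge
between two hashed vertices; if the resulting graph is acyclic, the g
values can be assigned so that g[h1(key)] + g[h2(key)] (mod n) is the
key's position in the case list.

```cpp
#include <cassert>
#include <cstdint>
#include <random>
#include <utility>
#include <vector>

// Hypothetical container for a CHM-style order-preserving hash.
struct ChmTable {
    uint64_t a1 = 1, b1 = 0, a2 = 1, b2 = 0;  // parameters of the two hash functions
    uint32_t n = 0;                            // number of keys
    std::vector<uint32_t> g;                   // one value per graph vertex

    // Multiply-add, keep the upper half of the 64-bit result, scale to [0, g.size()).
    // The additive term b is an assumption so that key 0 is hashed like any other key.
    uint32_t vertex(uint32_t key, uint64_t a, uint64_t b) const {
        uint32_t h = static_cast<uint32_t>((a * key + b) >> 32);
        return static_cast<uint32_t>((static_cast<uint64_t>(h) * g.size()) >> 32);
    }
    // For a key from the build set, returns its position in that set.
    uint32_t index(uint32_t key) const {
        return (g[vertex(key, a1, b1)] + g[vertex(key, a2, b2)]) % n;
    }
};

// One attempt: returns false if the key graph contains a cycle.
bool assign_g(const std::vector<uint32_t>& keys, ChmTable& t) {
    const uint32_t n = t.n;
    std::vector<std::vector<std::pair<uint32_t, uint32_t>>> adj(t.g.size());
    for (uint32_t i = 0; i < n; ++i) {
        uint32_t u = t.vertex(keys[i], t.a1, t.b1);
        uint32_t w = t.vertex(keys[i], t.a2, t.b2);
        if (u == w) return false;               // self-loop
        adj[u].push_back({w, i});
        adj[w].push_back({u, i});
    }
    std::vector<char> seen(t.g.size(), 0), used(n, 0);
    std::vector<uint32_t> stack;
    for (uint32_t root = 0; root < t.g.size(); ++root) {
        if (seen[root]) continue;
        seen[root] = 1;
        t.g[root] = 0;
        stack.push_back(root);
        while (!stack.empty()) {
            uint32_t x = stack.back();
            stack.pop_back();
            for (const auto& e : adj[x]) {
                uint32_t y = e.first, i = e.second;
                if (used[i]) continue;          // the edge we arrived by
                used[i] = 1;
                if (seen[y]) return false;      // cycle (or duplicate edge)
                seen[y] = 1;
                t.g[y] = (i + n - t.g[x]) % n;  // makes g[x] + g[y] == i (mod n)
                stack.push_back(y);
            }
        }
    }
    return true;
}

// Retry with fresh random parameters until the graph is acyclic.
// Keys must be distinct; gives up after a fixed number of attempts.
bool build_chm(const std::vector<uint32_t>& keys, ChmTable& t, uint64_t seed = 1) {
    assert(!keys.empty());
    std::mt19937_64 rng(seed);
    t.n = static_cast<uint32_t>(keys.size());
    t.g.assign(2 * keys.size() + keys.size() / 8 + 2, 0);  // a bit over 2n vertices
    for (int attempt = 0; attempt < 1000; ++attempt) {
        t.a1 = rng() | 1; t.b1 = rng();
        t.a2 = rng() | 1; t.b2 = rng();
        if (assign_g(keys, t)) return true;
    }
    return false;
}

// Switch-style lookup: case index for a matching key, -1 for the default case.
int find_case(const std::vector<uint32_t>& keys, const ChmTable& t, uint32_t key) {
    uint32_t i = t.index(key);
    return keys[i] == key ? static_cast<int>(i) : -1;
}
```

Because the hash is order preserving, the returned index can address the
jump table directly. The comparison against keys[i] is still needed,
since a value that is not a case label hashes to some arbitrary slot.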

As CHM is probabilistic linear time, the only practical problem is the
choice of hash functions. For integers, the upper half of a 32-bit/64-bit
multiplication is a good universal hash function.
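
As an illustration, one possible reading of "upper half of a 64-bit
multiplication" for 32-bit keys is sketched below. The helper names
mul_hi and bucket are hypothetical, and the choice of a random odd
multiplier and the second multiply used to scale into m buckets are
assumptions, not something stated above.

```cpp
#include <cassert>
#include <cstdint>

// Upper 32 bits of the (wrapping) 64-bit product of key and a.
// `a` is meant to be a randomly chosen odd multiplier (assumption).
// Note that a purely multiplicative hash always maps key 0 to 0.
inline uint32_t mul_hi(uint32_t key, uint64_t a) {
    return static_cast<uint32_t>((a * key) >> 32);
}

// Scale the 32-bit hash to a bucket in [0, m) with a second multiply
// instead of a division.
inline uint32_t bucket(uint32_t key, uint64_t a, uint32_t m) {
    assert(m != 0);
    return static_cast<uint32_t>((static_cast<uint64_t>(mul_hi(key, a)) * m) >> 32);
}
```

For choice (1), bucket(value, a, m) would select the chain to search; for
CHM, two such functions with independent multipliers would be needed.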

I would strongly advocate *against* using Jenkins' construction
here.

Joerg



More information about the llvm-commits mailing list