Switch containing holes via table lookup

Jasper Neumann jn at sirrida.de
Wed Feb 12 15:40:15 PST 2014


Hello Hans!

 > Sorry for the slow reply, I'm way behind on my email :/

Well, I waited some time and then decided that the llvm-commits forum
might not be the right place to discuss patches such as the two I posted
on the related theme "switch via perfect hashing", where I got
only a few responses. Therefore I posted a similar message to the developer
forum (I assume that I should not cross-post)...


 >> There is code which converts a switch statement to a table lookup
 >> but has problems when there are holes in the cases and the default
 >> case can not be served with the table.

 >> My first attempt to fix this almost but unfortunately not always
 >> works.
 >> The affected file is /lib/Transforms/Utils/SimplifyCFG.cpp .
 >> This is done by additionally testing a small set.
 >>
 >> As an example the function
 >> ==>
 >> unsigned test(unsigned x) {
 >>    switch (x) {
 >>      case 100: return 0;
 >>      case 101: return 1;
 >>      case 103: return 2;
 >>      case 105: return 3;
 >>      case 107: return 4;
 >>      case 109: return 5;
 >>      case 110: return 6;
 >>      default: return x*3;
 >>      }
 >>    }

 > Actually, this case could be handled by the table, because "x*3" is
 > constant in each "hole". This is something I've been meaning to fix :)

Well, yes, but this was not what I wanted to show.
Filling such holes would be welcome nevertheless.
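
To illustrate the hole-filling idea for the test() function quoted above, here
is a hand-written C sketch. It is only an assumption of what such a table could
look like, not output of the existing transformation; the names filled_table
and test_filled are made up for this example. Since x*3 is a known constant for
each hole (102, 104, 106, 108), those values can simply be stored in the table:

```c
/* Hypothetical sketch: table for x in 100..110 with the holes filled
   with the constant that the default case (x*3) yields for them. */
static const unsigned filled_table[11] = {
    0,   /* 100 */
    1,   /* 101 */
    306, /* 102: hole, 102*3 */
    2,   /* 103 */
    312, /* 104: hole, 104*3 */
    3,   /* 105 */
    318, /* 106: hole, 106*3 */
    4,   /* 107 */
    324, /* 108: hole, 108*3 */
    5,   /* 109 */
    6    /* 110 */
};

unsigned test_filled(unsigned x) {
    unsigned idx = x - 100;       /* wraps to a large value when x < 100 */
    if (idx <= 10)                /* single range check */
        return filled_table[idx]; /* cases and holes alike */
    return x * 3;                 /* default outside the table range */
}
```

Only the range check remains as a conditional jump; this works only because the
default value is constant within each hole.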

 > But your patch is of course still relevant when the default case truly
 > doesn't yield a constant. Have you run any benchmarks to see how this
 > compares to the jump table we'd get otherwise?

Yes, I have done this for an IMHO comparable situation. You may look at
our paper, which I put on my site at
http://programming.sirrida.de/hashsuper.pdf; see the diagram on
page 12. You see that the two perfect hashing variants need only about 4
cycles compared to about 33 cycles for a jump table. It seems that a
non-predictable jump costs about 30 cycles. For other processors I got
similar results.


 > One crazy idea for avoiding to build the bitmask ourselves would be to
 > implement the "hole check" with a switch and then run
 > SwitchToLookupTable on it. That will build a bitmask if it fits in a
 > register or use an array otherwise. If this makes the code less
 > complex, it might be worth a try. What do you think?

Hmm. The code which sets up the mask is only a few lines. As long as
such a small mask is sufficient, this approach is hard to beat. For
bigger masks it will probably not be much more work. The x86's BT
instruction can easily access memory. For other processors some masking
and shifting will be required; this penalty should be dealt with.
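
As a rough illustration of the range check plus hole check for the test()
function quoted above, here is a hand-written C sketch. It is not the code the
patch emits; the names case_table, CASE_MASK and test_masked are assumptions
made for this example. Bit i of the mask is set when 100+i is a real case:

```c
/* Hypothetical sketch: results for x in 100..110; hole slots hold a
   dummy 0 that is never read because the mask filters them out. */
static const unsigned case_table[11] = {0, 1, 0, 2, 0, 3, 0, 4, 0, 5, 6};

/* Bits 0, 1, 3, 5, 7, 9, 10 set: 1+2+8+32+128+512+1024 = 0x6AB. */
#define CASE_MASK 0x6ABu

unsigned test_masked(unsigned x) {
    unsigned idx = x - 100;            /* wraps to a large value when x < 100 */
    if (idx <= 10 &&                   /* range check */
        ((CASE_MASK >> idx) & 1u))     /* hole check on the small bit set */
        return case_table[idx];        /* table lookup */
    return x * 3;                      /* default, also used for the holes */
}
```

The shift-and-mask test is the portable form; on x86 a compiler may turn it
into a BT instruction.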

My idea is to additionally extend the applicability of switch via table
lookup by using perfect hashing, so that even very sparse switches can
be handled. The two conditional jumps (range and hole check)
will then be replaced by some scrambling, (usually) one table lookup
plus one conditional jump (value check).
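
The scrambling / lookup / value check sequence could look roughly like the
following C sketch. The case labels (10, 200, 3000, 50000), the hash function
and all names are made up for this example and were chosen by hand so that no
two labels collide; this is not the hash construction from the paper or the
patch:

```c
/* Hypothetical sparse switch:
   case 10: 0; case 200: 1; case 3000: 2; case 50000: 3; default: x*3. */

/* Scrambling step: maps the four labels to distinct slots 2, 4, 3, 5. */
static unsigned scramble(unsigned x) {
    return (x ^ (x >> 4)) & 7u;
}

/* Key stored per slot. Unused slots hold a key that hashes to a
   different slot, so the value check can never succeed for them. */
static const unsigned hash_keys[8] = {1, 0, 10, 3000, 200, 50000, 0, 0};
static const unsigned hash_vals[8] = {0, 0, 0,  2,    1,   3,     0, 0};

unsigned test_sparse(unsigned x) {
    unsigned h = scramble(x);    /* some scrambling */
    if (hash_keys[h] == x)       /* table lookup plus value check */
        return hash_vals[h];
    return x * 3;                /* default case */
}
```

Here a single conditional jump (the value check) takes over the work of both
the range check and the hole check.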


But, as I wrote, my first attempt works almost but unfortunately not
always. When I try to compile LLVM with the modified compiler, it crashes at
lib/AsmParser/LLParser.cpp after successfully converting several
switches from other files to table lookups.
The error is as follows:
clang: /home/jane/wrk/llvm-trunk/include/llvm/IR/Instructions.h:2155: 
llvm::Value* llvm::PHINode::getIncomingValueForBlock(const 
llvm::BasicBlock*) const: Assertion `Idx >= 0 && "Invalid basic block 
argument!"' failed.
[...]
The statistics I have gathered so far tell me that such fixable holes occur
quite often.
I have probably missed inserting a connection between BasicBlocks.
There should be a way to apply the methods described in
http://llvm.cs.uiuc.edu/releases/2.3/docs/tutorial/LangImpl5.html.
Please help me fix this.

Best regards
Jasper


