[llvm-commits] [LLVM, loop-unswitch, bugfix for #11429] Wrong behaviour for switches.

Thu Dec 8 11:57:31 PST 2011

On Dec 7, 2011, at 5:54 AM, Stepan Dyatkovskiy wrote:

> Dan Gohman wrote:
> 
>> This sounds good. How sure are you that the existing size heuristics
>> are actually kicking in for the new unrollings? I don't have a reason
>> to suspect a bug, other than that you're asking code to work in cases
>> that it hasn't before.
> 
> The main reason to suspect a bug is that the pass doesn't work as it was planned initially. I made this conclusion after reading the code and its comments. Especially this ones (LoopUnswitch.cpp, string #277):
>        // FIXME: this should choose the most expensive case!
>        // FIXME: scan for a case with a non-critical edge?
> Programmer placed the stub by now:
> 	Constant *UnswitchVal = SI->getCaseValue(1);
> 
> Probably he missed one thing. Since we selected 1-st case always, all other cases will never unswitched. But there is no comments about it. This "feature" looks unplanned. If not - ok, it is not a bug then. I proposed to insert additional comments though.
> 
> Size heuristics currently analyses the size of loop to be unswitched. If number of instructions, or number of BBs exceeds some threshold, then the loop will skipped. We can also control the number cases to be unswitched.

Oh, I don't doubt it may a bug. And there's no master plan behind
non-trivial loop unswitching for switches. My point was just that
it's been the way it is for a long time, and possibly other things
have become accustomed to it, or accidentally reliant on it,
working that way.

> 
>> Each unswitched case has the potential to increase performance,
>> if conditions are favorable. But if code size is increased, there's
>> also the possibility of decreased performance.
> 
> Before unswitchment, loop executes switch N times. The switch is lowered to several "icmp" checks (let it be "n"). Depending on lowering algorithm selected the "n" is either linear or logarithmic function on cases number ( n=k*num_cases or n=k*log(num_cases) ). Summary we have N*n "icmp" checks.
> 
> After unswitchment all checks will moved out of loop. And we got 1*n_us "icmp" checks in this case. But the "n_us" is always linear after unswitchment. So if (1*n_us < N*n) then performance will increased. Else - it will decreased.

On most architectures, those checks will usually be predictable -- the
conditions are loop-invariant after all -- and thus fairly cheap, depending
on what else is in the loop. But this isn't important here.

> 
> > If you can confirm
>> that that's working as expected, it should be fine.
> 
> Did you mean the regression tests? If yes - they are attached in initial post. These tests checks that all working as expected for loop with single switch instruction and for loop with two switches.
> 
> In my patch I fixed the stub described above. I also can improve the size heuristics if you need.

I just meant that it would be good to have testcases in which the code
size heuristic actually has to kick in and prevent it from unswitching
out of control.

Also, if you could do a rough check of code size on some general real-world
application code, that would be good too.

With these precautions, I think the patch is fine.

Dan