[llvm-commits] [LLVM, loop-unswitch, bugfix for #11429] Wrong behaviour for switches.

Sun Dec 11 08:23:26 PST 2011

Ping.

-Stepan.

Stepan Dyatkovskiy wrote:
> I fixed code heuristics. How it works now:
>
> 1. Calculate average number of produced instructions and basics blocks:
> number-of-instructions=current-loop->number-of-instructions *
> unswitched-number
> number-of-bb=current-loop->number-of-bb * unswitched-number
>
> 2. If number-of-instructions > Threshold || number-of-bb*5 > Threshold,
> stop unswitching.
>
> By default Threshold is 50. But user can set custom threshold using
> -loop-unswitch-threshold <custom-threshold> option. This option existed
> before my patch, and I kept it without changes though.
>
> I compiled ffmpeg (ffmpeg.org) with llvm + clang toolchain (with and
> without my patch). I got the next results:
>
> Without patch: ffmpeg size is 7369612 bytes.
> Threshold = 50: ffmpeg size is 7349132 bytes (less then in ToT version).
> Threshold = 200: ffmpeg size is 7439244 bytes.
>
> I also checked the ffmpeg build time in all cases it was 2 mins, 35 secs.
> Transcoding time with ffmpeg (mp3 -> mp2) in all cases was also the
> same: process took 1 min, 19 secs.
>
> I added unit test that checks unswitch kicking in case of insufficient
> size. Unit tests and fixed patch (default threshold is 50) are attached
> to this post.
>
> Thanks.
> -Stepan.
>
> Dan Gohman wrote:
>>
>> On Dec 7, 2011, at 5:54 AM, Stepan Dyatkovskiy wrote:
>>
>>> Dan Gohman wrote:
>>>
>>>> This sounds good. How sure are you that the existing size heuristics
>>>> are actually kicking in for the new unrollings? I don't have a reason
>>>> to suspect a bug, other than that you're asking code to work in cases
>>>> that it hasn't before.
>>>
>>> The main reason to suspect a bug is that the pass doesn't work as it
>>> was planned initially. I made this conclusion after reading the code
>>> and its comments. Especially this ones (LoopUnswitch.cpp, string #277):
>>> // FIXME: this should choose the most expensive case!
>>> // FIXME: scan for a case with a non-critical edge?
>>> Programmer placed the stub by now:
>>> Constant *UnswitchVal = SI->getCaseValue(1);
>>>
>>> Probably he missed one thing. Since we selected 1-st case always, all
>>> other cases will never unswitched. But there is no comments about it.
>>> This "feature" looks unplanned. If not - ok, it is not a bug then. I
>>> proposed to insert additional comments though.
>>>
>>> Size heuristics currently analyses the size of loop to be unswitched.
>>> If number of instructions, or number of BBs exceeds some threshold,
>>> then the loop will skipped. We can also control the number cases to
>>> be unswitched.
>>
>> Oh, I don't doubt it may a bug. And there's no master plan behind
>> non-trivial loop unswitching for switches. My point was just that
>> it's been the way it is for a long time, and possibly other things
>> have become accustomed to it, or accidentally reliant on it,
>> working that way.
>>
>>>
>>>> Each unswitched case has the potential to increase performance,
>>>> if conditions are favorable. But if code size is increased, there's
>>>> also the possibility of decreased performance.
>>>
>>> Before unswitchment, loop executes switch N times. The switch is
>>> lowered to several "icmp" checks (let it be "n"). Depending on
>>> lowering algorithm selected the "n" is either linear or logarithmic
>>> function on cases number ( n=k*num_cases or n=k*log(num_cases) ).
>>> Summary we have N*n "icmp" checks.
>>>
>>> After unswitchment all checks will moved out of loop. And we got
>>> 1*n_us "icmp" checks in this case. But the "n_us" is always linear
>>> after unswitchment. So if (1*n_us< N*n) then performance will
>>> increased. Else - it will decreased.
>>
>> On most architectures, those checks will usually be predictable -- the
>> conditions are loop-invariant after all -- and thus fairly cheap,
>> depending
>> on what else is in the loop. But this isn't important here.
>>
>>>
>>>> If you can confirm
>>>> that that's working as expected, it should be fine.
>>>
>>> Did you mean the regression tests? If yes - they are attached in
>>> initial post. These tests checks that all working as expected for
>>> loop with single switch instruction and for loop with two switches.
>>>
>>> In my patch I fixed the stub described above. I also can improve the
>>> size heuristics if you need.
>>
>>
>> I just meant that it would be good to have testcases in which the code
>> size heuristic actually has to kick in and prevent it from unswitching
>> out of control.
>>
>> Also, if you could do a rough check of code size on some general
>> real-world
>> application code, that would be good too.
>>
>> With these precautions, I think the patch is fine.
>>
>> Dan
>>
>
>
>
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits