[llvm-commits] [LLVM, loop-unswitch, bugfix for #11429] Wrong behaviour for switches.

Mon Dec 12 05:13:55 PST 2011

Ping.

-Stepan.

Stepan Dyatkovskiy wrote:
> Ping.
>
> -Stepan.
>
> Stepan Dyatkovskiy wrote:
>> I fixed code heuristics. How it works now:
>>
>> 1. Calculate average number of produced instructions and basics blocks:
>> number-of-instructions=current-loop->number-of-instructions *
>> unswitched-number
>> number-of-bb=current-loop->number-of-bb * unswitched-number
>>
>> 2. If number-of-instructions>  Threshold || number-of-bb*5>  Threshold,
>> stop unswitching.
>>
>> By default Threshold is 50. But user can set custom threshold using
>> -loop-unswitch-threshold<custom-threshold>  option. This option existed
>> before my patch, and I kept it without changes though.
>>
>> I compiled ffmpeg (ffmpeg.org) with llvm + clang toolchain (with and
>> without my patch). I got the next results:
>>
>> Without patch: ffmpeg size is 7369612 bytes.
>> Threshold = 50: ffmpeg size is 7349132 bytes (less then in ToT version).
>> Threshold = 200: ffmpeg size is 7439244 bytes.
>>
>> I also checked the ffmpeg build time in all cases it was 2 mins, 35 secs.
>> Transcoding time with ffmpeg (mp3 ->  mp2) in all cases was also the
>> same: process took 1 min, 19 secs.
>>
>> I added unit test that checks unswitch kicking in case of insufficient
>> size. Unit tests and fixed patch (default threshold is 50) are attached
>> to this post.
>>
>> Thanks.
>> -Stepan.
>>
>> Dan Gohman wrote:
>>>
>>> On Dec 7, 2011, at 5:54 AM, Stepan Dyatkovskiy wrote:
>>>
>>>> Dan Gohman wrote:
>>>>
>>>>> This sounds good. How sure are you that the existing size heuristics
>>>>> are actually kicking in for the new unrollings? I don't have a reason
>>>>> to suspect a bug, other than that you're asking code to work in cases
>>>>> that it hasn't before.
>>>>
>>>> The main reason to suspect a bug is that the pass doesn't work as it
>>>> was planned initially. I made this conclusion after reading the code
>>>> and its comments. Especially this ones (LoopUnswitch.cpp, string #277):
>>>> // FIXME: this should choose the most expensive case!
>>>> // FIXME: scan for a case with a non-critical edge?
>>>> Programmer placed the stub by now:
>>>> Constant *UnswitchVal = SI->getCaseValue(1);
>>>>
>>>> Probably he missed one thing. Since we selected 1-st case always, all
>>>> other cases will never unswitched. But there is no comments about it.
>>>> This "feature" looks unplanned. If not - ok, it is not a bug then. I
>>>> proposed to insert additional comments though.
>>>>
>>>> Size heuristics currently analyses the size of loop to be unswitched.
>>>> If number of instructions, or number of BBs exceeds some threshold,
>>>> then the loop will skipped. We can also control the number cases to
>>>> be unswitched.
>>>
>>> Oh, I don't doubt it may a bug. And there's no master plan behind
>>> non-trivial loop unswitching for switches. My point was just that
>>> it's been the way it is for a long time, and possibly other things
>>> have become accustomed to it, or accidentally reliant on it,
>>> working that way.
>>>
>>>>
>>>>> Each unswitched case has the potential to increase performance,
>>>>> if conditions are favorable. But if code size is increased, there's
>>>>> also the possibility of decreased performance.
>>>>
>>>> Before unswitchment, loop executes switch N times. The switch is
>>>> lowered to several "icmp" checks (let it be "n"). Depending on
>>>> lowering algorithm selected the "n" is either linear or logarithmic
>>>> function on cases number ( n=k*num_cases or n=k*log(num_cases) ).
>>>> Summary we have N*n "icmp" checks.
>>>>
>>>> After unswitchment all checks will moved out of loop. And we got
>>>> 1*n_us "icmp" checks in this case. But the "n_us" is always linear
>>>> after unswitchment. So if (1*n_us<  N*n) then performance will
>>>> increased. Else - it will decreased.
>>>
>>> On most architectures, those checks will usually be predictable -- the
>>> conditions are loop-invariant after all -- and thus fairly cheap,
>>> depending
>>> on what else is in the loop. But this isn't important here.
>>>
>>>>
>>>>> If you can confirm
>>>>> that that's working as expected, it should be fine.
>>>>
>>>> Did you mean the regression tests? If yes - they are attached in
>>>> initial post. These tests checks that all working as expected for
>>>> loop with single switch instruction and for loop with two switches.
>>>>
>>>> In my patch I fixed the stub described above. I also can improve the
>>>> size heuristics if you need.
>>>
>>>
>>> I just meant that it would be good to have testcases in which the code
>>> size heuristic actually has to kick in and prevent it from unswitching
>>> out of control.
>>>
>>> Also, if you could do a rough check of code size on some general
>>> real-world
>>> application code, that would be good too.
>>>
>>> With these precautions, I think the patch is fine.
>>>
>>> Dan
>>>
>>
>>
>>
>> _______________________________________________
>> llvm-commits mailing list
>> llvm-commits at cs.uiuc.edu
>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits