[LLVMdev] SimplifyCFG vs loops
Andrew Clinton
andrew at sidefx.com
Thu Oct 18 11:43:32 PDT 2012
On 10/18/2012 01:26 PM, David Blaikie wrote:
> On Thu, Oct 18, 2012 at 10:12 AM, Andrew Clinton<andrew at sidefx.com> wrote:
>> I actually submitted a patch to fix this very problem quite a while back,
>> but I don't think it was ever added to the baseline:
>>
>> http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20110711/124136.html
>>
>> I have also explained why it is important to avoid the creation of new loops
>> in an optimizer that performs analysis on the looping structure.
>>
>> I now need to keep this patch updated whenever we upgrade LLVM, to ensure
>> that our optimizer (which is now in production) works correctly. Would it
>> be possible to add this patch to the head? I have a version that works with
>> 3.0 but have not yet updated to 3.1 or the head.
> I'm not quite up for diving all the way into the previous thread, but
> do you have a test case that demonstrates a missed optimization due to
> this transformational problem? That should hopefully be enough to
> convince people the change is valuable. (I assume this is causing you
> real problems/missed optimizations or you wouldn't bother keeping it
> for your out of tree releases)
>
I think part of the difficulty in explaining the benefit of this change
is that the optimization pass where this change is of value to us is
also a custom analysis pass. Basically, we're using LLVM as an
optimization framework for a SIMD language, and the difficulty with the
unwanted loop transformation comes when we have code like this:
while (uniform_and_expensive_condition_with_side_effects())
{
if (cheap_varying_condition())
do_this
}
If this gets transformed to something like this (which is similar to the
example I posted in the earlier thread):
while () // outer loop
{
while () // inner loop
{
if (!uniform_and_expensive_condition_with_side_effects())
return;
if (cheap_varying_condition())
break;
}
do_this
}
If you think of "uniform_and_expensive_condition_with_side_effects()" as
a function that is efficient only when it is run in lockstep for the
SIMD array, and "cheap_varying_condition()" as a function that can be
run efficiently with conditional masking, the transformation is
detrimental for performance because
"uniform_and_expensive_condition_with_side_effects()" now needs to both
work with conditional masking and have an implementation that does this
efficiently. In practice, we have some functions that don't work at all
unless they are executed "loop uniformly" as in the source loop, so
changing the CFG in this way breaks the optimized code.
I know this explanation may be a little difficult to follow, especially
if you are thinking of LLVM entirely in the context of optimizing serial
programs for scalar architectures. So I have offered these other
reasons to add the patch:
1) Conditional logic is naturally simpler than loops, so a pass designed
to simplify the loop structure should not introduce more complexity in
the form of additional loops when the CFG is just representing control flow
2) The patch doesn't degrade the assembly, at least in the most recent
test that I performed for Cameron I think it was exactly the same
instruction count or slightly better
Andrew
More information about the llvm-dev
mailing list