[LLVMdev] Rotated loop identification

Michele Scandale michele.scandale at gmail.com
Thu Feb 7 08:40:45 PST 2013


Thanks for your reply.

Maybe it wasn't so clear, but the optimization I'm writing is target-dependent
and so it's declared inside the target backend and is run after the independent
optimizer. So I cannot run my pass just after LoopRotate and before InstCombine.
I can still use ScalarEvolution, but the instruction combiner can in some cases
simplify the entry guard condition.

A small example is the following:

/////////////////////////////////////
extern void x(int);

int foo_int(int a, int b) {
  int i;
  for (i = 0; i < b/2; ++i) {
    x(i);
  }

  return 0;
}
/////////////////////////////////////

the correspondent optimized IR is:

///////////////////////////////////////////////////////////////////////////////
define i32 @foo_int(i32 %a, i32 %b) nounwind uwtable {
entry:
  %div = sdiv i32 %b, 2
  %cmp3 = icmp sgt i32 %b, 1
  br i1 %cmp3, label %for.body, label %for.end

for.body:                                         ; preds = %entry, %for.body
  %i.04 = phi i32 [ %inc, %for.body ], [ 0, %entry ]
  tail call void @x(i32 %i.04) nounwind
  %inc = add nsw i32 %i.04, 1
  %cmp = icmp slt i32 %inc, %div
  br i1 %cmp, label %for.body, label %for.end

for.end:                                          ; preds = %for.body, %entry
  ret i32 0
}
///////////////////////////////////////////////////////////////////////////////

>From the IR is possible to see that the condition in the entry block is

  %cmp3 = icmp sgt i32 %b, 1

but starting from the loop IV it should be expected something like

  %cmp3 = icmp slt i32 0, %div

and this is verified looking at the instcombine debug trace:
////////////////////////////////////////////////
IC: Visiting:   %cmp3 = icmp slt i32 0, %div
IC: Old =   %cmp3 = icmp sgt i32 %div, 0
    New =   <badref> = icmp sge i32 %b, 2
IC: ADD:   %cmp3 = icmp sge i32 %b, 2
IC: ERASE   %0 = icmp sgt i32 %div, 0
IC: ADD:   %div = sdiv i32 %b, 2
IC: Visiting:   %div = sdiv i32 %b, 2
IC: Visiting:   %cmp3 = icmp sge i32 %b, 2
IC: Old =   %cmp3 = icmp sge i32 %b, 2
    New =   <badref> = icmp sgt i32 %b, 1
IC: ADD:   %cmp3 = icmp sgt i32 %b, 1
IC: ERASE   %0 = icmp sge i32 %b, 2
////////////////////////////////////////////////

The fact that a loop has been rotated it means that it has an entry guard that
allows in the case of countable loop to optimize the trip count expression: it's
an information that can be lost during optimization and maybe not easy to derive
with an analysis. The usage of "IR extra information" would help in this case.
Using intrinsics with declared side-effects as information handlers I don't know
if it adds constraints to the optimizer. Indeed declaring no side-effects won't
preserve the information.

How can all this be solved?

Thanks again.

Best regards,
Michele Scandale


On 02/07/2013 12:07 AM, Andrew Trick wrote:
> 
> On Feb 4, 2013, at 10:48 AM, Michele Scandale <michele.scandale at gmail.com> wrote:
> 
>> Dear all,
>>
>> I'm working on a late IR target dependent optimization on loops. A part of this
>> optimization requires to derive "by hand" the trip-count expression of a given
>> loop. In order to handle correctly these cases I need to check if the loop has
>> an entry guard or not. The problem I have is that starting from the information
>> I derive during my analysis (initial IV value, last IV value, comparison kind) I
>> don't know how to verify if the entry guard is present or not, mainly because
>> algebraic simplifications changed the entry guard comparison. Being able to know
>> if a given loop has been rotated would be enough for my purposes but I haven't
>> found a way to obtain this.
>>
>> I think that the only way is to track in the IR the information, but I don't
>> know how to track it in a safe (the information shouldn't be lost) and
>> transparent way (it should not interfere with the optimizer). Thinking to custom
>> intrinsics I don't understand how to allow memory optimizations and, at the same
>> time, prevent the intrinsic elimination or hoisting from the target loop.
>>
>> I would appreciate any hint on this topic.
> 
> See ScalarEvolution::isLoopEntryGuardedByCond. It's part of the analysis that we use to determine trip count and benefits from rotated loops.
> 
> LLVM never annotates the IR with a history of transformations. You could potentially annotate the LoopInfo analysis, but it will only be preserved within the same LoopPassManager (you would have to run right after LoopRotate).
> 
> The only robust solution is ScalarEvolution.
> 
> -Andy
> 



More information about the llvm-dev mailing list