[PATCH] D105723: [LSR] Do not hoist IV if it is not post increment case. PR43678

Serguei Katkov via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Jul 13 20:39:55 PDT 2021


skatkov added a comment.

In D105723#2874536 <https://reviews.llvm.org/D105723#2874536>, @qcolombet wrote:

> Hi,
>
> I am not sure I understand the issue here.
> Could you paste the IR with the wrong transformation (before this patch) for one of the test (e.g., the smaller one)?

Hi Quentin, first of all, thank you very much for starting to look into this. I'll update the tests soon.

Here is the buggy output for the first case for the current state of LSR:

  *** IR Dump After Loop Strength Reduction (loop-reduce) ***
  ; Preheader:
  bb:
    %tmp = bitcast i8* null to i32*
    %tmp1 = load i32, i32* %tmp, align 4
    %tmp2 = bitcast i8* null to i32*
    %tmp3 = load i32, i32* %tmp2, align 4
    br label %bb6
  
  ; Loop:
  bb6:                                              ; preds = %bb12, %bb
    %lsr.iv = phi i64 [ %lsr.iv.next, %bb12 ], [ -1, %bb ]
    %tmp8 = phi i32 [ %tmp16, %bb12 ], [ %tmp3, %bb ]
    %lsr.iv.next = add nsw i64 %lsr.iv, 1
    %tmp14 = add i32 %tmp8, %tmp1
    %tmp16 = add i32 %0, 1
    %tmp10 = icmp ult i64 %lsr.iv.next, 1048576
    br i1 %tmp10, label %bb12, label %bb11
  
  bb12:                                             ; preds = %bb6
    %0 = add i32 %tmp1, %tmp8
    %tmp15 = select i1 false, i32 %tmp14, i32 %tmp8
    %tmp17 = fcmp olt double 0.000000e+00, 2.270000e+02
    br i1 %tmp17, label %bb6, label %bb4
  
  ; Exit blocks
  bb11:                                             ; preds = %bb6
    unreachable
  
  bb4:                                              ; preds = %bb12
    %tmp5 = sext i32 %tmp16 to i64
    unreachable
  Instruction does not dominate all uses!
    %0 = add i32 %tmp1, %tmp8
    %tmp16 = add i32 %0, 1
  in function test

The critical pieces:
%tmp7 is a post-increment induction variable.
%tmp8 is another IV, whose increment value %tmp16 LSR wants to update.

Because of the bug, %tmp16 is hoisted to the header: there is a post-increment IV, %tmp7, and we try to hoist all IVs before the condition of that IV.
The generated instruction %0 is still emitted in the backedge, because the insert-position logic for the formula knows that %tmp8 is not a post-increment IV and so does not need to be hoisted to the header.
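
As a rough illustration of why the verifier rejects this output, here is a small standalone Python sketch. It is not LLVM code: the CFG edges are transcribed by hand from the dump above, and the helper names (compute_dominators, dominates) are made up for this example. It only checks block-level dominance, so it ignores instruction order inside a single block.

```python
# Illustrative model only, not LLVM code. Block names and edges are copied
# from the IR dump; helper names are hypothetical.

# Successors of each basic block in function `test`.
SUCCESSORS = {
    "bb": ["bb6"],
    "bb6": ["bb12", "bb11"],
    "bb12": ["bb6", "bb4"],
    "bb11": [],
    "bb4": [],
}


def compute_dominators(succs, entry):
    """Return {block: set of blocks dominating it}, iterating to a fixed point."""
    blocks = set(succs)
    preds = {b: [] for b in blocks}
    for src, dsts in succs.items():
        for dst in dsts:
            preds[dst].append(src)

    # Start from "everything dominates everything" and shrink the sets.
    dom = {b: set(blocks) for b in blocks}
    dom[entry] = {entry}
    changed = True
    while changed:
        changed = False
        for b in blocks - {entry}:
            new = set.intersection(*(dom[p] for p in preds[b])) | {b}
            if new != dom[b]:
                dom[b] = new
                changed = True
    return dom


def dominates(dom, def_block, use_block):
    """True if every path from the entry to use_block goes through def_block."""
    return def_block in dom[use_block]


DOM = compute_dominators(SUCCESSORS, "bb")

# %0 is defined in bb12 (backedge), but the hoisted %tmp16 uses it in bb6 (header).
print(dominates(DOM, "bb12", "bb6"))  # prints False: the use is not dominated
# The reverse direction is fine: the header dominates the backedge block.
print(dominates(DOM, "bb6", "bb12"))  # prints True
```

The first query corresponds to the "Instruction does not dominate all uses!" message: the header bb6 can be reached directly from the preheader bb without passing through bb12, so a value defined in bb12 cannot be used by a non-phi instruction in bb6.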


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D105723/new/

https://reviews.llvm.org/D105723


