[PATCH] D60935: [IndVarSimplify] Fixup nowrap flags during LFTR when moving to post-inc (PR31181)

Nikita Popov via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed May 15 01:08:13 PDT 2019


nikic added a comment.

In D60935#1502223 <https://reviews.llvm.org/D60935#1502223>, @reames wrote:

> Reading through the discussion, it sounds like the only place where LFTR goes wrong is dealing with the post-increment form of the IV right?
>
> If so, what about splitting LFTR into two parts.  The first part does LFTR *without changing pre-to-post or vice versa*, and then a second part which converts to post-increment where possible.  The value of the later seems quite questionable to me honestly, and if we have to restrict it greatly it wouldn't matter as much as having to restrict the first part.    I'd also really like to be able to extend the first part to multiple exit loops whereas extending the second part seems actively harmful on some examples.


Unfortunately this is not the case. Based on the additional discussion in D30446 <https://reviews.llvm.org/D30446> the problem is not just the switch from pre- to post-increment. In addition to that LFTR may switch to a (more canonical) IV that is dynamically dead and as such may be poison. The example in https://reviews.llvm.org/D30446#702870 illustrates such a case. I've checked that this patch also fixes that case (i.e. change the add nsw into an add nuw).

One case this patch currently doesn't address is the following variation:

  define i32 @test(i32* %ptr, i1 %always_false) {
  entry:
    br label %for.cond
  
  for.cond:
    %iv = phi i32 [ -2147483648, %entry ], [ %iv.inc, %always_taken ]
    %iv2 = phi i32 [ 0, %entry ], [ %iv2.inc, %always_taken ]
    store i32 %iv, i32* %ptr
    %cmp = icmp slt i32 %iv, 20
    br i1 %cmp, label %for.body, label %for.end
  
  for.body:
    br i1 %always_false, label %never_taken, label %always_taken
  
  never_taken:
    store volatile i32 %iv2, i32* %ptr
    br label %always_taken
  
  always_taken:
    %iv.inc = add nsw i32 %iv, 1
    %iv2.inc = add nsw i32 %iv2, 1
    br label %for.cond
  
  for.end:
    ret i32 0
  }

This is the case where the exit block and the loop latch don't match and a pre-increment is used, and I'm currently only recomputing the flags in the post-increment case. When I tried to extend it to that case I ran into a snag with SCEV seemingly generating incorrect nowrap flags:

  %iv = phi i32 [ -2147483648, %entry ], [ %iv.inc, %always_taken ]
  -->  {-2147483648,+,1}<nsw><%for.cond> U: [-2147483648,21) S: [-2147483648,21)Exits: 20		LoopDispositions: { %for.cond: Computable }
  %iv2 = phi i32 [ 0, %entry ], [ %iv2.inc, %always_taken ]
  -->  {0,+,1}<nuw><nsw><%for.cond> U: [0,-2147483648) S: [0,-2147483648)	Exits: -2147483628		LoopDispositions: { %for.cond: Computable }
  %iv.inc = add nsw i32 %iv, 1
  -->  {-2147483647,+,1}<nsw><%for.cond> U: [-2147483647,22) S: [-2147483647,22)Exits: 21		LoopDispositions: { %for.cond: Computable }
  %iv2.inc = add nsw i32 %iv2, 1
  -->  {1,+,1}<nuw><%for.cond> U: [1,-2147483626) S: [1,-2147483626)		Exits: -2147483627		LoopDispositions: { %for.cond: Computable }

Here `%iv2` is `{0,+,1}<nuw><nsw>`, while it should be `{0,+,1}<nuw>`, unless I'm misunderstanding something about the validity constraints on these flags (`%iv2.inc` is correct though). I haven't looked into what happens here yet.

So, while I think that the switch from pre-inc to post-inc might be the part that is most problematic in practice (and prevents us from enabling CVP optimizations), it's not really the fundamental problem in LFTR (which is reusing an IV while ignoring poison flags).


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D60935/new/

https://reviews.llvm.org/D60935





More information about the llvm-commits mailing list