[PATCH] D70148: [SLP] fix miscompile on min/max reductions with extra uses (PR43948)

Sanjay Patel via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Nov 19 06:13:54 PST 2019


spatel added a reviewer: echristo.
spatel added a comment.

For reference (this made it to the mailing list, but not Phab), the code example cited for the revert was likely already miscompiling:

  // clang -c -O2 -msse4 repro.cc
  
  using a = void (*)(const void *, long, int *, int *);
  int b(int);
  template <int, typename> void c(const void *, long, int *d, int *) {
    int a[8];
    int e[1];
    int f;
    for (int g = 0; g < 8; g += 2) {
      for (int h = 0; h < 5; ++h)
        a[3] += e;
      for (int h = 0; h < 3; ++h)
        f = b(f) * 2;
    }
    int i = *d = 0;
    for (int g = 0; g < 8; ++g)
      if (a[g] > i) {
        i = a[g];
        *d = g;
      }
  }
  a j;
  void k() { j = c<8, char>; }

I reduced the IR using bugpoint and then cleaned it up manually a bit to still make some sense (bugpoint introduces undefs that would eventually allow reducing the code to nothing):

  define i1 @bad_insertpoint_rdx([8 x i32]* %p) #0 {
  ; CHECK-LABEL: @bad_insertpoint_rdx(
  ; CHECK-NEXT:    [[ARRAYIDX22:%.*]] = getelementptr inbounds [8 x i32], [8 x i32]* [[P:%.*]], i64 0, i64 0
  ; CHECK-NEXT:    [[TMP1:%.*]] = bitcast i32* [[ARRAYIDX22]] to <2 x i32>*
  ; CHECK-NEXT:    [[TMP2:%.*]] = load <2 x i32>, <2 x i32>* [[TMP1]], align 16
  ; CHECK-NEXT:    [[SPEC_STORE_SELECT87:%.*]] = zext i1 undef to i32
  ; CHECK-NEXT:    [[RDX_SHUF:%.*]] = shufflevector <2 x i32> [[TMP2]], <2 x i32> undef, <2 x i32> <i32 1, i32 undef>
  ; CHECK-NEXT:    [[RDX_MINMAX_CMP:%.*]] = icmp sgt <2 x i32> [[TMP2]], [[RDX_SHUF]]
  ; CHECK-NEXT:    [[RDX_MINMAX_SELECT:%.*]] = select <2 x i1> [[RDX_MINMAX_CMP]], <2 x i32> [[TMP2]], <2 x i32> [[RDX_SHUF]]
  ; CHECK-NEXT:    [[TMP3:%.*]] = extractelement <2 x i32> [[RDX_MINMAX_SELECT]], i32 0
  ; CHECK-NEXT:    [[TMP4:%.*]] = icmp sgt i32 [[TMP3]], 0
  ; CHECK-NEXT:    [[OP_EXTRA:%.*]] = select i1 [[TMP4]], i32 [[TMP3]], i32 0
  ; CHECK-NEXT:    [[CMP23_2:%.*]] = icmp sgt i32 [[SPEC_STORE_SELECT87]], [[OP_EXTRA]]
  ; CHECK-NEXT:    ret i1 [[CMP23_2]]
  ;
    %arrayidx22 = getelementptr inbounds [8 x i32], [8 x i32]* %p, i64 0, i64 0
    %t0 = load i32, i32* %arrayidx22, align 16
    %cmp23 = icmp sgt i32 %t0, 0
    %spec.select = select i1 %cmp23, i32 %t0, i32 0
    %arrayidx22.1 = getelementptr inbounds [8 x i32], [8 x i32]* %p, i64 0, i64 1
    %t1 = load i32, i32* %arrayidx22.1, align 4
    %cmp23.1 = icmp sgt i32 %t1, %spec.select
    %spec.store.select87 = zext i1 %cmp23.1 to i32
    %spec.select88 = select i1 %cmp23.1, i32 %t1, i32 %spec.select
    %cmp23.2 = icmp sgt i32 %spec.store.select87, %spec.select88
    ret i1 %cmp23.2
  }

The CHECK lines are based on trunk today (this RAUW patch is not in play). So this example miscompiles independently of this patch - see this line:

  ; CHECK-NEXT:    [[SPEC_STORE_SELECT87:%.*]] = zext i1 undef to i32

The result of the function is based on that value, so we can get the whole thing wrong depending on what we choose for 'undef'.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D70148/new/

https://reviews.llvm.org/D70148





More information about the llvm-commits mailing list