[llvm] [AMDGPU] Fix an issue that wrong index is used in calculation of byte provider when the op is extract_vector_elt (PR #91697)

Shilei Tian via llvm-commits llvm-commits at lists.llvm.org
Fri May 10 11:35:16 PDT 2024


shiltian wrote:

@jrbyrnes I tried your case but didn't find anything wrong. I set a breakpoint at `SIISelLowering.cpp:12068`, which is `case ISD::EXTRACT_VECTOR_ELT`. The breakpoint was hit twice.

Here is the DAG the first time the breakpoint was hit. It was trying to match `t64: i32 = or # D:1 t61, t63`, and the index at this point is 3.

```
SelectionDAG has 45 nodes:
  t0: ch,glue = EntryToken
      t2: i32,ch = CopyFromReg # D:1 t0, Register:i32 %0
      t4: i32,ch = CopyFromReg # D:1 t0, Register:i32 %1
    t13: i64 = build_pair # D:1 t2, t4
  t18: v4i32,ch = load<(load (s128) from %ir.in0, align 4, addrspace 1)> # D:1 t0, t13, undef:i64
      t6: i32,ch = CopyFromReg # D:1 t0, Register:i32 %2
      t8: i32,ch = CopyFromReg # D:1 t0, Register:i32 %3
    t14: i64 = build_pair # D:1 t6, t8
  t19: v4i32,ch = load<(load (s128) from %ir.in1, align 4, addrspace 1)> # D:1 t0, t14, undef:i64
  t25: i32 = extract_vector_elt # D:1 t19, Constant:i32<3>
  t51: i16 = truncate # D:1 t25
      t29: ch = TokenFactor t18:1, t19:1
                  t22: i32 = extract_vector_elt # D:1 t18, Constant:i32<1>
                t36: i32 = srl # D:1 t22, Constant:i32<16>
              t37: i16 = truncate # D:1 t36
            t42: i16 = srl # D:1 t37, Constant:i16<8>
                t52: i32 = srl # D:1 t25, Constant:i32<16>
              t53: i16 = truncate # D:1 t52
            t89: i16 = and # D:1 t53, Constant:i16<-256>
          t74: i16 = or # D:1 t42, t89
        t61: i32 = zero_extend # D:1 t74
              t85: i16 = and # D:1 t51, Constant:i16<255>
              t82: i16 = shl # D:1 t51, Constant:i16<8>
            t83: i16 = or # D:1 t85, t82
          t62: i32 = any_extend # D:1 t83
        t63: i32 = shl # D:1 t62, Constant:i32<16>
      t64: i32 = or # D:1 t61, t63
        t10: i32,ch = CopyFromReg # D:1 t0, Register:i32 %4
        t12: i32,ch = CopyFromReg # D:1 t0, Register:i32 %5
      t15: i64 = build_pair # D:1 t10, t12
    t33: ch = store<(store (s32) into %ir.out0, addrspace 1)> # D:1 t29, t64, t15, undef:i64
  t31: ch = RET_GLUE t33
```

Here is the trace of the byte:

```
t64[3] -> t63[3] -> t62[1] -> t83[1] -> t82[1] -> t51[0] -> t25[0] -> t19[12]
```

This looks correct to me.
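For anyone who wants to check the trace by hand, here is a small standalone model of it. This is only a sketch in Python, not the code in `SIISelLowering.cpp`: the node table is transcribed manually from the dump above, the names `DAG`, `trace`, and `format_trace` are hypothetical, and it assumes byte 0 is the least significant byte and that all shift amounts are whole bytes.

```python
# Simplified model of byte-provider tracing; NOT LLVM's implementation.
# Each node is (opcode, width in bytes, operands...), transcribed from the dump.
DAG = {
    "t18": ("load", 16),
    "t19": ("load", 16),
    "t22": ("extract_vector_elt", 4, "t18", 1),
    "t25": ("extract_vector_elt", 4, "t19", 3),
    "t36": ("srl", 4, "t22", 16),
    "t37": ("truncate", 2, "t36"),
    "t42": ("srl", 2, "t37", 8),
    "t51": ("truncate", 2, "t25"),
    "t52": ("srl", 4, "t25", 16),
    "t53": ("truncate", 2, "t52"),
    "t89": ("and", 2, "t53", 0xFF00),  # Constant:i16<-256>
    "t74": ("or", 2, "t42", "t89"),
    "t61": ("zero_extend", 4, "t74"),
    "t85": ("and", 2, "t51", 0x00FF),
    "t82": ("shl", 2, "t51", 8),
    "t83": ("or", 2, "t85", "t82"),
    "t62": ("any_extend", 4, "t83"),
    "t63": ("shl", 4, "t62", 16),
    "t64": ("or", 4, "t61", "t63"),
}


def trace(dag, name, idx):
    """Follow byte `idx` of node `name` down to a load.

    Returns a list of (node, byte) steps, or None if the byte is known zero.
    Raises ValueError when this simple model cannot decide.
    """
    op, width, *args = dag[name]
    if op == "load":
        return [(name, idx)]
    if op == "extract_vector_elt":
        src, elt = args
        # Byte idx of element elt sits at elt * element_width + idx in the vector.
        rest = trace(dag, src, elt * width + idx)
    elif op == "truncate":
        rest = trace(dag, args[0], idx)
    elif op in ("zero_extend", "any_extend"):
        src = args[0]
        if idx >= dag[src][1]:  # byte lies in the extended part
            if op == "zero_extend":
                return None
            raise ValueError(f"{name}[{idx}] is undefined")
        rest = trace(dag, src, idx)
    elif op in ("shl", "srl"):
        src, bits = args
        if bits % 8:
            raise ValueError("only whole-byte shifts are modelled")
        src_idx = idx - bits // 8 if op == "shl" else idx + bits // 8
        if not 0 <= src_idx < width:  # zeros are shifted in
            return None
        rest = trace(dag, src, src_idx)
    elif op == "and":
        src, mask = args
        byte_mask = (mask >> (8 * idx)) & 0xFF
        if byte_mask == 0:
            return None
        if byte_mask != 0xFF:
            raise ValueError("partial byte mask")
        rest = trace(dag, src, idx)
    elif op == "or":
        lhs, rhs = trace(dag, args[0], idx), trace(dag, args[1], idx)
        if lhs is not None and rhs is not None:
            raise ValueError("both operands provide this byte")
        rest = lhs if lhs is not None else rhs
    else:
        raise ValueError(f"unhandled opcode {op}")
    return None if rest is None else [(name, idx)] + rest


def format_trace(steps):
    return " -> ".join(f"{node}[{byte}]" for node, byte in steps)


print(format_trace(trace(DAG, "t64", 3)))
# t64[3] -> t63[3] -> t62[1] -> t83[1] -> t82[1] -> t51[0] -> t25[0] -> t19[12]
```

The last step is the one this PR is about: byte 0 of element 3 of a `v4i32` is byte `3 * 4 + 0 = 12` of the loaded vector.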

Here is the DAG the second time the breakpoint was hit. It was still trying to match `t64: i32 = or # D:1 t61, t63`, and the index is 3.

```
SelectionDAG has 45 nodes:
  t0: ch,glue = EntryToken
      t2: i32,ch = CopyFromReg # D:1 t0, Register:i32 %0
      t4: i32,ch = CopyFromReg # D:1 t0, Register:i32 %1
    t13: i64 = build_pair # D:1 t2, t4
  t18: v4i32,ch = load<(load (s128) from %ir.in0, align 4, addrspace 1)> # D:1 t0, t13, undef:i64
      t6: i32,ch = CopyFromReg # D:1 t0, Register:i32 %2
      t8: i32,ch = CopyFromReg # D:1 t0, Register:i32 %3
    t14: i64 = build_pair # D:1 t6, t8
  t19: v4i32,ch = load<(load (s128) from %ir.in1, align 4, addrspace 1)> # D:1 t0, t14, undef:i64
  t25: i32 = extract_vector_elt # D:1 t19, Constant:i32<3>
  t51: i16 = truncate # D:1 t25
      t29: ch = TokenFactor t18:1, t19:1
                t22: i32 = extract_vector_elt # D:1 t18, Constant:i32<1>
              t91: i32 = srl # D:1 t22, Constant:i32<24>
            t92: i16 = truncate # D:1 t91
                t52: i32 = srl # D:1 t25, Constant:i32<16>
              t53: i16 = truncate # D:1 t52
            t89: i16 = and # D:1 t53, Constant:i16<-256>
          t74: i16 = or # D:1 t92, t89
        t61: i32 = zero_extend # D:1 t74
              t85: i16 = and # D:1 t51, Constant:i16<255>
              t82: i16 = shl # D:1 t51, Constant:i16<8>
            t83: i16 = or # D:1 t85, t82
          t62: i32 = any_extend # D:1 t83
        t63: i32 = shl # D:1 t62, Constant:i32<16>
      t64: i32 = or # D:1 t61, t63
        t10: i32,ch = CopyFromReg # D:1 t0, Register:i32 %4
        t12: i32,ch = CopyFromReg # D:1 t0, Register:i32 %5
      t15: i64 = build_pair # D:1 t10, t12
    t33: ch = store<(store (s32) into %ir.out0, addrspace 1)> # D:1 t29, t64, t15, undef:i64
  t31: ch = RET_GLUE t33
```

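The trace is still the same. The only difference from the first DAG is under `t74` (`t42` is now `t92`), and that subtree feeds `t61`, whose byte 3 is zero after the `zero_extend`: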

```
t64[3] -> t63[3] -> t62[1] -> t83[1] -> t82[1] -> t51[0] -> t25[0] -> t19[12]
```

https://github.com/llvm/llvm-project/pull/91697

