[llvm] [AMDGPU] Fix an issue that wrong index is used in calculation of byte provider when the op is extract_vector_elt (PR #91697)
Shilei Tian via llvm-commits
llvm-commits at lists.llvm.org
Fri May 10 11:35:16 PDT 2024
shiltian wrote:
@jrbyrnes I tried your case but didn't find anything wrong. I set a breakpoint at `SIISelLowering.cpp:12068`, which is `case ISD::EXTRACT_VECTOR_ELT`. The breakpoint was hit twice.
Here is the DAG when it was hit the first time. It was trying to match `t64: i32 = or # D:1 t61, t63`, and the index at that point was 3.
```
SelectionDAG has 45 nodes:
t0: ch,glue = EntryToken
t2: i32,ch = CopyFromReg # D:1 t0, Register:i32 %0
t4: i32,ch = CopyFromReg # D:1 t0, Register:i32 %1
t13: i64 = build_pair # D:1 t2, t4
t18: v4i32,ch = load<(load (s128) from %ir.in0, align 4, addrspace 1)> # D:1 t0, t13, undef:i64
t6: i32,ch = CopyFromReg # D:1 t0, Register:i32 %2
t8: i32,ch = CopyFromReg # D:1 t0, Register:i32 %3
t14: i64 = build_pair # D:1 t6, t8
t19: v4i32,ch = load<(load (s128) from %ir.in1, align 4, addrspace 1)> # D:1 t0, t14, undef:i64
t25: i32 = extract_vector_elt # D:1 t19, Constant:i32<3>
t51: i16 = truncate # D:1 t25
t29: ch = TokenFactor t18:1, t19:1
t22: i32 = extract_vector_elt # D:1 t18, Constant:i32<1>
t36: i32 = srl # D:1 t22, Constant:i32<16>
t37: i16 = truncate # D:1 t36
t42: i16 = srl # D:1 t37, Constant:i16<8>
t52: i32 = srl # D:1 t25, Constant:i32<16>
t53: i16 = truncate # D:1 t52
t89: i16 = and # D:1 t53, Constant:i16<-256>
t74: i16 = or # D:1 t42, t89
t61: i32 = zero_extend # D:1 t74
t85: i16 = and # D:1 t51, Constant:i16<255>
t82: i16 = shl # D:1 t51, Constant:i16<8>
t83: i16 = or # D:1 t85, t82
t62: i32 = any_extend # D:1 t83
t63: i32 = shl # D:1 t62, Constant:i32<16>
t64: i32 = or # D:1 t61, t63
t10: i32,ch = CopyFromReg # D:1 t0, Register:i32 %4
t12: i32,ch = CopyFromReg # D:1 t0, Register:i32 %5
t15: i64 = build_pair # D:1 t10, t12
t33: ch = store<(store (s32) into %ir.out0, addrspace 1)> # D:1 t29, t64, t15, undef:i64
t31: ch = RET_GLUE t33
```
Here is the trace of the byte:
```
t64[3] -> t63[3] -> t62[1] -> t83[1] -> t82[1] -> t51[0] -> t25[0] -> t19[12]
```
This looks correct to me.
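To double-check the trace by hand, here is a minimal Python sketch that walks the same nodes. It is a simplified model written for this comment, not LLVM's byte provider code: the names `NODES`, `ZERO`, `via`, `trace`, and `render` are hypothetical, only the nodes on the path are modelled, and the per-op rules are the usual byte-level reading of each opcode. The `extract_vector_elt` step assumes the byte in the vector is `element index * element size in bytes + byte index`, which gives `3 * 4 + 0 = 12` for `t25[0]`.

```python
ZERO = "zero"  # sentinel: the byte is known to be 0

# Hand-copied subset of the DAG above (hypothetical encoding for this sketch).
NODES = {
    "t64": ("or", "t61", "t63"),
    "t61": ("zero_extend", "t74", 2),            # source is i16 (2 bytes)
    "t63": ("shl", "t62", 16),
    "t62": ("any_extend", "t83", 2),             # source is i16 (2 bytes)
    "t83": ("or", "t85", "t82"),
    "t85": ("and", "t51", 0x00FF),
    "t82": ("shl", "t51", 8),
    "t51": ("truncate", "t25"),
    "t25": ("extract_vector_elt", "t19", 3, 4),  # element 3, 4 bytes per element
    "t19": ("load",),
}


def via(here, sub):
    """Prepend the current step unless the sub-result is ZERO or unknown."""
    return sub if sub is None or sub is ZERO else here + sub


def trace(name, idx):
    """Return the path to the providing load byte, ZERO, or None if unknown."""
    node = NODES.get(name)
    if node is None:
        return None  # node not modelled in this sketch
    op, *args = node
    here = [(name, idx)]
    if op == "load":
        return here
    if op == "or":
        results = [trace(a, idx) for a in args]
        if any(r is None for r in results):
            return None
        live = [r for r in results if r is not ZERO]
        if not live:
            return ZERO
        # Exactly one operand may provide the byte; the other must be zero.
        return here + live[0] if len(live) == 1 else None
    if op == "shl":
        src, bits = args
        shift = bits // 8
        return ZERO if idx < shift else via(here, trace(src, idx - shift))
    if op == "and":
        src, mask = args
        mask_byte = (mask >> (8 * idx)) & 0xFF
        if mask_byte == 0:
            return ZERO
        return via(here, trace(src, idx)) if mask_byte == 0xFF else None
    if op in ("zero_extend", "any_extend"):
        src, src_bytes = args
        if idx >= src_bytes:
            return ZERO if op == "zero_extend" else None
        return via(here, trace(src, idx))
    if op == "truncate":
        (src,) = args
        return via(here, trace(src, idx))
    if op == "extract_vector_elt":
        vec, elt, elt_bytes = args
        # Byte position inside the whole vector, not inside the element.
        return via(here, trace(vec, elt * elt_bytes + idx))
    return None


def render(path):
    return " -> ".join(f"{name}[{idx}]" for name, idx in path)


print(render(trace("t64", 3)))
```

In this model the `t61` side contributes a known-zero byte at index 3 (it is a `zero_extend` from `i16`), so the `or` at `t64` is resolved entirely through `t63`.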
Here is the DAG when the breakpoint was hit the second time. It was still trying to match `t64: i32 = or # D:1 t61, t63`, and the index was again 3.
```
SelectionDAG has 45 nodes:
t0: ch,glue = EntryToken
t2: i32,ch = CopyFromReg # D:1 t0, Register:i32 %0
t4: i32,ch = CopyFromReg # D:1 t0, Register:i32 %1
t13: i64 = build_pair # D:1 t2, t4
t18: v4i32,ch = load<(load (s128) from %ir.in0, align 4, addrspace 1)> # D:1 t0, t13, undef:i64
t6: i32,ch = CopyFromReg # D:1 t0, Register:i32 %2
t8: i32,ch = CopyFromReg # D:1 t0, Register:i32 %3
t14: i64 = build_pair # D:1 t6, t8
t19: v4i32,ch = load<(load (s128) from %ir.in1, align 4, addrspace 1)> # D:1 t0, t14, undef:i64
t25: i32 = extract_vector_elt # D:1 t19, Constant:i32<3>
t51: i16 = truncate # D:1 t25
t29: ch = TokenFactor t18:1, t19:1
t22: i32 = extract_vector_elt # D:1 t18, Constant:i32<1>
t91: i32 = srl # D:1 t22, Constant:i32<24>
t92: i16 = truncate # D:1 t91
t52: i32 = srl # D:1 t25, Constant:i32<16>
t53: i16 = truncate # D:1 t52
t89: i16 = and # D:1 t53, Constant:i16<-256>
t74: i16 = or # D:1 t92, t89
t61: i32 = zero_extend # D:1 t74
t85: i16 = and # D:1 t51, Constant:i16<255>
t82: i16 = shl # D:1 t51, Constant:i16<8>
t83: i16 = or # D:1 t85, t82
t62: i32 = any_extend # D:1 t83
t63: i32 = shl # D:1 t62, Constant:i32<16>
t64: i32 = or # D:1 t61, t63
t10: i32,ch = CopyFromReg # D:1 t0, Register:i32 %4
t12: i32,ch = CopyFromReg # D:1 t0, Register:i32 %5
t15: i64 = build_pair # D:1 t10, t12
t33: ch = store<(store (s32) into %ir.out0, addrspace 1)> # D:1 t29, t64, t15, undef:i64
t31: ch = RET_GLUE t33
```
The trace is still the same:
```
t64[3] -> t63[3] -> t62[1] -> t83[1] -> t82[1] -> t51[0] -> t25[0] -> t19[12]
```
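The only difference between the two dumps is on the `t61` side: the first has `t36`/`t37`/`t42` (`srl 16`, `truncate`, `srl 8`) and the second has `t91`/`t92` (`srl 24`, `truncate`). Byte 3 of `t64` never goes through those nodes, which is consistent with the trace being unchanged. As a rough sanity check that the two forms compute the same `i16` value, here is a small sketch; `first_form` and `second_form` are hypothetical helper names, and `x` stands in for the 32-bit value of `t22`.

```python
def first_form(x):
    """t36 = srl t22, 16; t37 = truncate to i16; t42 = srl t37, 8."""
    t36 = (x & 0xFFFFFFFF) >> 16
    t37 = t36 & 0xFFFF
    return t37 >> 8


def second_form(x):
    """t91 = srl t22, 24; t92 = truncate to i16."""
    t91 = (x & 0xFFFFFFFF) >> 24
    return t91 & 0xFFFF


# A few representative 32-bit inputs, including boundary values.
samples = [0x00000000, 0xFFFFFFFF, 0x12345678, 0x80000000, 0x00FF00FF, 0xAB000000]
same = all(first_form(x) == second_form(x) for x in samples)
print(same)
```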
https://github.com/llvm/llvm-project/pull/91697