[PATCH] D159036: [AMDGPU] Accept arbitrary sized sources in CalculateByteProvider
Jay Foad via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Wed Oct 25 01:23:26 PDT 2023
foad added a comment.
Here is a test case: F30159866: gs.ll <https://reviews.llvm.org/F30159866>
I compiled with `llc -march=amdgcn -mcpu=gfx1030 gs.ll -o /dev/null -debug` and saw the following, heavily edited:
  Initial selection DAG: %bb.5 '_amdgpu_gs_main:.exportVertex'
  SelectionDAG has 261 nodes:
  ...
  t148: v2i32 = vselect # D:1 t64, t78, t147
  t149: v2i16 = truncate # D:1 t148
  ...
  t250: i16 = extract_vector_elt # D:1 t149, Constant:i32<1>
  t251: i32 = zero_extend # D:1 t250
  t252: i32 = shl nuw # D:1 t251, Constant:i32<16>
  t248: i16 = extract_vector_elt # D:1 t149, Constant:i32<0>
  t249: i32 = zero_extend # D:1 t248
  t253: i32 = or # D:1 t252, t249
  ...
  Combining: t253: i32 = or # D:1 t252, t249
  Creating new node: t262: i64 = bitcast # D:1 t148
  Creating new node: t263: i32 = truncate # D:1 t262
  ... into: t263: i32 = truncate # D:1 t262
Note that `t149` truncates //each element// of `t148` from 32 to 16 bits.
`t253` was extracting the two 16-bit elements of `t149` and combining them into a single `i32`, i.e. it was equivalent to `i32 bitcast t149`.
You've replaced it with `i32 truncate (i64 bitcast t148)` which is equivalent to extracting element 0 of `t148`.
These are clearly not the same thing. I'd like to revert the patch unless you have a quick fix.
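To make the mismatch concrete, here is a small Python model of the two DAGs, written only as an illustrative sketch. The function names and the input values are made up for this example, and it assumes the little-endian layout used on AMDGPU, where element 0 of `t148` ends up in the low 32 bits of the `i64` bitcast.

```python
MASK16 = 0xFFFF
MASK32 = 0xFFFFFFFF

def original_dag(t148):
    """Model of the DAG before the combine (t149 .. t253)."""
    # t149: v2i16 = truncate t148 (each element is truncated on its own)
    t149 = [e & MASK16 for e in t148]
    # t250/t251: extract element 1 and zero-extend to i32
    t251 = t149[1]
    # t252: shl by 16
    t252 = (t251 << 16) & MASK32
    # t248/t249: extract element 0 and zero-extend to i32
    t249 = t149[0]
    # t253: or
    return t252 | t249

def combined_dag(t148):
    """Model of the DAG after the combine (t262, t263)."""
    # t262: i64 = bitcast t148 (assumed little-endian: element 0 is low)
    t262 = ((t148[1] & MASK32) << 32) | (t148[0] & MASK32)
    # t263: i32 = truncate t262
    return t262 & MASK32

# Hypothetical input chosen so that every 16-bit half is distinct.
t148 = [0x11112222, 0x33334444]
print(hex(original_dag(t148)))  # 0x44442222
print(hex(combined_dag(t148)))  # 0x11112222
```

In this model the two results agree only when the high half of element 0 happens to equal the low half of element 1, for example when both are zero.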
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D159036/new/
https://reviews.llvm.org/D159036