[PATCH] D159036: [AMDGPU] Accept arbitrary sized sources in CalculateByteProvider

Wed Oct 25 01:23:26 PDT 2023

foad added a comment.

Here is a test case: F30159866: gs.ll <https://reviews.llvm.org/F30159866>
I compiled with `llc -march=amdgcn -mcpu=gfx1030 gs.ll -o /dev/null -debug` and saw the following, heavily edited:

  Initial selection DAG: %bb.5 '_amdgpu_gs_main:.exportVertex'
  SelectionDAG has 261 nodes:
  ...
      t148: v2i32 = vselect # D:1 t64, t78, t147
    t149: v2i16 = truncate # D:1 t148
  ...
              t250: i16 = extract_vector_elt # D:1 t149, Constant:i32<1>
            t251: i32 = zero_extend # D:1 t250
          t252: i32 = shl nuw # D:1 t251, Constant:i32<16>
            t248: i16 = extract_vector_elt # D:1 t149, Constant:i32<0>
          t249: i32 = zero_extend # D:1 t248
        t253: i32 = or # D:1 t252, t249
  ...
  Combining: t253: i32 = or # D:1 t252, t249
  Creating new node: t262: i64 = bitcast # D:1 t148
  Creating new node: t263: i32 = truncate # D:1 t262
   ... into: t263: i32 = truncate # D:1 t262

Note that `t149` truncates //each element// of `t148` from 32 to 16 bits.
`t253` was extracting the two parts of `t149` and combining them into a single `i32`, i.e. it was equivalent to `i32 bitcast t149`.
You've replaced it with `i32 truncate (i64 bitcast t148)` which is equivalent to extracting element 0 of `t148`.
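These are clearly not the same thing: the original keeps the low 16 bits of //both// elements of `t148`, whereas the replacement keeps all 32 bits of element 0 and drops element 1 entirely. I'd like to revert the patch unless you have a quick fix.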
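
To make the mismatch concrete, here is a rough model of the two DAGs on plain integers. It is only a sketch: the function names and the sample value of `t148` are made up for illustration, and it assumes the little-endian layout where element 0 of a vector occupies the low bits after a bitcast.

```python
MASK16 = 0xFFFF
MASK32 = 0xFFFFFFFF


def original_or(t148):
    """Model of t253 before the combine (hypothetical helper)."""
    # t149: v2i16 = truncate t148 -> each element is truncated to 16 bits
    t149 = [x & MASK16 for x in t148]
    # t253: i32 = or (shl (zext t149[1]), 16), (zext t149[0])
    return ((t149[1] << 16) | t149[0]) & MASK32


def combined(t148):
    """Model of t263 after the combine (hypothetical helper)."""
    # t262: i64 = bitcast t148 -> element 0 in the low 32 bits (assumed little-endian)
    t262 = ((t148[1] & MASK32) << 32) | (t148[0] & MASK32)
    # t263: i32 = truncate t262 -> keeps only the low 32 bits, i.e. element 0
    return t262 & MASK32


# Sample input chosen so every 16-bit piece is distinguishable.
t148 = [0x11112222, 0x33334444]
before = original_or(t148)
after = combined(t148)
print(hex(before), hex(after))
```

With this input the original computes `0x44442222` (low half of each element) while the combined form computes `0x11112222` (all of element 0).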

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D159036/new/

https://reviews.llvm.org/D159036