[PATCH] D144729: [AMDGPU] Select v_sat_pk_u8_i16

Thu Mar 30 05:07:43 PDT 2023

Pierre-vh added inline comments.

================
Comment at: llvm/lib/Target/AMDGPU/SIInstructions.td:2931
+  def: GCNPat<
+    (v2i16 (DivergentBinFrag<build_vector> (clamp_s16_u8 i16:$lo), (clamp_s16_u8 i16:$hi))),
+    (inst
----------------
Pierre-vh wrote:
> foad wrote:
> > Looking at this again, I don't think these patterns match what the instruction does. The instruction puts the two 8-bit results in bits [15..8] and [7..0], not in bits [23..16] and [7..0].
> Oh I see, right. Not sure what the right pattern is then. All of the patterns are wrong in that case.
> Maybe it needs to match an additional trunc to v2i8 after the build_vector?
I took a look, and adding a trunc from <2 x i16> to <2 x i8> causes the following:
 - SelectionDAG uses bitwise operations instead in the first testcase; I think that can still be matched easily.
 - GISel, on the other hand, doesn't seem to pack the values into 16 bits and returns 2 VGPRs, each containing 8 bits.
  - Without the trunc, though, it still packs both 16-bit values into one register.

I think some GISel work may be needed, or we shouldn't use a trunc.
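
To make the layout mismatch concrete, here is a minimal Python sketch of the bit arithmetic (a model only, not LLVM code). It assumes clamp_s16_u8 clamps a signed 16-bit value to [0, 255], that a v2i16 build_vector places $lo in bits [15..0] and $hi in bits [31..16], and that a v2i8 would be packed into 16 bits. The function names are hypothetical, and the instruction layout is taken from foad's description above.

```python
def clamp_s16_u8(x: int) -> int:
    """Clamp a signed 16-bit value to the unsigned 8-bit range [0, 255]."""
    return max(0, min(255, x))


def pattern_v2i16(lo: int, hi: int) -> int:
    """What the current pattern matches: build_vector of two clamped i16 lanes.

    Assumed layout: lo lane in bits [15..0], hi lane in bits [31..16],
    so the two 8-bit results sit in bits [7..0] and [23..16].
    """
    return (clamp_s16_u8(hi) << 16) | clamp_s16_u8(lo)


def sat_pk_u8_i16(lo: int, hi: int) -> int:
    """What the instruction produces, per the description in this review:
    the two 8-bit results sit in bits [7..0] and [15..8]."""
    return (clamp_s16_u8(hi) << 8) | clamp_s16_u8(lo)


def trunc_v2i16_to_v2i8(v: int) -> int:
    """Truncate each 16-bit lane to 8 bits and repack into 16 bits.

    Assumption: the v2i8 result is packed into a single 16-bit value.
    """
    lo8 = v & 0xFF
    hi8 = (v >> 16) & 0xFF
    return (hi8 << 8) | lo8


if __name__ == "__main__":
    lo, hi = 18, 300  # hi saturates to 255
    print(hex(pattern_v2i16(lo, hi)))
    print(hex(sat_pk_u8_i16(lo, hi)))
    print(hex(trunc_v2i16_to_v2i8(pattern_v2i16(lo, hi))))
```

With lo = 18 and hi = 300, the pattern yields 0xff0012 while the instruction yields 0xff12; under the packing assumption above, applying the extra trunc to the pattern's result gives 0xff12 as well.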

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D144729/new/

https://reviews.llvm.org/D144729