[PATCH] D144729: [AMDGPU] Select v_sat_pk_u8_i16
Pierre van Houtryve via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Thu Mar 30 05:07:43 PDT 2023
Pierre-vh added inline comments.
================
Comment at: llvm/lib/Target/AMDGPU/SIInstructions.td:2931
+ def: GCNPat<
+ (v2i16 (DivergentBinFrag<build_vector> (clamp_s16_u8 i16:$lo), (clamp_s16_u8 i16:$hi))),
+ (inst
----------------
Pierre-vh wrote:
> foad wrote:
> > Looking at this again, I don't think these patterns match what the instruction does. The instruction puts the two 8-bit results in bits [15..8] and [7..0], not in bits [23..16] and [7..0].
> Oh I see, right. Not sure what the right pattern is then. All of the patterns are wrong in that case.
> Maybe it needs to match an additional trunc to v2i8 after the build_vector?
I took a look, and adding a trunc from <2 x i16> to <2 x i8> causes the following:
- SelectionDAG uses bitwise operations instead in the first testcase; I think that can still be matched easily.
- GISel, on the other hand, doesn't seem to pack the values into 16 bits and returns two VGPRs, each containing 8 bits.
- Without the trunc, though, it still packs both 16-bit values into one register.
I think some GISel work may be needed, or we shouldn't use a trunc.
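To make the layout mismatch concrete, here is a rough C++ sketch of the two packings under discussion. It is only an illustration, not backend code: the function names are hypothetical, and it assumes that clamp_s16_u8 means "clamp a signed 16-bit value to [0, 255]" and that the instruction writes its results to bits [7..0] and [15..8], as described in the quoted comment.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical helper: clamp a signed 16-bit value to the unsigned
// 8-bit range [0, 255] (assumed meaning of clamp_s16_u8).
inline uint32_t clampS16ToU8(int16_t V) {
  int Clamped = std::min(std::max<int>(V, 0), 255);
  return static_cast<uint32_t>(Clamped);
}

// Layout described in the review for the instruction:
// the two 8-bit results land in bits [7..0] and [15..8].
inline uint32_t packAsInstruction(int16_t Lo, int16_t Hi) {
  return clampS16ToU8(Lo) | (clampS16ToU8(Hi) << 8);
}

// Layout implied by a v2i16 build_vector of the two clamped values:
// each element keeps its own 16-bit lane, so the results land in
// bits [7..0] and [23..16].
inline uint32_t packAsV2I16BuildVector(int16_t Lo, int16_t Hi) {
  return clampS16ToU8(Lo) | (clampS16ToU8(Hi) << 16);
}
```

For inputs 0x12 and 0x34, the first function yields 0x3412 while the second yields 0x340012, which is the difference between what the instruction produces and what the current pattern matches.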
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D144729/new/
https://reviews.llvm.org/D144729