[PATCH] D144729: [AMDGPU] Select v_sat_pk_u8_i16

Tue Feb 28 05:24:03 PST 2023

arsenm added inline comments.

================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUInstructions.td:297
+  [
+    (i16 (smax (smin $src, (i16 255)), (i16 0))),
+    (i16 (AMDGPUsmed3 $src, (i16 0), (i16 255)))
----------------
Pierre-vh wrote:
> foad wrote:
> > Pierre-vh wrote:
> > > foad wrote:
> > > > Do you also need to match them the other way round: `(smin (smax $src, (i16 0)), (i16 255))`?
> > > I thought so too, but the other way around is always folded to smed3 it seems
> > That raises the question, why aren't both ways folded to smed3?
> I am not sure if this is intentional or if it's a missed opportunity
> @arsenm is there any reason why we can't fold smax/smin into med3 like we do for smin/smax?
We already handle both cases in IntMed3Pat. I guess that just wasn't applied to the 16-bit case? IIRC the 16-bit med3 was introduced after the 16-bit min/max, so it likely got missed when that happened.
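
For reference, the arithmetic behind the question can be checked exhaustively for i16: with the bounds 0 and 255, both clamp orders and a signed median-of-three give the same result. The following is only a standalone sketch. The helper names (`smed3`, `clampMinThenMax`, `clampMaxThenMin`, `allFormsAgree`) are made up for illustration and are not LLVM APIs, and the check says nothing about which form the DAG combiner actually produces.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical helper: signed median of three values.
static int16_t smed3(int16_t a, int16_t b, int16_t c) {
  return std::max(std::min(a, b), std::min(std::max(a, b), c));
}

// smax(smin(x, 255), 0): the order matched explicitly in this patch.
static int16_t clampMinThenMax(int16_t x) {
  return std::max<int16_t>(std::min<int16_t>(x, 255), 0);
}

// smin(smax(x, 0), 255): the order reported as already folded to smed3.
static int16_t clampMaxThenMin(int16_t x) {
  return std::min<int16_t>(std::max<int16_t>(x, 0), 255);
}

// Returns true if all three forms agree for every possible i16 input.
static bool allFormsAgree() {
  for (int v = INT16_MIN; v <= INT16_MAX; ++v) {
    const int16_t x = static_cast<int16_t>(v);
    const int16_t m = smed3(x, 0, 255);
    if (clampMinThenMax(x) != m || clampMaxThenMin(x) != m)
      return false;
  }
  return true;
}
```

Since the lower bound (0) is not greater than the upper bound (255), the two orders are interchangeable here, so folding either one to med3 would be value-preserving for these constants.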

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D144729/new/

https://reviews.llvm.org/D144729