[PATCH] D144729: [AMDGPU] Select v_sat_pk_u8_i16

Thu Mar 2 00:41:54 PST 2023

Pierre-vh marked 2 inline comments as done.
Pierre-vh added inline comments.

================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUInstructions.td:297
+  [
+    (i16 (smax (smin $src, (i16 255)), (i16 0))),
+    (i16 (AMDGPUsmed3 $src, (i16 0), (i16 255)))
----------------
arsenm wrote:
> arsenm wrote:
> > Pierre-vh wrote:
> > > foad wrote:
> > > > Pierre-vh wrote:
> > > > > foad wrote:
> > > > > > Do you also need to match them the other way round: `(smin (smax $src, (i16 0)), (i16 255))`?
> > > > > I thought so too, but the other way around is always folded to smed3 it seems
> > > > That raises the question, why aren't both ways folded to smed3?
> > > I am not sure if this is intentional or if it's a missed opportunity
> > > @arsenm is there any reason why we can't fold smax/smin into med3 like we do for smin/smax?
> > We have the two cases handled in IntMed3Pat already. I guess that just wasn't applied to the 16-bit case? IIRC the 16-bit med3 was introduced after 16-bit min/max so it likely got missed whenever that happened
> Actually we have a separate Int16Med3Pat which handles both cases?
That's separate, I meant the smed3 DAG op not the instruction.
We only seem to match `min(max(x, K0), K1), K0 < K1 -> med3(x, K0, K1)` currently. Should we also be matching the commuted version?

Then the pattern can just use smed3 directly and doesn't need this PatFrag.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D144729/new/

https://reviews.llvm.org/D144729