[PATCH] D143786: [X86] Add `TuningPreferShiftShuffle` for when Shifts are preferable to shuffles.

Mon Feb 20 18:02:57 PST 2023

goldstein.w.n marked an inline comment as done.
goldstein.w.n added inline comments.

================
Comment at: llvm/lib/Target/X86/X86.td:532
+// shuffle ports that other avx512 targets have so prefer shuffle with
+// shifts/rotate.
+def TuningPreferShiftShuffle : SubtargetFeature<"faster-shift-than-shuffle",
----------------
RKSimon wrote:
> Maybe rephrase this (e.g. you refer to vprold etc. but not shifts).
> 
> "Prefer lowering shuffles on AVX512 targets (e.g. Skylake Server) to shifts/rotate if they can use more ports than regular shuffles." ?
Added "imm" as prefer to shifts/rotate b.c we don't want var shift, but otherwise done.

================
Comment at: llvm/lib/Target/X86/X86ISelLowering.cpp:38721
+  // We are checking for shuffle match or shift match. Loop twice so we can
+  // order which we try and match first depending on target preference.
+  for (unsigned Order = 0; Order < 2; ++Order) {
----------------
RKSimon wrote:
> Instead of a loop - why not move the shuffle/shift paths into lambdas and then do this - it should be easier to understand:
> ```
> if (Subtarget.hasFasterShiftThanShuffle()) {
>   if (matchUnaryPermuteAsBitShift()) {}
>   if (matchUnaryPermuteAsIntShuffle()) {}
> } else {
>   if (matchUnaryPermuteAsIntShuffle()) {}
>   if (matchUnaryPermuteAsBitShift()) {}
> }
> ```
The reason I prefered loop her is there is a lot of boiler plate for type /target checking that would also need to be copied. Seemed like loop was preferable to ~30 lines of dup code. But your call. LMK.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D143786/new/

https://reviews.llvm.org/D143786