[llvm] [CostModel][X86] Attempt to match cheap v4f32 shuffles that map to SHUFPS instruction (PR #121778)

Simon Pilgrim via llvm-commits llvm-commits at lists.llvm.org
Mon Jan 6 08:57:25 PST 2025


================
@@ -2226,9 +2226,18 @@ InstructionCost X86TTIImpl::getShuffleCost(
     { TTI::SK_PermuteTwoSrc,    MVT::v4f32, 2 }, // 2*shufps
   };
 
-  if (ST->hasSSE1())
+  if (ST->hasSSE1()) {
+    if (LT.first == 1 && LT.second == MVT::v4f32 && Mask.size() == 4) {
----------------
RKSimon wrote:

Not for any element type other than f32, which is why I added the Mask.size() check. I did consider supporting v2f32/v3f32 by widening the mask with PoisonMaskElem, but didn't find much need for it. What do you think?
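
For reference, here is a rough standalone sketch of the idea, not the code from the patch. It assumes the usual two-source mask convention for a 4-element shuffle (indices 0-3 select from the first source, 4-7 from the second, negative means poison) and relies on SHUFPS taking its two low result lanes from one operand and its two high result lanes from the other. The names `PoisonElem`, `Mask4`, `sameSource`, `isSingleShufps` and `widenMask` are all hypothetical, and `widenMask` only illustrates the v2f32/v3f32 widening that was considered and left out.

```cpp
#include <array>
#include <vector>

// Hypothetical stand-in for LLVM's PoisonMaskElem (-1).
constexpr int PoisonElem = -1;

using Mask4 = std::array<int, 4>;

// True if both mask elements can be read from the same source operand.
// Bit 2 of an index (value 4) distinguishes the second source from the
// first; a poison element is compatible with anything.
static bool sameSource(int X, int Y) {
  return X < 0 || Y < 0 || ((X & 4) == (Y & 4));
}

// A 4-element mask maps to one SHUFPS if lanes 0-1 share a source and
// lanes 2-3 share a source (the two pairs may use different sources).
bool isSingleShufps(const Mask4 &Mask) {
  return sameSource(Mask[0], Mask[1]) && sameSource(Mask[2], Mask[3]);
}

// Widen a 2- or 3-element two-source mask to 4 elements, padding with
// poison. Indices that referred to the second source are rebased so that
// the second source starts at 4, as in a 4-element mask.
Mask4 widenMask(const std::vector<int> &Mask) {
  Mask4 Wide;
  Wide.fill(PoisonElem);
  const int N = static_cast<int>(Mask.size());
  for (int I = 0; I < N && I < 4; ++I) {
    const int M = Mask[I];
    if (M < 0)
      continue; // keep poison as poison
    Wide[I] = (M < N) ? M : (M - N) + 4;
  }
  return Wide;
}
```

With this sketch, a mask such as {0, 1, 4, 5} is accepted, while an interleave such as {0, 4, 1, 5} is rejected because lanes 0 and 1 come from different sources.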

https://github.com/llvm/llvm-project/pull/121778

