[PATCH] Prefer blendps over insertps codegen for one special case [X86]

Quentin Colombet qcolombet at apple.com
Fri Mar 20 11:23:35 PDT 2015


================
Comment at: lib/Target/X86/X86ISelLowering.cpp:10520
@@ +10519,3 @@
+      const Function *F = DAG.getMachineFunction().getFunction();
+      bool OptForSize = F->hasFnAttribute(Attribute::OptimizeForSize);
+      if (IdxVal == 0 && (!OptForSize || !MayFoldLoad(N1))) {
----------------
Instead of checking only for OptimizeForSize, I would check for MinSize, or for both attributes.
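
For illustration, the "both" variant could look roughly like the sketch below. It is a self-contained model, not the actual patch: `FakeFunction`, `SizeAttr`, and `preferBlend` are hypothetical stand-ins so that the condition can be compiled on its own. In the real code the query would presumably go through `F->hasFnAttribute(Attribute::MinSize)` next to the existing `Attribute::OptimizeForSize` check.

```cpp
#include <cassert>

// Hypothetical stand-ins for the two LLVM size attributes (not the real API).
enum class SizeAttr { OptimizeForSize, MinSize };

// Hypothetical, minimal model of a function carrying size attributes.
struct FakeFunction {
  bool OptSize = false;
  bool MinSize = false;

  bool hasFnAttribute(SizeAttr A) const {
    switch (A) {
    case SizeAttr::OptimizeForSize:
      return OptSize;
    case SizeAttr::MinSize:
      return MinSize;
    }
    return false;
  }
};

// Same shape as the condition in the patch, except that either size
// attribute counts as "optimizing for size".
bool preferBlend(const FakeFunction &F, unsigned IdxVal, bool MayFoldLoad) {
  bool OptForSize = F.hasFnAttribute(SizeAttr::OptimizeForSize) ||
                    F.hasFnAttribute(SizeAttr::MinSize);
  return IdxVal == 0 && (!OptForSize || !MayFoldLoad);
}
```

With this shape, a function marked with either attribute keeps the insertps form whenever the load can be folded, and falls back to the blend otherwise.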

================
Comment at: lib/Target/X86/X86ISelLowering.cpp:10521
@@ +10520,3 @@
+      bool OptForSize = F->hasFnAttribute(Attribute::OptimizeForSize);
+      if (IdxVal == 0 && (!OptForSize || !MayFoldLoad(N1))) {
+        // If this is an insertion of 32-bits into the low 32-bits of
----------------
If there is a folding opportunity, wouldn't it always be better to use it?
Could you check that with IACA?

http://reviews.llvm.org/D8332
