[PATCH] Prefer blendps over insertps codegen for one special case [X86]
Quentin Colombet
qcolombet at apple.com
Fri Mar 20 11:23:35 PDT 2015
================
Comment at: lib/Target/X86/X86ISelLowering.cpp:10520
@@ +10519,3 @@
+ const Function *F = DAG.getMachineFunction().getFunction();
+ bool OptForSize = F->hasFnAttribute(Attribute::OptimizeForSize);
+ if (IdxVal == 0 && (!OptForSize || !MayFoldLoad(N1))) {
----------------
Instead of checking only for OptimizeForSize, I would check for MinSize, or for both attributes.
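
To make the suggestion concrete, here is a minimal, self-contained sketch of a check that covers both attributes. It does not use the LLVM headers: `Attr`, `FakeFunction` and `preferBlend` are hypothetical stand-ins invented for illustration, and only the shape of the condition is taken from the patch. In the real code the equivalent would presumably be an extra `F->hasFnAttribute(Attribute::MinSize)` test OR'ed into `OptForSize`.

```cpp
#include <set>

// Hypothetical stand-ins for llvm::Attribute kinds and llvm::Function;
// they exist only so this sketch compiles without LLVM.
enum class Attr { OptimizeForSize, MinSize };

struct FakeFunction {
  std::set<Attr> Attrs;
  bool hasFnAttribute(Attr A) const { return Attrs.count(A) != 0; }
};

// Mirrors the condition in the patch, but treats either size attribute
// as "optimizing for size". Returns true when the branch guarded by the
// patch's condition would be taken.
bool preferBlend(const FakeFunction &F, unsigned IdxVal, bool MayFoldLoad) {
  bool OptForSize = F.hasFnAttribute(Attr::OptimizeForSize) ||
                    F.hasFnAttribute(Attr::MinSize);
  return IdxVal == 0 && (!OptForSize || !MayFoldLoad);
}
```

With this version, a function marked only with MinSize behaves the same as one marked with OptimizeForSize when the load can be folded.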
================
Comment at: lib/Target/X86/X86ISelLowering.cpp:10521
@@ +10520,3 @@
+ bool OptForSize = F->hasFnAttribute(Attribute::OptimizeForSize);
+ if (IdxVal == 0 && (!OptForSize || !MayFoldLoad(N1))) {
+ // If this is an insertion of 32-bits into the low 32-bits of
----------------
Whenever there is a folding opportunity, wouldn't it be better to use it?
Could you check that with IACA (Intel Architecture Code Analyzer)?
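
For reference, the question amounts to comparing two predicates. The sketch below is illustrative only: the function names are hypothetical, the inputs are plain booleans rather than DAG nodes, and the second predicate is one reading of the alternative, not code from the patch.

```cpp
// Condition as written in the patch: a foldable load only blocks this
// branch when the function is optimized for size.
bool conditionInPatch(unsigned IdxVal, bool OptForSize, bool MayFoldLoad) {
  return IdxVal == 0 && (!OptForSize || !MayFoldLoad);
}

// Hypothetical alternative: a foldable load always blocks this branch,
// regardless of the size attribute.
bool conditionAlwaysFold(unsigned IdxVal, bool MayFoldLoad) {
  return IdxVal == 0 && !MayFoldLoad;
}
```

The two differ in exactly one case: IdxVal == 0, not optimizing for size, and the load can be folded. That is the case worth measuring.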
http://reviews.llvm.org/D8332