[PATCH] Adjust the cost of vectorized SHL/SRL/SRA

Fri May 22 03:46:16 PDT 2015

Thanks for looking at this; comments below. I should mention that I have similar work in progress in http://reviews.llvm.org/D9474 and http://reviews.llvm.org/D9645, which tries to vectorize more integer shifts efficiently.


REPOSITORY
  rL LLVM

================
Comment at: lib/Target/X86/X86ISelLowering.cpp:16421
@@ -16405,1 +16420,3 @@
+  }
+
   if ((VT == MVT::v2i64 && Op.getOpcode() != ISD::SRA) ||
----------------
I think this is out of date - Elena did some refactoring on this recently.

================
Comment at: lib/Target/X86/X86TargetTransformInfo.cpp:255
@@ -254,3 +254,3 @@
     { ISD::SHL,  MVT::v4i64,  4*10 }, // Scalarized.
 
     { ISD::SRL,  MVT::v16i8,  16*10 }, // Scalarized.
----------------
Why is v4i64 declared here?

================
Comment at: lib/Target/X86/X86TargetTransformInfo.cpp:265
@@ -264,3 +264,3 @@
     { ISD::SRA,  MVT::v2i64,  2*10 }, // Scalarized.
 
     // It is not a good idea to vectorize division. We have to scalarize it and
----------------
SSE only has fast variable shift support for uniform shift amounts, so these cost values should surely reflect the likely cost of a general (non-uniform) shift.

It would be better to update the TargetTransformInfo::OK_UniformConstantValue tables and the code above this to support TargetTransformInfo::OK_UniformValue as well.
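
To make the suggestion concrete, here is a minimal standalone sketch of a cost lookup that tries a uniform-shift table first and falls back to the general (scalarized) costs. It does not use the real TargetTransformInfo API: Opcode, VecType, OperandKind, CostEntry, lookup and shiftCost are all hypothetical stand-ins. The costs of 1 and the SRL v4i32 entry are illustrative placeholders, not measured numbers; only the SRL v16i8 and SRA v2i64 entries are copied from the hunks quoted above.

```cpp
#include <cstddef>

// Hypothetical stand-ins for the LLVM enums named in the review.
enum class Opcode { SHL, SRL, SRA };
enum class VecType { v16i8, v8i16, v4i32, v2i64 };
enum class OperandKind { AnyValue, UniformValue, UniformConstantValue };

struct CostEntry {
  Opcode Op;
  VecType VT;
  unsigned Cost;
};

// Shifts where every lane uses the same amount. SSE2 has single
// instructions for these forms; it has no byte shifts and no 64-bit
// arithmetic shift, so those cases are deliberately absent here.
// The cost of 1 is a placeholder.
const CostEntry UniformShiftCosts[] = {
    {Opcode::SHL, VecType::v8i16, 1}, {Opcode::SHL, VecType::v4i32, 1},
    {Opcode::SHL, VecType::v2i64, 1}, {Opcode::SRL, VecType::v8i16, 1},
    {Opcode::SRL, VecType::v4i32, 1}, {Opcode::SRL, VecType::v2i64, 1},
    {Opcode::SRA, VecType::v8i16, 1}, {Opcode::SRA, VecType::v4i32, 1},
};

// General shifts with per-lane amounts.
const CostEntry GeneralShiftCosts[] = {
    {Opcode::SRL, VecType::v16i8, 16 * 10}, // Scalarized (from the patch).
    {Opcode::SRA, VecType::v2i64, 2 * 10},  // Scalarized (from the patch).
    {Opcode::SRL, VecType::v4i32, 4 * 10},  // Placeholder, not from the patch.
};

// Linear search over a fixed-size table; returns nullptr if no entry matches.
template <std::size_t N>
const CostEntry *lookup(const CostEntry (&Table)[N], Opcode Op, VecType VT) {
  for (const CostEntry &E : Table)
    if (E.Op == Op && E.VT == VT)
      return &E;
  return nullptr;
}

// Treat uniform variable and uniform constant shift amounts alike, then
// fall back to the general table, then to a default cost of 1.
unsigned shiftCost(Opcode Op, VecType VT, OperandKind Op2Kind) {
  bool IsUniform = Op2Kind == OperandKind::UniformValue ||
                   Op2Kind == OperandKind::UniformConstantValue;
  if (IsUniform)
    if (const CostEntry *E = lookup(UniformShiftCosts, Op, VT))
      return E->Cost;
  if (const CostEntry *E = lookup(GeneralShiftCosts, Op, VT))
    return E->Cost;
  return 1;
}
```

With this shape, shiftCost(Opcode::SRL, VecType::v4i32, OperandKind::UniformValue) hits the cheap table, while the same shift with OperandKind::AnyValue, or any v16i8 shift, gets the scalarized cost.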

http://reviews.llvm.org/D9923
