[PATCH] D32714: SLPVectorizer: Clamp slp-min-reg-size to target maximum

Mon May 1 12:08:48 PDT 2017

arsenm created this revision.
Herald added subscribers: tpr, wdng, mzolotukhin.

New AMDGPU hardware has 2 x 16-bit vector operations, so its vector
width is 32 bits. Because this width of 32 is less than the
slp-min-reg-size default of 128, the loop that picks a vector width
never executes. Clamp the minimum register size to the target maximum
so that the loop runs.
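
The effect of the clamp can be sketched as below. This is a minimal
illustration, not the SLPVectorizer code: candidateWidths and
clampMinVecRegSize are hypothetical helpers, and the halving loop is an
assumption about how widths are tried from the maximum register size
down to the minimum.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical stand-in for the width-selection loop (an assumption, not
// the real pass): collect every size from MaxVecRegSize down to
// MinVecRegSize, halving each step. The Size > 0 guard prevents an
// endless loop when MinVecRegSize is 0.
std::vector<unsigned> candidateWidths(unsigned MinVecRegSize,
                                      unsigned MaxVecRegSize) {
  std::vector<unsigned> Widths;
  for (unsigned Size = MaxVecRegSize; Size > 0 && Size >= MinVecRegSize;
       Size /= 2)
    Widths.push_back(Size);
  return Widths;
}

// Mirrors the std::min clamp in the patch. Both operands are unsigned,
// so std::min deduces a single type without a cast.
unsigned clampMinVecRegSize(unsigned Option, unsigned MaxVecRegSize) {
  return std::min(Option, MaxVecRegSize);
}
```

With a 32-bit target maximum and the default of 128, the unclamped call
candidateWidths(128, 32) yields no candidates, while
candidateWidths(clampMinVecRegSize(128, 32), 32) yields the single
width 32. Targets whose maximum is at least 128 are unaffected.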

      

Tests will be included with future backend commits.


https://reviews.llvm.org/D32714

Files:
  lib/Transforms/Vectorize/SLPVectorizer.cpp


Index: lib/Transforms/Vectorize/SLPVectorizer.cpp
===================================================================

--- lib/Transforms/Vectorize/SLPVectorizer.cpp
+++ lib/Transforms/Vectorize/SLPVectorizer.cpp
@@ -79,7 +79,7 @@
 ScheduleRegionSizeBudget("slp-schedule-budget", cl::init(100000), cl::Hidden,
     cl::desc("Limit the size of the SLP scheduling region per block"));
 
-static cl::opt<int> MinVectorRegSizeOption(
+static cl::opt<unsigned> MinVectorRegSizeOption(
     "slp-min-reg-size", cl::init(128), cl::Hidden,
     cl::desc("Attempt to vectorize for this register size in bits"));
 
@@ -331,7 +331,7 @@
     else
       MaxVecRegSize = TTI->getRegisterBitWidth(true);
 
-    MinVecRegSize = MinVectorRegSizeOption;
+    MinVecRegSize = std::min(MinVectorRegSizeOption.getValue(), MaxVecRegSize);
   }
 
   /// \brief Vectorize the tree that starts with the elements in \p VL.

