[llvm] r200211 - [vectorizer] Fix a trivial oversight where we always requested the
Chandler Carruth
chandlerc at gmail.com
Mon Jan 27 03:12:14 PST 2014
Author: chandlerc
Date: Mon Jan 27 05:12:14 2014
New Revision: 200211
URL: http://llvm.org/viewvc/llvm-project?rev=200211&view=rev
Log:
[vectorizer] Fix a trivial oversight where we always requested the
number of vector registers rather than toggling between vector and
scalar register number based on VF. I don't have a test case as
I spotted this by inspection and on X86 it only makes a difference if
your target is lacking SSE and thus has *no* vector registers.
If someone wants to add a test case for this for ARM or somewhere else
where this is more significant, that would be awesome.
Also made the variable name a bit more sensible while I'm here.
Modified:
llvm/trunk/lib/Transforms/Vectorize/LoopVectorize.cpp
Modified: llvm/trunk/lib/Transforms/Vectorize/LoopVectorize.cpp
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Vectorize/LoopVectorize.cpp?rev=200211&r1=200210&r2=200211&view=diff
==============================================================================
--- llvm/trunk/lib/Transforms/Vectorize/LoopVectorize.cpp (original)
+++ llvm/trunk/lib/Transforms/Vectorize/LoopVectorize.cpp Mon Jan 27 05:12:14 2014
@@ -4962,9 +4962,9 @@ LoopVectorizationCostModel::selectUnroll
if (TC > 1 && TC < TinyTripCountUnrollThreshold)
return 1;
- unsigned TargetVectorRegisters = TTI.getNumberOfRegisters(true);
- DEBUG(dbgs() << "LV: The target has " << TargetVectorRegisters <<
- " vector registers\n");
+ unsigned TargetNumRegisters = TTI.getNumberOfRegisters(VF > 1);
+ DEBUG(dbgs() << "LV: The target has " << TargetNumRegisters <<
+ " registers\n");
LoopVectorizationCostModel::RegisterUsage R = calculateRegisterUsage();
// We divide by these constants so assume that we have at least one
@@ -4978,7 +4978,7 @@ LoopVectorizationCostModel::selectUnroll
// Next, divide the remaining registers by the number of registers that is
// required by the loop, in order to estimate how many parallel instances
// fit without causing spills.
- unsigned UF = (TargetVectorRegisters - R.LoopInvariantRegs) / R.MaxLocalUsers;
+ unsigned UF = (TargetNumRegisters - R.LoopInvariantRegs) / R.MaxLocalUsers;
// Clamp the unroll factor ranges to reasonable factors.
unsigned MaxUnrollSize = TTI.getMaximumUnrollFactor();
More information about the llvm-commits
mailing list