[llvm] r214963 - Add a new option -run-slp-after-loop-vectorization.

Sun Aug 10 20:33:00 PDT 2014

Ditto for x86-64.

-Gerolf

On Aug 7, 2014, at 2:29 PM, Gerolf Hoflehner <ghoflehner at apple.com> wrote:

> CINT2006 looks fine on ARM64. Performance changes are within the noise with a small uptick favoring the change.
> 
> <PastedGraphic-3.pdf>
> 
> On Aug 6, 2014, at 2:00 PM, Gerolf Hoflehner <ghoflehner at apple.com> wrote:
> 
>> That looks interesting. I’ll kick off an initial set of test runs on x86-64 and ARM64 for CINT2006 at O3 with LTO on the ref input sets. It would be great if we could test more HPC workloads, though. Does anyone have ideas or a benchmark setup to try? On CINT2006 I expect only libquantum and hmmer to be sensitive to the change.
>> 
>> Cheers
>> Gerolf
>> 
>> On Aug 6, 2014, at 5:56 AM, James Molloy <james.molloy at arm.com> wrote:
>> 
>>> Author: jamesm
>>> Date: Wed Aug  6 07:56:19 2014
>>> New Revision: 214963
>>> 
>>> URL: http://llvm.org/viewvc/llvm-project?rev=214963&view=rev
>>> Log:
>>> Add a new option -run-slp-after-loop-vectorization.
>>> 
>>> This swaps the order of the loop vectorizer and the SLP/BB vectorizers. It is disabled by default so we can do performance testing - ideally we want to change to having the loop vectorizer running first, and the SLP vectorizer using its leftovers instead of the other way around.
>>> 
>>> 
>>> Modified:
>>>  llvm/trunk/lib/Transforms/IPO/PassManagerBuilder.cpp
>>> 
>>> Modified: llvm/trunk/lib/Transforms/IPO/PassManagerBuilder.cpp
>>> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/IPO/PassManagerBuilder.cpp?rev=214963&r1=214962&r2=214963&view=diff
>>> ==============================================================================
>>> --- llvm/trunk/lib/Transforms/IPO/PassManagerBuilder.cpp (original)
>>> +++ llvm/trunk/lib/Transforms/IPO/PassManagerBuilder.cpp Wed Aug  6 07:56:19 2014
>>> @@ -57,6 +57,13 @@ static cl::opt<bool> RunLoadCombine("com
>>>                                   cl::Hidden,
>>>                                   cl::desc("Run the load combining pass"));
>>> 
>>> +static cl::opt<bool>
>>> +RunSLPAfterLoopVectorization("run-slp-after-loop-vectorization",
>>> +  cl::init(false), cl::Hidden,
>>> +  cl::desc("Run the SLP vectorizer (and BB vectorizer) after the Loop "
>>> +           "vectorizer instead of before"));
>>> +
>>> +
>>> PassManagerBuilder::PassManagerBuilder() {
>>>   OptLevel = 2;
>>>   SizeLevel = 0;
>>> @@ -227,21 +234,23 @@ void PassManagerBuilder::populateModuleP
>>> 
>>>   if (RerollLoops)
>>>     MPM.add(createLoopRerollPass());
>>> -  if (SLPVectorize)
>>> -    MPM.add(createSLPVectorizerPass());   // Vectorize parallel scalar chains.
>>> -
>>> -  if (BBVectorize) {
>>> -    MPM.add(createBBVectorizePass());
>>> -    MPM.add(createInstructionCombiningPass());
>>> -    addExtensionsToPM(EP_Peephole, MPM);
>>> -    if (OptLevel > 1 && UseGVNAfterVectorization)
>>> -      MPM.add(createGVNPass());           // Remove redundancies
>>> -    else
>>> -      MPM.add(createEarlyCSEPass());      // Catch trivial redundancies
>>> -
>>> -    // BBVectorize may have significantly shortened a loop body; unroll again.
>>> -    if (!DisableUnrollLoops)
>>> -      MPM.add(createLoopUnrollPass());
>>> +  if (!RunSLPAfterLoopVectorization) {
>>> +    if (SLPVectorize)
>>> +      MPM.add(createSLPVectorizerPass());   // Vectorize parallel scalar chains.
>>> +
>>> +    if (BBVectorize) {
>>> +      MPM.add(createBBVectorizePass());
>>> +      MPM.add(createInstructionCombiningPass());
>>> +      addExtensionsToPM(EP_Peephole, MPM);
>>> +      if (OptLevel > 1 && UseGVNAfterVectorization)
>>> +        MPM.add(createGVNPass());           // Remove redundancies
>>> +      else
>>> +        MPM.add(createEarlyCSEPass());      // Catch trivial redundancies
>>> +
>>> +      // BBVectorize may have significantly shortened a loop body; unroll again.
>>> +      if (!DisableUnrollLoops)
>>> +        MPM.add(createLoopUnrollPass());
>>> +    }
>>>   }
>>> 
>>>   if (LoadCombine)
>>> @@ -263,6 +272,26 @@ void PassManagerBuilder::populateModuleP
>>>   // as function calls, so that we can only pass them when the vectorizer
>>>   // changed the code.
>>>   MPM.add(createInstructionCombiningPass());
>>> +
>>> +  if (RunSLPAfterLoopVectorization) {
>>> +    if (SLPVectorize)
>>> +      MPM.add(createSLPVectorizerPass());   // Vectorize parallel scalar chains.
>>> +
>>> +    if (BBVectorize) {
>>> +      MPM.add(createBBVectorizePass());
>>> +      MPM.add(createInstructionCombiningPass());
>>> +      addExtensionsToPM(EP_Peephole, MPM);
>>> +      if (OptLevel > 1 && UseGVNAfterVectorization)
>>> +        MPM.add(createGVNPass());           // Remove redundancies
>>> +      else
>>> +        MPM.add(createEarlyCSEPass());      // Catch trivial redundancies
>>> +
>>> +      // BBVectorize may have significantly shortened a loop body; unroll again.
>>> +      if (!DisableUnrollLoops)
>>> +        MPM.add(createLoopUnrollPass());
>>> +    }
>>> +  }
>>> +
>>>   addExtensionsToPM(EP_Peephole, MPM);
>>>   MPM.add(createCFGSimplificationPass());
>>> 
>>> 
>>> 
>>> _______________________________________________
>>> llvm-commits mailing list
>>> llvm-commits at cs.uiuc.edu
>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>> 
>
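
For readers skimming the diff, the effect of the new flag on pass order can be sketched in isolation. This is a simplified model, not the PassManagerBuilder code: VectorizerConfig, addSLPAndBBPasses, buildVectorizerPipeline and demoOrdering are hypothetical names, the strings are plain labels rather than real pass objects, and the GVN/EarlyCSE and unroll steps after BBVectorize are left out.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for the few settings that matter here. The field
// names mirror the flags in the diff; the struct itself is not an LLVM type.
struct VectorizerConfig {
  bool SLPVectorize = true;
  bool BBVectorize = false;
  bool RunSLPAfterLoopVectorization = false;
};

// Appends the SLP/BB group. The strings are illustrative labels only.
static void addSLPAndBBPasses(const VectorizerConfig &C,
                              std::vector<std::string> &Passes) {
  if (C.SLPVectorize)
    Passes.push_back("slp-vectorizer");
  if (C.BBVectorize) {
    Passes.push_back("bb-vectorize");
    Passes.push_back("instcombine");
  }
}

// Mirrors the shape of the patch: the same group is added either before or
// after the loop vectorizer, depending on the flag.
std::vector<std::string> buildVectorizerPipeline(const VectorizerConfig &C) {
  std::vector<std::string> Passes;
  if (!C.RunSLPAfterLoopVectorization)
    addSLPAndBBPasses(C, Passes);
  Passes.push_back("loop-vectorize");
  Passes.push_back("instcombine");
  if (C.RunSLPAfterLoopVectorization)
    addSLPAndBBPasses(C, Passes);
  return Passes;
}

// Usage: compare the default order with the swapped order.
void demoOrdering() {
  VectorizerConfig C;
  std::vector<std::string> Default = buildVectorizerPipeline(C);
  assert(Default.front() == "slp-vectorizer");

  C.RunSLPAfterLoopVectorization = true;
  std::vector<std::string> Swapped = buildVectorizerPipeline(C);
  assert(Swapped.front() == "loop-vectorize");
  assert(Swapped.back() == "slp-vectorizer");
}
```

With the flag off, the SLP/BB group comes first, as before the patch. With it on, the same group runs on whatever the loop vectorizer left behind, which is the ordering the commit message proposes to evaluate.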