[llvm] r220345 - LTO: respect command-line options that disable vectorization.

Arnold Schwaighofer aschwaighofer at apple.com
Fri Oct 24 09:16:23 PDT 2014


It is complicated :).

clang turns vectorization on and off by explicitly setting the PassManagerBuilder::LoopVectorize and PassManagerBuilder::SLPVectorize fields. This overrides the cl::opt flags, so changing the cl::opt default won’t have any effect on clang.

Changing the default will only affect LTO, because libLTO does not explicitly set the PassManagerBuilder::LoopVectorize field; there, the field keeps the value it was initialized with from the cl::opt flag.
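
To make the initialization order concrete, here is a minimal sketch that does not use LLVM at all. BoolFlag, Builder, frontendVectorizes and ltoVectorizes are hypothetical stand-ins I made up for cl::opt<bool>, PassManagerBuilder, clang's setup code and libLTO's setup code; the real classes do much more than this.

```cpp
// Simplified model only: these are hypothetical stand-ins, not LLVM classes.
#include <cassert>

namespace sketch {

// Stand-in for a cl::opt<bool>; without an explicit init it holds false.
struct BoolFlag {
  bool Value = false;
};

// Stand-in for PassManagerBuilder: the constructor copies the flag value.
struct Builder {
  bool LoopVectorize;
  explicit Builder(const BoolFlag &Flag) : LoopVectorize(Flag.Value) {}
};

// clang-like client: overwrites the field, so the flag default is irrelevant.
inline bool frontendVectorizes(const BoolFlag &Flag, bool FrontendChoice) {
  Builder B(Flag);
  B.LoopVectorize = FrontendChoice;
  return B.LoopVectorize;
}

// libLTO-like client: leaves the field alone, so it sees the flag default.
inline bool ltoVectorizes(const BoolFlag &Flag) {
  Builder B(Flag);
  return B.LoopVectorize;
}

// Small self-check that walks through both defaults.
inline void demo() {
  BoolFlag Flag;                          // default: false
  assert(!ltoVectorizes(Flag));           // LTO does not vectorize
  assert(frontendVectorizes(Flag, true)); // frontend choice wins
  Flag.Value = true;                      // flip the default
  assert(ltoVectorizes(Flag));            // only the LTO path changes
  assert(frontendVectorizes(Flag, true)); // frontend is unaffected
}

} // namespace sketch
```

Calling sketch::demo() exercises both cases: flipping the default changes only the libLTO-like path, which is the behavior described above.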

I suggest changing the flag declaration to:


static cl::opt<bool>
RunLoopVectorization("vectorize-loops", cl::Hidden, cl::init(true),
                     cl::desc("Run the Loop vectorization passes"));

Similarly for the SLP flag.

We will also have to force-disable loop unrolling in the loop vectorizer so that we get the behavior from before your patch:

  PM.add(createLoopVectorizePass(true, LoopVectorize));
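
As a rough model of the three variants of this call discussed in the thread (hard-coded before r220345, what r220345 does, and what I am proposing), here is a self-contained sketch. It does not call LLVM; LoopVectorizeConfig and the field names NoUnrolling and AlwaysVectorize are my own labels for the first and second argument and should be read as assumptions, as should the Builder stand-in with its proposed default of true.

```cpp
// Simplified model only: hypothetical names, not the LLVM API.
#include <cassert>

namespace proposal {

// The two booleans handed to the loop vectorizer pass (labels are assumed).
struct LoopVectorizeConfig {
  bool NoUnrolling;
  bool AlwaysVectorize;
};

// Stand-in for the builder fields that matter here.
struct Builder {
  bool DisableUnrollLoops = false;
  bool LoopVectorize = true; // models the proposed cl::init(true) default
};

// Before r220345: both arguments hard-coded, flags ignored.
inline LoopVectorizeConfig beforePatch(const Builder &) {
  return {true, true};
}

// r220345: both arguments taken from the builder fields.
inline LoopVectorizeConfig afterPatch(const Builder &B) {
  return {B.DisableUnrollLoops, B.LoopVectorize};
}

// Proposal: unrolling stays forced off, vectorization follows the flag.
inline LoopVectorizeConfig proposed(const Builder &B) {
  return {true, B.LoopVectorize};
}

// Self-check: with defaults the proposal matches the old behavior,
// and it still honors a request to turn vectorization off.
inline void demo() {
  Builder B;
  assert(proposed(B).NoUnrolling == beforePatch(B).NoUnrolling);
  assert(proposed(B).AlwaysVectorize == beforePatch(B).AlwaysVectorize);
  assert(afterPatch(B).NoUnrolling != beforePatch(B).NoUnrolling);
  B.LoopVectorize = false;
  assert(!proposed(B).AlwaysVectorize);
}

} // namespace proposal
```

The point of the third variant is that it restores the old default behavior while still letting the command-line flag switch vectorization off.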



> On Oct 24, 2014, at 9:03 AM, JF Bastien <jfb at google.com> wrote:
> 
> Yes, cl::opt<bool> defaults to false when there's no init. Changing the default will affect non-LTO too. This may be the right thing to do, but isn't my call. Turning it on only for LTO sounds better IMO, but I'm not sure what you're suggesting: I think there shouldn't be vectorization flags that differ between LTO and non-LTO.
> 
> On Fri, Oct 24, 2014 at 8:57 AM, Arnold Schwaighofer <aschwaighofer at apple.com> wrote:
> JF, are you sure that “LoopVectorize” is set to true by default by the PassManagerBuilder instance of libLTO?
> 
> If I remember correctly, this is not the case, which is why I forced these parameters to true.
> 
> We wanted libLTO to run with vectorization by default.
> 
> PassManagerBuilder.cpp:
> 
> static cl::opt<bool>
> RunLoopVectorization("vectorize-loops", cl::Hidden,
>                      cl::desc("Run the Loop vectorization passes"));
> 
> PassManagerBuilder::PassManagerBuilder() {
>     OptLevel = 2;
>     SizeLevel = 0;
>     LibraryInfo = nullptr;
>     Inliner = nullptr;
>     DisableTailCalls = false;
>     DisableUnitAtATime = false;
>     DisableUnrollLoops = false;
>     BBVectorize = RunBBVectorization;
>     SLPVectorize = RunSLPVectorization;
>     LoopVectorize = RunLoopVectorization;
>     RerollLoops = RunLoopRerolling;
>     LoadCombine = RunLoadCombine;
>     DisableGVNLoadPRE = false;
>     VerifyInput = false;
>     VerifyOutput = false;
>     StripDebug = false;
>     MergeFunctions = false;
> }
> 
> LTOCodeGenerator.cpp:
> 
> /// Optimize merged modules using various IPO passes
> bool LTOCodeGenerator::generateObjectFile(raw_ostream &out,
>                                           bool DisableOpt,
>                                           bool DisableInline,
>                                           bool DisableGVNLoadPRE,
>                                           std::string &errMsg) {
>   if (!this->determineTarget(errMsg))
>     return false;
> 
>   Module *mergedModule = IRLinker.getModule();
> 
>   // Mark which symbols can not be internalized
>   this->applyScopeRestrictions();
> 
>   // Instantiate the pass manager to organize the passes.
>   PassManager passes;
> 
>   // Add an appropriate DataLayout instance for this module...
>   mergedModule->setDataLayout(TargetMach->getSubtargetImpl()->getDataLayout());
> 
>   Triple TargetTriple(TargetMach->getTargetTriple());
>   PassManagerBuilder PMB;
>   PMB.DisableGVNLoadPRE = DisableGVNLoadPRE;
>   if (!DisableInline)
>     PMB.Inliner = createFunctionInliningPass();
>   PMB.LibraryInfo = new TargetLibraryInfo(TargetTriple);
>   if (DisableOpt)
>     PMB.OptLevel = 0;
>   PMB.VerifyInput = true;
>   PMB.VerifyOutput = true;
> 
>   PMB.populateLTOPassManager(passes, TargetMach);
> 
> 
> 
> I think cl::opt<bool> defaults to false and your commit effectively disabled vectorization during LTO. We can recover this by changing the default of the cl::opt flags 'vectorize-loops' and 'vectorize-slp' to true. If that does not work (because we make assumptions somewhere about the default being false), we can follow the example of “DisableGVNLoadPRE” in LTOCodeGenerator.cpp: add a flag to disable vectorization during LTO and pass it to the PassManagerBuilder created in generateObjectFile.
> 
>   PMB.LoopVectorize = !DisableLTOVectorization;
> 
> 
> Thanks,
> Arnold
> 
> > On Oct 24, 2014, at 5:27 AM, Alexey Volkov <avolkov.intel at gmail.com> wrote:
> >
> > Hi JF,
> >
> > After your commit I saw a performance regression because of disabled Loop Vectorizer:
> >  LV: Not vectorizing: No #pragma vectorize enable.
> > It is really strange since I used -Ofast -flto clang's options to build an application.
> > Before this change loop was successfully vectorized by Loop Vectorizer.
> >
> > Thanks, Alexey.
> >
> > 2014-10-22 3:18 GMT+04:00 JF Bastien <jfb at google.com>:
> > Author: jfb
> > Date: Tue Oct 21 18:18:21 2014
> > New Revision: 220345
> >
> > URL: http://llvm.org/viewvc/llvm-project?rev=220345&view=rev
> > Log:
> > LTO: respect command-line options that disable vectorization.
> >
> > Summary: Patches 202051 and 208013 added calls to LTO's PassManager which unconditionally add LoopVectorizePass and SLPVectorizerPass instead of following the logic in PassManagerBuilder::populateModulePassManager and honoring the -vectorize-loops -run-slp-after-loop-vectorization flags.
> >
> > Reviewers: nadav, aschwaighofer, yijiang
> >
> > Subscribers: llvm-commits
> >
> > Differential Revision: http://reviews.llvm.org/D5884
> >
> > Modified:
> >     llvm/trunk/lib/Transforms/IPO/PassManagerBuilder.cpp
> >
> > Modified: llvm/trunk/lib/Transforms/IPO/PassManagerBuilder.cpp
> > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/IPO/PassManagerBuilder.cpp?rev=220345&r1=220344&r2=220345&view=diff
> > ==============================================================================
> > --- llvm/trunk/lib/Transforms/IPO/PassManagerBuilder.cpp (original)
> > +++ llvm/trunk/lib/Transforms/IPO/PassManagerBuilder.cpp Tue Oct 21 18:18:21 2014
> > @@ -440,10 +440,12 @@ void PassManagerBuilder::addLTOOptimizat
> >    // More loops are countable; try to optimize them.
> >    PM.add(createIndVarSimplifyPass());
> >    PM.add(createLoopDeletionPass());
> > -  PM.add(createLoopVectorizePass(true, true));
> > +  PM.add(createLoopVectorizePass(DisableUnrollLoops, LoopVectorize));
> >
> >    // More scalar chains could be vectorized due to more alias information
> > -  PM.add(createSLPVectorizerPass()); // Vectorize parallel scalar chains.
> > +  if (RunSLPAfterLoopVectorization)
> > +    if (SLPVectorize)
> > +      PM.add(createSLPVectorizerPass()); // Vectorize parallel scalar chains.
> >
> >    // After vectorization, assume intrinsics may tell us more about pointer
> >    // alignments.
> >
> >
> > _______________________________________________
> > llvm-commits mailing list
> > llvm-commits at cs.uiuc.edu
> > http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
> >
> >
> >
> > --
> > Alexey Volkov
> > Intel Corporation
> 
> 




