[PATCH] Add #pragma vectorize enable/disable to LLVM
Arnold Schwaighofer
aschwaighofer at apple.com
Wed Dec 4 09:20:09 PST 2013
On Dec 4, 2013, at 10:21 AM, Renato Golin <renato.golin at linaro.org> wrote:
> Hi Nadav,
>
> The intended behaviour is to force vectorization on the presence
> of the flag (either turn on or off), and to continue the behaviour
> as expected in its absence. Tests were added to make sure that all
> cases are covered in opt. No tests were added in other tools with
> the assumption that they should use the PassManagerBuilder in the
> same way.
>
> The pragma metadata is being attached to the same place as other loop
> metadata, but nothing forbids one from attaching it to a function
> (to enable #pragma optimize) or basic blocks (to hint the basic-block
> vectorizers), etc. The logic should be the same all around.
>
> Patches to Clang to produce the metadata will be produced after the
> initial implementation is agreed upon and committed. Patches to other
> vectorizers (such as SLP and BB) will be added once we're happy with
> the pass manager changes.
>
> Semantics of the flags:
>
> 1. #pragma vectorize enable/disable always wins, regardless of flags
> 2. -disable-loop-vectorizer in opt will disable it (vectorizing only loops marked by a pragma)
> 3. -loop-vectorize / -vectorize-loops / -fvectorize (in Clang) will turn it on at all opt levels
> 4. -O2+ and -Os will turn it on by default
>
> We may want 2 to override 1, I'm not sure yet.
>
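For reference, the four rules above can be read as a single precedence check. The following is only a sketch of that reading, not code from the patch: Pragma, Flags and shouldVectorize are hypothetical names, and modelling -Oz as SizeLevel 2 (so that rule 4 excludes it) is an assumption.

```cpp
#include <cassert>

// Hypothetical per-loop pragma state; not an LLVM type.
enum class Pragma { None, Enable, Disable };

// Hypothetical summary of the command-line state described in rules 2-4.
struct Flags {
  bool DisableLoopVectorizer; // rule 2: -disable-loop-vectorizer
  bool VectorizeLoops;        // rule 3: -loop-vectorize / -vectorize-loops / -fvectorize
  unsigned OptLevel;          // 0..3
  unsigned SizeLevel;         // assumed: 0 = none, 1 = -Os, 2 = -Oz
};

bool shouldVectorize(Pragma P, const Flags &F) {
  // Rule 1: an explicit pragma wins over every flag.
  if (P == Pragma::Enable)
    return true;
  if (P == Pragma::Disable)
    return false;
  // Rule 2: with the vectorizer disabled, only pragma-marked loops remain.
  if (F.DisableLoopVectorizer)
    return false;
  // Rule 3: an explicit request turns it on at any optimization level.
  if (F.VectorizeLoops)
    return true;
  // Rule 4: on by default at -O2 and above and at -Os (assumed: not at -Oz).
  return F.OptLevel >= 2 && F.SizeLevel < 2;
}
```

If 2 were to override 1, the DisableLoopVectorizer check would simply move above the two pragma checks.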
> Semantics of the pass manager:
>
> * We're now always adding the vectorizer pass, with a flag that says whether or not it has to check for pragmas
> * The default is to not check, since opt can be created with the pass via -loop-vectorize, which doesn't pass through AddOptimizationPasses()
> * Clean-up passes are being added (and run) even if the vectorizer doesn't vectorize anything, which slightly increases compilation time (~2% on some tests I did here at O1); this is not ideal (ideas welcome!)
> * The main issue is that I DO want to run them if there was a pragma enable, but this is rare, so it would be better to have an alternative solution.
Ultimately, I think we want the vectorizer itself to call the functionality it needs from those passes, and only on the subset of basic blocks it modified (instcombine as a library function ...).
The compile-time cost matters for -O1 and -Oz, which did not pay this penalty before. In the short term, I don’t have a good answer here.
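To make the long-term idea concrete, here is a toy model of it. Nothing in it is LLVM API: Block, vectorize and cleanUp are hypothetical stand-ins, and the only point is that the clean-up work scales with the blocks the vectorizer touched rather than with the whole function.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy stand-in for a basic block; not an LLVM type.
struct Block {
  std::string Name;
  bool Vectorizable;
  bool Modified;
  bool Cleaned;
};

// Pretend vectorizer: marks the blocks it rewrote and returns them.
std::vector<Block *> vectorize(std::vector<Block> &Blocks) {
  std::vector<Block *> Touched;
  for (Block &B : Blocks) {
    if (!B.Vectorizable)
      continue;
    B.Modified = true;
    Touched.push_back(&B);
  }
  return Touched;
}

// Pretend clean-up, called like a library routine on the touched blocks only.
// Returns how many blocks it visited.
std::size_t cleanUp(const std::vector<Block *> &Touched) {
  for (Block *B : Touched)
    B->Cleaned = true;
  return Touched.size();
}
```

With nothing vectorized, cleanUp visits no blocks, so -O1 and -Oz would pay close to nothing in the common case.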
-  if (!LateVectorize && LoopVectorize)
-    MPM.add(createLoopVectorizePass(DisableUnrollLoops));
+  if (!LateVectorize)
+    MPM.add(createLoopVectorizePass(DisableUnrollLoops, LoopVectorize));
Let’s fix this whitespace error while we are here.
I think that the “vectorizer.enable” flag should enable “aggressive” vectorization at -Os, i.e., disable the size heuristic we use.
Something like:
bool OptForSize = F->getAttributes().hasAttribute(FnIndex, SzAttr) && Force != 1;
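Spelled out as a standalone sketch: the attribute lookup is replaced by a plain bool, and only the value 1 (enable) for Force comes from the line above. The other two values, -1 for "no pragma" and 0 for "disable", are assumptions, as are the names ForceKind and optForSize.

```cpp
#include <cassert>

// Hypothetical encoding of the pragma. Only FK_Enabled == 1 is taken from
// the suggestion above; the other two values are assumed for illustration.
enum ForceKind { FK_Undefined = -1, FK_Disabled = 0, FK_Enabled = 1 };

// HasOptSizeAttr stands in for the function attribute lookup.
bool optForSize(bool HasOptSizeAttr, int Force) {
  // A loop that is forced on ignores the size heuristic, even inside a
  // function that is optimized for size.
  return HasOptSizeAttr && Force != FK_Enabled;
}
```

A loop in an -Os function keeps the size heuristic unless it carries the enable pragma.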
Thanks,
Arnold