[LLVMdev] Enabling the vectorizer for -Os

Priyendra Deshwal deshwal at scaligent.com
Sat Jun 15 21:33:13 PDT 2013


These look like really awesome results :)

I am using clang/LLVM to JIT some code, and intuitively our workloads should
benefit a lot from vectorization. Is there a way to apply this
optimization to JIT-generated code?
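
In case it helps frame the question, the following is roughly what I imagine
the setup would look like on the JIT side. This is only a minimal sketch
against the legacy PassManagerBuilder API (roughly the 3.3-era headers, so
exact header paths and member names may differ across versions), and
optimizeForJIT plus the module it receives are placeholders for whatever the
embedding application already has:

  // Build a standard -O2/-Os style pipeline, explicitly enable the loop
  // vectorizer, and run it over the module before handing it to the JIT.
  #include "llvm/IR/Module.h"
  #include "llvm/PassManager.h"
  #include "llvm/Transforms/IPO/PassManagerBuilder.h"

  static void optimizeForJIT(llvm::Module &M) {
    llvm::PassManagerBuilder Builder;
    Builder.OptLevel = 2;          // -O2-style pipeline
    Builder.SizeLevel = 1;         // 1 approximates -Os; 0 for plain -O2
    Builder.LoopVectorize = true;  // turn the loop vectorizer on explicitly

    llvm::FunctionPassManager FPM(&M);
    llvm::PassManager MPM;
    Builder.populateFunctionPassManager(FPM);
    Builder.populateModulePassManager(MPM);

    // Run the per-function passes, then the module-level passes.
    FPM.doInitialization();
    for (llvm::Module::iterator F = M.begin(), E = M.end(); F != E; ++F)
      FPM.run(*F);
    FPM.doFinalization();
    MPM.run(M);
  }

One caveat I am aware of: the vectorizer's cost model depends on target
information, so presumably the pipeline also needs the JIT's data layout and
target analyses registered, however your version of LLVM exposes those.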

Regards,
-- Priyendra



On Tue, Jun 4, 2013 at 8:26 PM, Nadav Rotem <nrotem at apple.com> wrote:

> Hi,
>
> I would like to start a discussion about enabling the loop vectorizer by
> default for -Os. The loop vectorizer can accelerate many workloads, and
> enabling it for -Os and -O2 has obvious performance benefits. At the same
> time, the loop vectorizer can increase code size for two reasons. First,
> to vectorize some loops we have to keep the original loop around in order
> to handle the last few iterations. Second, on x86 and possibly other
> targets, the encoding of vector instructions takes more space.
>
> The loop vectorizer is already aware of the ‘optsize’ attribute, and it
> does not vectorize loops that require keeping the scalar tail. It also
> does not unroll loops when optimizing for size. It is not obvious, but
> there are many cases in which this conservative kind of vectorization is
> profitable. The loop vectorizer does not try to estimate the encoding
> size of instructions, and this is one reason for code growth.
>
> I measured the effects of vectorization on performance and binary size
> using -Os. I measured performance on a Sandy Bridge machine and compiled
> our test suite using -mavx -f(no)-vectorize -Os. As you can see in the
> attached data, many workloads benefit from vectorization. Not as much as
> vectorizing with -O3, but still a good number of programs improve. At the
> same time, the code growth is minimal. Most workloads are unaffected, and
> the total code growth for the entire test suite is 0.89%. Almost all of
> the code growth comes from the TSVC test suite, which contains a large
> number of large vectorizable loops. I did not measure compile time in
> this batch, but I expect an increase for vectorizable loops because of
> the time we spend in codegen.
>
> I am interested in hearing more opinions and in discussing measurements
> from other people.
>
> Nadav
>
>
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>
>
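
To make sure I am reading the scalar-tail point above correctly: for a loop
like the sketch below (illustrative only, not from the test suite), a
vectorizer that processes, say, four floats per iteration has to emit a
scalar remainder loop for the last n % 4 elements whenever n is not known to
be a multiple of the vector factor, and that extra copy of the loop is the
code growth the ‘optsize’ check avoids by skipping such loops entirely.

  // A vectorizable loop whose trip count is not known to be a multiple of
  // the vector factor; vectorizing it requires keeping a scalar remainder
  // loop around for the last few iterations.
  void saxpy(float *y, const float *x, float a, int n) {
    for (int i = 0; i < n; ++i)
      y[i] = a * x[i] + y[i];
  }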