[PATCH] #pragma vectorize

Tue Apr 22 10:45:28 PDT 2014

----- Original Message -----
> From: "Tyler Nowicki" <tnowicki at apple.com>
> To: "Hal Finkel" <hfinkel at anl.gov>
> Cc: "Nadav Rotem" <nrotem at apple.com>, cfe-commits at cs.uiuc.edu, "Alexey Bataev" <alexey.bataev at intel.com>, "Alexander
> Musman" <alexander.musman at gmail.com>, "Chandler Carruth" <chandlerc at google.com>
> Sent: Tuesday, April 22, 2014 12:16:54 PM
> Subject: Re: [PATCH] #pragma vectorize
> 
> 
> Hi Hal,
> 
> 
> 
> Thank you for the review!
> 
> 
> 1) I don’t agree with using a single unroll syntax to control
> multiple optimizations. If a user is adding optimization hint
> pragmas then they are certainly drilling down into the algorithms
> overriding “safe” values to get the best performance. We need to
> make sure optimization hints result in predictable changes to the
> generated code.

Okay. As I said in the reply to Nadav, I'm fine with having a 'pragma widen' for this and a 'pragma unroll' for the concatenation unrolling.
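
For readers following along, here is a minimal hand-written C sketch of the two kinds of unrolling being separated here. It is not taken from the patch or from compiler output; the function names and the factor of 2 are illustrative assumptions. Concatenation unrolling places whole copies of the loop body one after another, while the widening done by the loop vectorizer interleaves the copies so that independent operations sit next to each other.

```c
#include <assert.h>
#include <stddef.h>

/* Reference loop: one element per iteration. */
void scale_ref(const int *in, int *out, size_t n) {
    assert(n == 0 || (in != NULL && out != NULL));
    for (size_t i = 0; i < n; ++i)
        out[i] = in[i] * 2 + 1;
}

/* Concatenation ("sequenced") unrolling by 2: iteration i runs to
 * completion before iteration i + 1 starts. */
void scale_concat2(const int *in, int *out, size_t n) {
    size_t i = 0;
    for (; i + 1 < n; i += 2) {
        out[i] = in[i] * 2 + 1;         /* all of iteration i     */
        out[i + 1] = in[i + 1] * 2 + 1; /* all of iteration i + 1 */
    }
    for (; i < n; ++i)                  /* remainder iteration */
        out[i] = in[i] * 2 + 1;
}

/* Widening ("unsequenced") by 2: the two iterations advance together,
 * stage by stage. This reordering is only valid when the iterations
 * are independent of each other. */
void scale_widen2(const int *in, int *out, size_t n) {
    size_t i = 0;
    for (; i + 1 < n; i += 2) {
        int a0 = in[i], a1 = in[i + 1]; /* both loads      */
        int m0 = a0 * 2, m1 = a1 * 2;   /* both multiplies */
        out[i] = m0 + 1;                /* both stores     */
        out[i + 1] = m1 + 1;
    }
    for (; i < n; ++i)                  /* remainder iteration */
        out[i] = in[i] * 2 + 1;
}
```

All three functions compute the same result; only the order of the operations inside the loop differs, which is what the sequenced/unsequenced terminology is meant to capture.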

> 
> 
> 2) Regarding the syntax you describe: #pragma vectorize
> unroll(_value_) is equivalent to #pragma unroll(unsequenced _value_)
> Is that right?
> 
> 
> But what happens when you specify something like:
> 
> #pragma unroll(unsequenced disable)
> #pragma vectorize width(_value_)
> 
> 
> What should happen here? Normally specifying width should enable
> vectorization unless `#pragma vectorize disable' is specified. But
> here only the unroll is disabled? I find this very confusing.

This is not at all confusing ;) -- There are two different optimizations here that can be applied separately:

 1. Widening for ILP (which is disabled in your example)
 2. Vectorization (which is enabled by placing <_value_ x type> values into the SIMD registers)
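
As a rough illustration of the two optimizations listed above, the following C sketch applies each one separately to an integer sum. It is a hand-written model under stated assumptions, not the output of the loop vectorizer: the `vec4i` struct merely stands in for a SIMD register holding <4 x int>, and the names and the factors 2 and 4 are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

enum { WIDTH = 4 }; /* illustrative vector width */

/* Stand-in for one SIMD register holding <4 x int>. */
typedef struct { int lane[WIDTH]; } vec4i;

/* Baseline: one scalar accumulator, one element per iteration. */
int sum_scalar(const int *x, size_t n) {
    assert(n == 0 || x != NULL);
    int s = 0;
    for (size_t i = 0; i < n; ++i)
        s += x[i];
    return s;
}

/* 1. Widening for ILP only: two independent scalar accumulators,
 *    no SIMD registers involved. */
int sum_widened(const int *x, size_t n) {
    int s0 = 0, s1 = 0;
    size_t i = 0;
    for (; i + 1 < n; i += 2) {
        s0 += x[i];     /* chain 0 */
        s1 += x[i + 1]; /* chain 1, independent of chain 0 */
    }
    for (; i < n; ++i)  /* remainder */
        s0 += x[i];
    return s0 + s1;
}

/* 2. Vectorization only: a single accumulator that is WIDTH lanes wide. */
int sum_vectorized(const int *x, size_t n) {
    vec4i acc = {{0, 0, 0, 0}};
    size_t i = 0;
    for (; i + WIDTH <= n; i += WIDTH)
        for (int l = 0; l < WIDTH; ++l) /* models one vector add */
            acc.lane[l] += x[i + l];
    int s = 0;
    for (int l = 0; l < WIDTH; ++l)     /* horizontal reduction */
        s += acc.lane[l];
    for (; i < n; ++i)                  /* scalar remainder */
        s += x[i];
    return s;
}
```

The sketch uses integers so that all three variants agree exactly when no overflow occurs; with floating-point data, either transformation reassociates the additions and can change rounding.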

> 
> 
> 3) As Nadav pointed out the `unroll’ parameter actually controls the
> `WidenMap’ that is used for vectorization/unrolling in the loop
> vectorizer. So what I call `vectorize unroll’ is more complicated
> than simple loop unrolling.

I understand, but we need to make it clear that widening for ILP is not vectorization. They are separate optimizations, and one can be applied without the other (and often is!).
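
A short sketch of requesting one optimization without the other. It does not use the syntax proposed in this patch; it assumes the `#pragma clang loop` spelling documented by current Clang, where `vectorize_width` and `interleave_count` are separate options and a value of 1 turns the corresponding optimization off. The function names are hypothetical, the pragmas are hints only, and whether the optimizer honors them depends on the compiler version, optimization level, and target.

```c
#include <assert.h>
#include <stddef.h>

/* Hint: widen (interleave) by 2 for ILP, but keep scalar operations. */
void add_widen_only(const float *a, const float *b, float *out, size_t n) {
    assert(n == 0 || (a != NULL && b != NULL && out != NULL));
#ifdef __clang__ /* other compilers skip the hint */
#pragma clang loop vectorize_width(1) interleave_count(2)
#endif
    for (size_t i = 0; i < n; ++i)
        out[i] = a[i] + b[i];
}

/* Hint: vectorize with 4 lanes, but do not interleave. */
void add_vectorize_only(const float *a, const float *b, float *out, size_t n) {
    assert(n == 0 || (a != NULL && b != NULL && out != NULL));
#ifdef __clang__ /* other compilers skip the hint */
#pragma clang loop vectorize_width(4) interleave_count(1)
#endif
    for (size_t i = 0; i < n; ++i)
        out[i] = a[i] + b[i];
}
```

Both functions return the same values regardless of whether the hints are applied; the pragmas only influence how the loop is compiled.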

 -Hal

> 
> 
> Tyler
> 
> 
> 
> 
> 
> On Apr 22, 2014, at 9:22 AM, Hal Finkel < hfinkel at anl.gov > wrote:
> 
> 
> 
> Tyler,
> 
> Thanks for working on this!
> 
> I feel strongly that we should separate the unrolling pragma from the
> vectorization pragma. The fact that modulo unrolling is implemented
> by our loop vectorizer is an implementation detail that I do not
> want to expose to our users directly. Also, we have a concatenation
> unroller which performs unrolling separate from the vectorizer. I
> think we should do something like this:
> 
> 1. For the purpose of this patch, please split off the unrolling into
> a separate pragma:
> #pragma unroll(_value_ | enable | disable)
> 
> 2. In the future, this syntax will be enhanced to something like
> this:
> #pragma unroll(unroll-spec-list)
> 
> unroll-spec-list:
>     kind_prefix_opt unroll-spec
> 
> unroll-spec:
>     _value_
>     enable
>     disable
> 
> kind_prefix:
>     kind :
> 
> kind:
>     sequenced :
>     unsequenced :
>     any :
> 
> [this sequenced vs unsequenced terminology is what we decided we
> liked for the parallel algorithms library being considered in WG21,
> and I think it applies just as well here]
> 
> In our implementation, 'unsequenced' unrolling means the modulo
> unrolling performed by the loop vectorizer. 'sequenced' unrolling
> means the concatenation unrolling performed by the generic unroller.
> 
> Adding Chandler, he might have some opinion on my use of the
> sequenced vs. unsequenced suggestion.
> 
> Also, adding Alexey and Alexander who have done some similar work in
> clang-omp.
> 
> -Hal
> 
> ----- Original Message -----
> 
> 
> From: "Tyler Nowicki" < tnowicki at apple.com >
> To: cfe-commits at cs.uiuc.edu
> Cc: "Nadav Rotem" < nrotem at apple.com >
> Sent: Monday, April 21, 2014 6:23:02 PM
> Subject: [PATCH] #pragma vectorize
> 
> 
> 
> Hi,
> 
> Please review the attached patch for adding pragma vectorize syntax /
> vectorization hints to clang.
> 
> pragma vectorize
> * supports the options enable, disable, unroll(_value_), and
> width(_value_)
> * options are turned into vectorization hints that are used during
> codegen to add metadata to the conditional branch of the for, while,
> and do-while loops.
> * enable forces the vectorizer to consider the loop, for example when
> compiling with -Os
> * disable prevents vectorization of the loop
> * The _value_ specified by unroll(_value_) and width(_value_) must be
> a positive integer. It will be used to set the
> llvm.vectorizer.unroll or llvm.vectorizer.width metadata values.
> 
> Thank you,
> 
> Tyler Nowicki
> Apple
> 
> 
> 
> 
> 
> --
> Hal Finkel
> Assistant Computational Scientist
> Leadership Computing Facility
> Argonne National Laboratory
> 
