[RFC] #pragma ivdep

Tobias Grosser tobias at grosser.es
Tue Mar 12 14:04:11 PDT 2013

On 03/12/2013 09:42 PM, Redmond, Paul wrote:
> On 2013-03-07, at 4:57 AM, Tobias Grosser wrote:
>> On 03/07/2013 12:35 AM, Redmond, Paul wrote:
>>> On 2013-03-06, at 3:42 PM, Pekka Jääskeläinen wrote:
>>>> Hi Paul,
>>>> On 03/06/2013 10:11 PM, Redmond, Paul wrote:
>>>>> I have updated the patch to not add metadata on loads and stores where the
>>>>> pointer comes from an alloca. I wonder if the check should be more
>>>>> conservative and only include pointers coming from Arguments and GlobalValues
>>>>> (perhaps Constants too?)
>>>> I think it's safer that way around for now.
>>> Hmm.. I'm not sure that this approach (or my original) is general enough. Here's a simple loop and the corresponding IR as generated by clang:
>> Before deciding what exactly we need to do during code generation, I would appreciate it if we could write down the exact semantics of #pragma ivdep at the C level. We should get a list of interesting test cases and investigate what other compilers such as gcc and icc do in such cases. Pekka and I gave a couple of test cases that may be a good start.
> I haven't been able to find a detailed description of #pragma ivdep semantics other than "ignore assumed vector dependencies" which is pretty vague.

OK. That's the same as what I found.

In the absence of good documentation, it seems we need to write this 
documentation ourselves. What about starting from very conservative 
examples, investigating what gcc and icc do for them, and following 
their behavior?

> I think the most basic semantic requirement at the C level is that the loop:
> for (i...) {
>    A;
>    B;
>    C;
>    ...
> }
> may be distributed as (without reordering statements):
> for (i...) A;
> for (i...) B;
> for (i...) C;
> for (i...) ...
> to produce the same result.

I do not get this example. Can A, B, ... be arbitrary statements?
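
To illustrate why the question matters, here is a minimal sketch in plain C (no pragma involved) of two loop bodies: one where distributing the loop changes the result, and one where it does not. All function and array names are hypothetical and chosen only for this illustration; nothing here is taken from the patch.

```c
#include <assert.h>
#include <string.h>

#define N 8

/* Fused loop with a dependence cycle: A reads what B wrote one
   iteration earlier, and B reads what A wrote in this iteration. */
static void cycle_fused(int *a, int *b) {
    for (int i = 1; i < N; i++) {
        a[i] = b[i - 1] + 1;  /* A */
        b[i] = a[i] * 2;      /* B */
    }
}

/* Same statements after distribution, statement order A, B kept. */
static void cycle_distributed(int *a, int *b) {
    for (int i = 1; i < N; i++) a[i] = b[i - 1] + 1;  /* all of A */
    for (int i = 1; i < N; i++) b[i] = a[i] * 2;      /* all of B */
}

/* Fused loop with only a forward dependence: B reads what A wrote
   in an earlier iteration, and A never reads anything B wrote. */
static void forward_fused(int *a, int *b) {
    for (int i = 1; i < N; i++) {
        a[i] = i + 1;         /* A */
        b[i] = a[i - 1] * 2;  /* B */
    }
}

static void forward_distributed(int *a, int *b) {
    for (int i = 1; i < N; i++) a[i] = i + 1;         /* all of A */
    for (int i = 1; i < N; i++) b[i] = a[i - 1] * 2;  /* all of B */
}

/* Each helper returns 1 if fused and distributed agree, else 0. */
static int cycle_results_match(void) {
    int a1[N] = {0}, b1[N] = {0}, a2[N] = {0}, b2[N] = {0};
    cycle_fused(a1, b1);
    cycle_distributed(a2, b2);
    return memcmp(a1, a2, sizeof a1) == 0 && memcmp(b1, b2, sizeof b1) == 0;
}

static int forward_results_match(void) {
    int a1[N] = {0}, b1[N] = {0}, a2[N] = {0}, b2[N] = {0};
    forward_fused(a1, b1);
    forward_distributed(a2, b2);
    return memcmp(a1, a2, sizeof a1) == 0 && memcmp(b1, b2, sizeof b1) == 0;
}

/* The cycle case must differ; the forward-only case must agree. */
static void self_check(void) {
    assert(!cycle_results_match());
    assert(forward_results_match());
}
```

Calling self_check() from a main() exercises both cases. If the distribution requirement is meant as stated, a loop like the first one could not legally carry the pragma, so A, B, ... could not be arbitrary statements.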

> Regarding the case:
> #pragma ivdep
> for (i = 0; i < 100; i++) {
>    long t[100];
>    t[0] = i + 2;
>    A[i] = A[i+b] + t[0];
> }
> ICC is supposed to privatize t. I'm currently investigating how to handle this during codegen--how to decide when to add the metadata and when not to.

What about not adding metadata unless the loads/stores are explicit 
array references in the C code for which it is trivial to prove that the 
referenced arrays were allocated outside of the loop (globals, function 
parameters, stack-allocated arrays)?

This is obviously very conservative, but it seems to be correct. We 
could commit this support and start to gradually extend this feature.
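
As a rough sketch of how this conservative rule might classify accesses, consider the following variation of the loop quoted above. The names G, P, S and kernel are hypothetical, the pragma is shown only as a comment so the code compiles everywhere, and the "annotated" remarks reflect one reading of the proposal, not what any compiler currently does.

```c
#include <assert.h>

#define N 100

long G[2 * N];                    /* global array */

/* Assumes 0 <= b <= N so that G[i + b] stays in bounds. */
void kernel(long *P, int b) {     /* P: function parameter */
    assert(b >= 0 && b <= N);

    long S[N];                    /* stack array declared outside the loop */
    for (int i = 0; i < N; i++)
        S[i] = i;

    /* #pragma ivdep would be placed here */
    for (int i = 0; i < N; i++) {
        long t[N];                /* declared inside the loop: no metadata */
        t[0] = i + 2;
        G[i] = G[i + b] + t[0];   /* global: would be annotated */
        P[i] = S[i] + t[0];       /* parameter, outer stack array: annotated */
    }
}
```

Under this reading, the accesses to t are simply left alone, so the question of privatizing t does not have to be answered in the first version of the patch.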

