[LLVMdev] parallel loop metadata simplification

Tobias Grosser tobias at grosser.es
Sun Mar 3 08:43:10 PST 2013


On 03/03/2013 03:34 PM, Pekka Jääskeläinen wrote:
> On 03/03/2013 02:34 PM, Tobias Grosser wrote:
>> Meaning they are due to an array or pointer access.
>
> What about loop-scope arrays?
>
> void foo(long *A, long b) {
>      long i;
>
>      #pragma ivdep
>      for (i = 0; i < 100; i++) {
>          long t[100];
>          t[0] = i + 2;
>          A[i] = A[i+b] + t[0];
>      }
> }
>
> Clang places the alloca for t to the entry block, creating
> a new race condition.

Very good example, indeed. Is there a formal definition of what
#pragma ivdep means? I see two options here:

1) No memory based dependences at all

We assume all t[*] allocations point to the same memory
location. By specifying #pragma ivdep the user states that the
t[*] array causes no memory dependences. In this case
the example above would be an invalid use of '#pragma ivdep'.

The right thing here would be to annotate all loads/stores with the
llvm.mem.* metadata.

2) Memory based dependences for private arrays allowed

We assume there will be different instances of t[*], hence there
cannot be memory dependences along the t dimension. This is a valid use
of '#pragma ivdep' and the compiler can only vectorize the loop if it
really creates different instances of t.

In this case, we can only annotate loads/stores to t[*] if we ensure
each iteration of i will access a different array t[*]. If there is just
a single memory location for t[*] that is shared by all loop
iterations, we must not annotate loads/stores to t[*].
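
To make the difference between a shared and a per-iteration t concrete,
here is a small sketch in plain C. It is not what clang or a vectorizer
actually emits; it only models WIDTH iterations running in lockstep. The
names run_sequential, run_shared_t, run_private_t, N and WIDTH are made up
for illustration, and the A[i+b] access from the example is left out.

```c
#include <assert.h>
#include <stddef.h>

enum { N = 8, WIDTH = 4 }; /* hypothetical trip count and vector width */

/* Sequential reference: t is written and read within one iteration. */
void run_sequential(long *out) {
    assert(out != NULL);
    for (long i = 0; i < N; i++) {
        long t[1];
        t[0] = i + 2;
        out[i] = t[0];
    }
}

/* Models WIDTH iterations in lockstep that share ONE instance of t,
 * which is what a single alloca in the entry block provides. */
void run_shared_t(long *out) {
    assert(out != NULL);
    long t[1];
    for (long base = 0; base < N; base += WIDTH) {
        for (long lane = 0; lane < WIDTH; lane++)
            t[0] = base + lane + 2;      /* all lanes store; last lane wins */
        for (long lane = 0; lane < WIDTH; lane++)
            out[base + lane] = t[0];     /* all lanes load the same value */
    }
}

/* Same lockstep model, but every lane gets its own instance of t. */
void run_private_t(long *out) {
    assert(out != NULL);
    for (long base = 0; base < N; base += WIDTH) {
        long t[WIDTH][1];
        for (long lane = 0; lane < WIDTH; lane++)
            t[lane][0] = base + lane + 2;
        for (long lane = 0; lane < WIDTH; lane++)
            out[base + lane] = t[lane][0];
    }
}
```

With N = 8 and WIDTH = 4, run_private_t matches run_sequential, whereas
run_shared_t writes 5 into the first four elements where the sequential
loop writes 2, 3, 4, 5. That is the race a shared t introduces once the
accesses to it are treated as dependence-free.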

> In your example where you moved t outside the loop
> it's a programmer's mistake (icc might vectorize it but the
> results are undefined due to the dependency).

Are you sure about this? How did you come to this conclusion? Is there
some icc documentation? I am very unsure about the semantics of
#pragma ivdep. Your interpretation makes sense, but I could also imagine
that a compiler is expected to always resolve / understand dependences on
scalar variables. Do we have any example where a compiler miscompiles
code due to scalar dependences that it ignored after #pragma ivdep was
added?

> but here I don't
> think it is. The t array is supposed to be a loop-private variable,
> and each parallel iteration refer to their own isolated instance.

Again, I can follow this intuition. However, it would be good to
formally document the behavior (and to understand and choose a behavior
according to how other compilers interpret #pragma ivdep). Also, if we
follow your interpretation and clang currently does not make the t
array loop-private, it would be incorrect to attach metadata to loads
and stores that reference the t array. This last point actually makes me
think your interpretation may be difficult to implement. Is it possible
in all cases to determine whether a memory access refers to a
loop-private array?
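
As one case where I would expect this to be hard, consider the sketch
below. The function sum_via_pointer and its parameters are hypothetical,
made up only for illustration. Whether the access through p touches the
loop-private t or the shared location is known only at run time.

```c
#include <assert.h>
#include <stddef.h>

/* The accesses through p refer to the loop-private t or to the
 * caller-provided location, depending on a run-time flag. */
long sum_via_pointer(long *shared, int use_shared) {
    assert(shared != NULL);
    long acc = 0;
    for (long i = 0; i < 4; i++) {
        long t[1];                          /* loop-private array */
        long *p = use_shared ? shared : t;  /* target unknown statically */
        p[0] = i + 2;
        acc += p[0];
    }
    return acc;
}
```

Executed sequentially, both variants return the same sum. Only when
use_shared is zero could the store and load through p be treated as
private to the iteration, so a frontend would presumably have to be
conservative about annotating them.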

Cheers,
Tobi




