[LLVMdev] parallel loop metadata simplification
Tobias Grosser
tobias at grosser.es
Sun Mar 3 08:43:10 PST 2013
On 03/03/2013 03:34 PM, Pekka Jääskeläinen wrote:
> On 03/03/2013 02:34 PM, Tobias Grosser wrote:
>> Meaning they are due to an array or pointer access.
>
> What about loop-scope arrays?
>
> void foo(long *A, long b) {
>   long i;
>
>   #pragma ivdep
>   for (i = 0; i < 100; i++) {
>     long t[100];
>     t[0] = i + 2;
>     A[i] = A[i+b] + t[0];
>   }
> }
>
> Clang places the alloca for t to the entry block, creating
> a new race condition.
Very good example, indeed. Is there a formal definition of what
#pragma ivdep means? I see two options here:

1) No memory based dependences at all

   We assume all t[*] allocations point to the same memory
   location. By defining #pragma ivdep the user states that there
   are no memory dependences caused by the t[*] array. In this case
   the example above would be an invalid use of '#pragma ivdep'.
   The right thing here would be to annotate all loads/stores with the
   llvm.mem.* metadata.

2) Memory based dependences for private arrays allowed

   We assume there will be different instances of t[*], hence there
   cannot be memory dependences along the t dimension. This is a valid
   use of '#pragma ivdep' and the compiler can only vectorize the loop
   if it really creates different instances of t.

   In this case, we can only annotate loads/stores to t[*] if we ensure
   each iteration of i will access a different array t[*]. If there is
   just a single memory location for t[*] which is shared by all loop
   iterations, we must not annotate loads/stores to t[*].
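
To make the two readings concrete, here is a minimal sketch in C. The
function names foo_private and foo_shared and the trip-count parameter n
are made up for illustration; the pragma is only shown as a comment so
the code builds with any compiler. foo_shared is a source-level picture
of what a single alloca in the entry block amounts to.

```c
/* Option 2 reading: t is declared in the loop body, so each iteration
   conceptually owns its own instance of t. */
void foo_private(long *A, long b, long n) {
  /* #pragma ivdep would go here */
  for (long i = 0; i < n; i++) {
    long t[100];
    t[0] = i + 2;
    A[i] = A[i + b] + t[0];
  }
}

/* Same computation with t hoisted out of the loop: every iteration
   now reads and writes the same t[0]. */
void foo_shared(long *A, long b, long n) {
  long t[100];
  /* #pragma ivdep would go here */
  for (long i = 0; i < n; i++) {
    t[0] = i + 2;
    A[i] = A[i + b] + t[0];
  }
}
```

Executed sequentially both functions compute the same values. The
difference only shows up once iterations run in parallel: in foo_shared
the store and load of t[0] in different iterations touch one location,
so marking them as dependence-free would only be safe if t is
privatized first.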
> In your example where you moved t outside the loop
> it's a programmer's mistake (icc might vectorize it but the
> results are undefined due to the dependency).
Are you sure about this? How did you come to this conclusion? Is there
some icc documentation? I am very unsure about the semantics of #pragma
ivdep. Your interpretation makes sense, but I could also imagine that a
compiler is expected to always resolve / understand dependences on
scalar variables. Do we have any example where a compiler miscompiles
code due to scalar dependences that it ignored after #pragma ivdep was
added?
> but here I don't
> think it is. The t array is supposed to be a loop-private variable,
> and each parallel iteration refer to their own isolated instance.
Again, I can follow this intuition. However, it would be good to
formally document the behavior (and to understand and choose a behavior
according to how other compilers interpret #pragma ivdep). Also, if we
follow your interpretation and if clang currently does not make the t
array loop-private, it would be incorrect to attach metadata to loads
and stores that reference the t array. This last point makes me actually
think your interpretation may be difficult to implement. Is it possible
in all cases to figure out whether a memory access refers to a
loop-private array?
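
A hypothetical example of why this may be hard (the function bar and its
parameters are invented for illustration, not taken from the discussion
above): the pointer p refers either to the loop-local array t or to the
caller's array B, depending on a run-time flag.

```c
/* p may point to the loop-private t or to the shared B; which one is
   only known at run time. */
void bar(long *A, long *B, long n, int use_local) {
  /* #pragma ivdep would go here */
  for (long i = 0; i < n; i++) {
    long t[4];
    long *p = use_local ? t : B;
    p[0] = i + 2;   /* store to t[0] or to B[0] */
    A[i] = p[0];    /* load from the same location */
  }
}
```

Looking only at the accesses through p, it is not obvious whether they
touch a per-iteration object or a location shared by all iterations, so
a rule that treats loop-private arrays specially would have to decide
what to do with such accesses.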
Cheers,
Tobi