[llvm-dev] [LLVM] (RFC) Addition/Support of new Vectorization Pragmas in LLVM

Mon Aug 19 12:33:00 PDT 2019

I think some of the semantics could be implemented using the
"llvm.mem.parallel_loop_access" annotation we already have, modulo the
difficulties mentioned below.

On Thu, Aug 15, 2019 at 15:06, Terry Greyzck via llvm-dev
<llvm-dev at lists.llvm.org> wrote:
>    * Primarily ivdep allows ambiguous dependencies to be ignored, examples:
>        *  p[i] = q[j]
>        *  a[ix[i]] = b[iy[i]]
>        *  a[ix[i]] += 1.0

"ambiguous dependencies" is very vague. Does it mean the compiler has
to do some analysis to detect non-ambiguous dependencies?
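
One possible reading (an assumption on my part, not something the
proposal states) is that a dependence is "ambiguous" when its existence
depends on run-time data. The C sketch below models this for
a[ix[i]] += 1.0: whether iterations conflict depends on whether ix
contains duplicates. The function names and the width VF are
hypothetical, and the "SIMD model" is plain scalar code that only
mimics a gather/add/scatter sequence.

```c
#include <assert.h>
#include <stddef.h>

#define VF 4 /* hypothetical vector width, for illustration only */

/* Reference semantics: iterations execute strictly in order. */
static void scatter_inc_scalar(double *a, const int *ix, size_t n)
{
    for (size_t i = 0; i < n; ++i)
        a[ix[i]] += 1.0;
}

/* Scalar model of a VF-wide vectorized loop: gather all lanes,
 * add, then scatter all lanes. n must be a multiple of VF. */
static void scatter_inc_simd_model(double *a, const int *ix, size_t n)
{
    assert(n % VF == 0);
    for (size_t i = 0; i < n; i += VF) {
        double lane[VF];
        for (size_t l = 0; l < VF; ++l)
            lane[l] = a[ix[i + l]];     /* gather */
        for (size_t l = 0; l < VF; ++l)
            lane[l] += 1.0;             /* vector add */
        for (size_t l = 0; l < VF; ++l)
            a[ix[i + l]] = lane[l];     /* scatter */
    }
}
```

With ix = {0, 0, 1, 1} the scalar loop leaves a[0] == 2.0 while the
model leaves a[0] == 1.0; with distinct indices both agree. No
compile-time analysis can tell these two cases apart.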

When using "llvm.mem.parallel_loop_access", this would mean the
front-end would have to detect which accesses are non-ambiguous and
not annotate them. However, the annotation applies to single accesses,
not to dependencies. Both "p[i]" and "q[j]" look non-ambiguous
individually, but the vectorizer would have to add a runtime check and
loop versioning to ensure that they do not alias.
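
To make the runtime check and loop versioning concrete, here is a
minimal hand-written C sketch of the shape of code involved. It is not
what the LoopVectorizer actually emits; the function names are
hypothetical, and I simplified p[i] = q[j] to p[i] = q[i] so that the
accessed ranges are easy to state.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if the n-element ranges starting at p and q do not overlap.
 * Comparing addresses of unrelated objects via uintptr_t is
 * implementation-defined, but works on common flat-memory targets. */
static int ranges_disjoint(const double *p, const double *q, size_t n)
{
    uintptr_t pb = (uintptr_t)p, qb = (uintptr_t)q;
    uintptr_t bytes = (uintptr_t)(n * sizeof(double));
    return pb + bytes <= qb || qb + bytes <= pb;
}

/* Two versions of the same loop, selected by a runtime alias check.
 * Returns 1 if the vectorizable version ran, 0 for the fallback. */
static int copy_versioned(double *p, const double *q, size_t n)
{
    assert(p != NULL && q != NULL);
    if (ranges_disjoint(p, q, n)) {
        /* No iteration can read what another one writes, so this
         * copy of the loop is safe to vectorize. */
        for (size_t i = 0; i < n; ++i)
            p[i] = q[i];
        return 1;
    }
    /* Ranges overlap: keep the original in-order scalar loop. */
    for (size_t i = 0; i < n; ++i)
        p[i] = q[i];
    return 0;
}
```

With ivdep the user asserts that the check is unnecessary, but that
assertion is about the pair of accesses, which is exactly what a
per-access annotation cannot express.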

>    * ivdep still requires automatic detection of reductions, including
>      multiple homogeneous reductions on a single variable, examples:
>        *  x = x + a[i]
>        *  x = x + a[i]; if ( c[i] > 0.0 ) { x = x + b[i] }

We could omit the "llvm.mem.parallel_loop_access" annotation on the
LoadInst and StoreInst of the reduction variable, assuming detected
reductions are limited to scalar variables. However, mem2reg/sroa
would remove those memory accesses anyway, including their
annotations, requiring the LoopVectorizer to detect that the resulting
PHINode is a reduction. Mem2reg/sroa/LICM would do the same to
non-reductions and to array elements that are promoted to registers
for the duration of the loop, with the result that the loop would not
be vectorizable.
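
As a rough C-level analogy (the real transformation happens on LLVM IR,
and the function names here are hypothetical), the two forms of the
reduction x = x + a[i] look like this. In the first, x is a memory
location whose load and store could carry an annotation. In the second,
which is roughly what promotion leaves behind, there is no memory
access left to annotate and the accumulator becomes a loop-carried
value.

```c
#include <assert.h>
#include <stddef.h>

/* Before promotion: every iteration loads and stores *x. */
static void sum_in_memory(double *x, const double *a, size_t n)
{
    assert(x != NULL);
    for (size_t i = 0; i < n; ++i)
        *x = *x + a[i];
}

/* After promotion: acc lives in a register; in IR it would be a
 * PHINode in the loop header that the vectorizer has to recognize
 * as a reduction by itself. */
static double sum_in_register(double x, const double *a, size_t n)
{
    double acc = x;
    for (size_t i = 0; i < n; ++i)
        acc = acc + a[i];
    return acc;
}
```

Both functions compute the same value; only the second matches what
the LoopVectorizer would typically see after the earlier passes ran.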

Michael