[PATCH] D50480: [LV] Vectorizing loops of arbitrary trip count without remainder under opt for size

Tue Aug 14 16:25:57 PDT 2018

hsaito added a comment.

In https://reviews.llvm.org/D50480#1199900, @reames wrote:

> I have a general question about direction, not specific to this patch.
>
> It seems like we're adding a specific form of predication to the vectorizer in this patch and I know we already have support for various predicated load and store idioms.  What are our plans in terms of supporting more general predication?  For instance, I don't believe we handle loops like the following at the moment:
>   for (int i = 0; i < N; i++) {
>     if (unlikely(i > M))
>       break;
>     sum += a[i];
>   }
>
> Can the infrastructure in this patch be generalized to handle such cases?  And if so, are there any specific plans to do so?

The short answer is no.

From the vectorizer's perspective, the mechanics are quite different. In the Intel compiler (ICC) 18.0, we implemented "#pragma omp simd early_exit" to handle this situation in a somewhat more general manner. Hopefully, the syntax will be standardized in the future and more compilers will implement it. There are two ways to think about it:

1) If the vector condition is not all false (i.e., the break is taken for some element), take the break and let scalar code do the unfinished work.
2) If the vector condition is not all false (i.e., the break is taken for some element), let vector code do the unfinished work and then break.

ICC's simd early_exit implements the latter. Either way, it's best not to think along the lines of this (rather simple) patch. Please note that even the determination of the exit condition often involves speculation, and the compiler somehow needs to ensure that such speculation is safe (or let the programmer assert it, as with ICC's "simd early_exit"). A simple "if (A[i]>0) break", for example, involves speculation in the vector load of A[i].
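
To make the two options concrete, here is a minimal scalar emulation in C. It is only a sketch under stated assumptions, not ICC's or LLVM's implementation: the vectorization factor VF = 4, the helper exit_mask, and all function names are hypothetical, and the inner lane loops stand in for real vector instructions. The sketch also assumes that every a[0 .. n-1] is readable, which is exactly the speculation-safety guarantee the compiler would have to prove or the programmer would have to assert.

```c
#include <assert.h>
#include <stddef.h>

enum { VF = 4 }; /* hypothetical vectorization factor */

/* Scalar reference loop: stop at the first positive element. */
int sum_scalar(const int *a, size_t n) {
    assert(a != NULL || n == 0);
    int sum = 0;
    for (size_t i = 0; i < n; i++) {
        if (a[i] > 0)
            break;
        sum += a[i];
    }
    return sum;
}

/* Bit l is set when lane l would take the break. All VF lanes are
   loaded before any exit is known: this is the speculative load.
   Assumed safe here because a[i .. i+VF-1] lies inside a[0 .. n-1]. */
static unsigned exit_mask(const int *a, size_t i) {
    unsigned mask = 0;
    for (int l = 0; l < VF; l++)
        if (a[i + l] > 0)
            mask |= 1u << l;
    return mask;
}

/* Option 1: if any lane exits, leave the vector loop and let scalar
   code redo the current chunk and whatever follows. */
int sum_exit_to_scalar(const int *a, size_t n) {
    assert(a != NULL || n == 0);
    int sum = 0;
    size_t i = 0;
    for (; i + VF <= n; i += VF) {
        if (exit_mask(a, i) != 0)
            break;                     /* hand over to scalar code */
        for (int l = 0; l < VF; l++)   /* stands in for vector add + reduce */
            sum += a[i + l];
    }
    for (; i < n; i++) {               /* scalar: unfinished chunk or tail */
        if (a[i] > 0)
            break;
        sum += a[i];
    }
    return sum;
}

/* Option 2: if any lane exits, finish the lanes before the first exit
   under a mask in "vector" code, then break. */
int sum_finish_in_vector(const int *a, size_t n) {
    assert(a != NULL || n == 0);
    int sum = 0;
    size_t i = 0;
    for (; i + VF <= n; i += VF) {
        unsigned mask = exit_mask(a, i);
        /* Lanes strictly below the lowest exiting lane stay active. */
        unsigned active = mask ? (mask & (0u - mask)) - 1u
                               : (1u << VF) - 1u;
        for (int l = 0; l < VF; l++)   /* stands in for a masked add */
            if (active & (1u << l))
                sum += a[i + l];
        if (mask != 0)
            return sum;                /* the break, taken after the masked work */
    }
    for (; i < n; i++) {               /* scalar tail when n is not a multiple of VF */
        if (a[i] > 0)
            break;
        sum += a[i];
    }
    return sum;
}
```

Both variants must agree with sum_scalar for every input. The difference is only where the partial chunk is finished: in scalar code for option 1, or under a mask in the vector code for option 2.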

Having said that, making VPlan more powerful (like adding a new IF) certainly helps lead toward the ability to model the early_exit situation within VPlan eventually. From that perspective, it's a baby step forward.

From our perspective, bringing OpenMP 4.5 functionality to LLVM is a higher priority than bringing the early_exit extension. If anyone wants to work on simd early_exit in LLVM, we are more than happy to share what we have learned. Please let us know.

> Secondly, are there any plans to enable this approach for anything other than optsize?

If someone has a brilliantly fast masked vector execution unit, that would be a possibility. As a vectorizer person, I would consider that a dream come true: smaller code, faster compile, and faster execution. Looking forward to hearing such great news.
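
For readers unfamiliar with the trade-off, the general idea of running an arbitrary trip count without a remainder loop can be sketched as a single loop in which every chunk executes under a lane mask. This is a scalar emulation of the concept only, not the patch's actual implementation; VF = 4 and the function name are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

enum { VF = 4 }; /* hypothetical vectorization factor */

/* Sums a[0 .. n-1] for any n with one masked loop and no remainder loop. */
int sum_tail_folded(const int *a, size_t n) {
    assert(a != NULL || n == 0);
    int sum = 0;
    for (size_t i = 0; i < n; i += VF) {
        for (size_t l = 0; l < (size_t)VF; l++) {
            int active = (i + l) < n;  /* lane mask: is i + l a real iteration? */
            if (active)                /* stands in for a masked load and add */
                sum += a[i + l];
        }
    }
    return sum;
}
```

The code is smaller because there is only one loop body, but every chunk, not just the last one, pays for computing and applying the mask. That cost is why the speed of the masked execution unit matters.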

Repository:
  rL LLVM

https://reviews.llvm.org/D50480