[PATCH] D48193: [LoopVectorizer] Use an interleave count of 1 when using a vector library call

Thu Jun 14 15:00:31 PDT 2018

rob.lougher added a comment.

In https://reviews.llvm.org/D48193#1132940, @dcaballe wrote:

> Hi Robert,
>
> thanks for bringing this up! This approach is blindly setting the interleave factor to 1 when there are vector math function calls. I have the following questions/comments:

Thanks for responding.  Your questions are ones which I considered while doing the patch, so I suspected I would be asked them...

> 1. Maybe I'm missing something but, wouldn't the same problem happen when the function calls are scalar or for any arbitrary function call (not necessarily math functions)? Why should we do this for vector math function calls only?

Legalization doesn't allow arbitrary function calls (the call must be an intrinsic or a library call).  But yes, a vector call may be scalarized, or a widened intrinsic may be lowered back to scalar library calls.  In those cases, however, we're still going to be spilling/reloading all over the place even with an IC of 1.  The point of doing it for vector library calls is that we currently generate poor code for them, and we can fix that with a simple change.
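
For concreteness, a loop of the kind under discussion might look like the sketch below. It is an illustrative example only, not taken from the patch or its tests, and the name `apply_sin` is hypothetical. Assuming a vector library is made available to the vectorizer (for example via clang's `-fveclib` option), the call may be widened to a vector library routine. With an IC of 2, each vector iteration would then contain two such calls, and on ABIs where vector registers are caller-saved, any vector value live across one of the calls has to be spilled and reloaded.

```cpp
#include <cmath>

// Hypothetical example loop: one math library call per element.
// When vectorized with a vector library, the scalar call may be replaced
// by a call to a vector routine that processes VF elements at a time.
void apply_sin(float *a, int n) {
  for (int i = 0; i < n; ++i)
    a[i] = std::sin(a[i]);
}
```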

> 2. I'm concerned about this change introducing performance regressions. For example, imagine a loop body where the total gain of interleaving overcomes the penalty of the register spilling caused by the function call. Wouldn't it be better to properly model this particular register spilling penalty in the context of function calls instead of blindly disabling interleaving for those cases?

From what I can see the loop vectorizer is conservative, and its model of the target is very basic (see the register usage calculation, and assumptions such as the number of load/store ports being the max interleave count).  Trying to add ABI considerations and spill cost calculation at the level of the loop vectorizer will be difficult.  In the case of vector library calls we can clearly see a codegen issue, and setting IC=1 in this case is conservative.
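
The policy described above can be sketched as a small stand-alone model. This is a minimal sketch under stated assumptions, not the code in the patch and not the loop vectorizer's actual API: `ToyInst`, `chooseInterleaveCount`, `HeuristicIC` and `MaxIC` are all hypothetical names introduced for illustration.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical, simplified stand-in for an instruction in the loop body.
struct ToyInst {
  // True if widening this instruction would produce a vector library call.
  bool IsVectorLibCall;
};

// Hypothetical stand-in for the interleave-count decision.
// HeuristicIC is whatever the existing heuristics would have chosen;
// MaxIC is the target's maximum interleave count.
unsigned chooseInterleaveCount(const std::vector<ToyInst> &Body,
                               unsigned HeuristicIC, unsigned MaxIC) {
  bool HasVecLibCall =
      std::any_of(Body.begin(), Body.end(),
                  [](const ToyInst &I) { return I.IsVectorLibCall; });

  // Conservative choice: never interleave across a vector library call.
  if (HasVecLibCall)
    return 1;

  // Otherwise keep the heuristic result, clamped to the range [1, MaxIC].
  return std::max(1u, std::min(HeuristicIC, MaxIC));
}
```

The check is deliberately all-or-nothing: it makes no attempt to weigh the spill cost against the benefit of interleaving, which is the trade-off raised in the question.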

> Thanks,
> Diego

Repository:
  rL LLVM

https://reviews.llvm.org/D48193