[PATCH] D48193: [LoopVectorizer] Use an interleave count of 1 when using a vector library call

Thu Jun 14 15:30:23 PDT 2018

rob.lougher added a comment.

In https://reviews.llvm.org/D48193#1133031, @rob.lougher wrote:

> In https://reviews.llvm.org/D48193#1132940, @dcaballe wrote:
>
> > Hi Robert,
>

> > 1. I'm concerned about this change introducing performance regressions. For example, imagine a loop body where the total gain of interleaving overcomes the penalty of the register spilling caused by the function call. Wouldn't it be better to properly model this particular register spilling penalty in the context of function calls instead of blindly disabling interleaving for those cases?
> 
> From what I can see the loop vectorizer is conservative, and its model of the target is very basic (see the register usage calculation, and assumptions such as the number of load/store ports being the max interleave count).  Trying to add ABI considerations and spill cost calculation at the level of the loop vectorizer will be difficult.  In the case of vector library calls we can clearly see a codegen issue, and setting IC=1 in this case is conservative.

I forgot to say that in the case of loops without reductions, only small loops are interleaved (i.e. loops with a cost of less than 20).  For reference, the loop above has a cost of 12, if I remember correctly (I'm not at work now): a call cost of 10, plus 1 each for the store and the load.  So it's unlikely that a small loop could overcome the cost of spilling, as there's not a lot of room left for extra instructions.
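
To make the arithmetic concrete, here is a rough sketch of the numbers above. It is a toy model and not the vectorizer's actual cost code: the names kSmallLoopCostThreshold, loopCost, isSmallLoop and extraCostHeadroom are hypothetical, and the per-instruction costs (10, 1, 1) and the threshold of 20 are simply the values quoted in this comment.

```cpp
#include <numeric>
#include <vector>

// Assumption: the "small loop" threshold quoted in the comment (cost < 20).
constexpr unsigned kSmallLoopCostThreshold = 20;

// Toy model: the loop cost is the sum of the per-instruction costs.
inline unsigned loopCost(const std::vector<unsigned> &instCosts) {
  return std::accumulate(instCosts.begin(), instCosts.end(), 0u);
}

// A loop without reductions is only considered for interleaving if it is small.
inline bool isSmallLoop(unsigned cost) {
  return cost < kSmallLoopCostThreshold;
}

// Largest extra cost that can be added while the loop still counts as small.
inline unsigned extraCostHeadroom(unsigned cost) {
  return isSmallLoop(cost) ? kSmallLoopCostThreshold - 1 - cost : 0;
}
```

With the costs recalled above, loopCost({10u, 1u, 1u}) is 12, so the loop counts as small, but only 7 more units of cost fit under the threshold. That is the "not a lot of room" in question: any extra work that might offset the spills has to fit in that headroom.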

Thanks again for the questions...

>> Thanks,
>> Diego

Repository:
  rL LLVM

https://reviews.llvm.org/D48193