[PATCH] D98512: [LoopVectorize] Simplify scalar cost calculation in getInstructionCost

Thu Mar 18 00:46:17 PDT 2021

fhahn added a comment.

In D98512#2626004 <https://reviews.llvm.org/D98512#2626004>, @david-arm wrote:

> In D98512#2625736 <https://reviews.llvm.org/D98512#2625736>, @fhahn wrote:
>
>> In D98512#2625701 <https://reviews.llvm.org/D98512#2625701>, @david-arm wrote:
>>
>>> It looks like the only change is the IV update which goes from 4 -> 1. I think the adds you mentioned are included as part of the load/store cost.
>>
>> I am not sure. From a quick look at the code, it seems like the high load/store cost comes mostly from the required scalarization of the stored/loaded values and the cost of the extra increments seems not fully included at least on the IR instruction level. Unfortunately I don't think there's really any different option to write a test that hits the code paths in the patch, even if the scalarized values are used in other places later on. So it's probably not worth worrying too much about it. I still think we should at least have an assertion as mentioned above, to catch the gap if the cost-modeling is improved.
>
> Hi @fhahn  I guess I'm just a bit confused as to what assert you're asking for here to be honest. We do reach the Add case for the scalar IV increment (that stays scalar after vectorisation) that happens in your test case above, where VF > 1. What assert are you proposing here? Something like:
>
>   assert(VF == 1 || !isScalarAfterVectorization(Instruction)); ?
>
> If so, I think that this is guaranteed to fail with your test case. Alternatively, something like this should work I think:
>
>   assert(VF == 1 || !InstsToScalarize.count(Instruction));
>
> since scalarised vector instructions should not get into this function.

Yes I think if we would use ` !isScalarAfterVectorization(Instruction)` in the assertion, we would at least need to also allow VF != 1 for induction updates, but that's going to be a bit hard to check. So checking `InstsToScalarize` is probably the best option for now

> In your test above the other adds that appear near the start of the loop after vectorisation I believe are due to expansion of the GEP during vectorisation, rather than at the cost phase. The cost for the GEP is explicitly 0 in getInstructionCost and the comment says that the cost is calculated as part of the load/store cost. The purpose of this patch isn't really to fix/improve the cost calculation of loads/stores by taking the extra generated adds into account - it's mainly just to simplify the calculation of costs for the few cases where isScalarAfterVectorization returns true.

That should be fine. Could you also add a comment explaining why we do not need to scale if `isScalarAfterVectorization` currently.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D98512/new/

https://reviews.llvm.org/D98512