[llvm-dev] LoopVectorizer: Should the cost-model be used for legalisation?

Mon Jun 14 08:05:36 PDT 2021

On 2021-06-14 3:14 p.m., Sander De Smalen wrote:
> Hi Vineet,
>
>> IIUC, for option 1, unless the legalizer and the cost-model follow the same logic to determine the feasibility of scalable-vectorization, it seems inaccurate to assert that if something is legal it must have a valid cost.
> I actually consider that one of the main benefits, because this way the legaliser and cost-model keep each other in check. I've already identified several gaps in the Loop Vectorizer and in the CostModel because of this, which we wouldn't as easily have found if the LV ignored the VFs due to Invalid costs. We're now forced to fix these issues early, instead of going through each loop and trying to understand if the cost is Invalid because that was a case we purposely wanted to avoid vectorising, or whether it's a genuine gap in the cost-model/LV that we never got around to fix.
While I agree with your point from an implementation perspective, my 
comment was more from a theoretical POV with an assumption that the 
cost-model and legalizer are already correct.
>> But if that assertion is relaxed, then as you mentioned in another reply, it would be a  de facto choice for option 2. If we don't relax this assertion then for it to be accurate legalizer and cost-model will both do redundant work.
> While there needs to be agreement between both models on what is legal/valid, at runtime no redundant work is being done. If the legalisation phase discards scalable auto-vectorisation as a legal/feasible option, then it will not pursue to cost those VFs (and so work done upfront prevents redundant work done later).

Perhaps, the assertion we want in the legalizer is that if an 
instruction is illegal then it cannot have a valid cost. A legal 
instruction might still have an invalid/prohibitively expensive cost. 
This way we get a model where the legalizer does a quick weed-out of 
illegal instructions and cost-model can do a more fine-grained analysis.

When you say "at runtime no redundant work is being done", does it mean 
that at the implementation level there might be some duplication of logic?

>> Also, IIRC, an invalid cost also models a too-expensive-to-be-useful operation but that doesn't really imply illegal operation.
> I'd say that is a general misconception about the Invalid state of InstructionCost, which has persisted a bit since the early stages when it wasn't yet clear how to interpret the meaning of Invalid. The state Invalid means "cannot possibly handle this operation". If it would have been possible to code-generate (albeit expensive), the cost-model should use a high-but-valid cost instead. I agree this isn't well described in the class' DoxyGen comments, so I'll create a patch to clarify its wording.
Looking at the code, the `InstructionCost` class does not yet model a 
separate prohibitively-high-but-valid cost, right? IIRC, some places use 
a hard-coded value to represent a high cost.
>> I am not sure if this makes sense, but with option 2, since VPlans are not discarded upfront there is a chance that a transofrmation on VPlan might actually make it feasible.
> That's an interesting point, David Green mentioned VPlan-to-Vplan transforms in this context as well recently. Do you have any examples of illegal VPlans that could be transformed into legal VPlans? It may be a lack of imagination, but I can't quickly think of any transformations that would ever change an illegal VPlan with scalable vectors into a valid VPlan.
Hmmm, to be honest I can't really imagine a concrete example here. Just 
thinking out loud here, wondering if there be a case where a 
non-predicated instruction is invalid but a VP version of it is valid, 
in which case the VPlan can transform into using a VP version. May be a 
far-fetched example would be a VP instruction with a fixed EVL that 
makes it feasible to compute its cost.
>
> Thanks,
>
> Sander
>
>> On 11 Jun 2021, at 19:36, Vineet Kumar <vineet.kumar at bsc.es> wrote:
>>
>> Hi,
>>
>> IIUC, for option 1, unless the legalizer and the cost-model follow the same logic to determine the feasibility of scalable-vectorization, it seems inaccurate to assert that if something is legal it must have a valid cost. But if that assertion is relaxed, then as you mentioned in another reply, it would be a  de facto choice for option 2. If we don't relax this assertion then for it to be accurate legalizer and cost-model will both do redundant work. Also, IIRC, an invalid cost also models a too-expensive-to-be-useful operation but that doesn't really imply illegal operation.
>>
>> I am not sure if this makes sense, but with option 2, since VPlans are not discarded upfront there is a chance that a transofrmation on VPlan might actually make it feasible.
>>
>> I didn't think deeply about this so I might be missing some obvious and key facts. Please correct me if I am wrong.
>>
>>
>> Best,
>>
>> Vineet
>>
>>
>> On 2021-06-10 10:50 p.m., Sander De Smalen via llvm-dev wrote:
>>> Hi,
>>>
>>> Last year we added the InstructionCost class which adds the ability to
>>> represent that an operation cannot be costed, i.e. operations that cannot
>>> be expanded by the code-generator will have an invalid cost.
>>>
>>> We started using this information in the Loop Vectorizer for scalable
>>> auto-vectorization. The LV has a legality- and a cost-model stage, which are
>>> conceptually separate concepts with different purposes. But with the
>>> introduction of having valid/invalid costs it's more inviting to use the
>>> cost-model as 'legalisation', which leads us to the following question:
>>>
>>>     Should we be using the cost-model to do legalisation?
>>>
>>> 'Legalisation' in this context means asking the question beforehand if the
>>> code-generator can handle the IR emitted from the LV. Examples of
>>> operations that need such legalisation are predicated divides (at least
>>> until we can use the llvm.vp intrinsics), or intrinsic calls that have no
>>> scalable-vector equivalent. For fixed-width vectors this legalisation issue
>>> is mostly moot, since operations on fixed-width vectors can be scalarised.
>>> For scalable vectors this is neither supported nor feasible [1].
>>>
>>> This means there's the option to do one of two things:
>>> [Option 1]
>>>
>>> Add checks to the LV legalisation to see if scalable-vectorisation is
>>> feasible. If so, assert the cost must be valid. Otherwise discard scalable
>>> VFs as possible candidates.
>>>   * This has the benefit that the compiler can avoid
>>>     calculating/considering VPlans that we know cannot be costed.
>>>   * Legalisation and cost-model keep each other in check. If something
>>>     cannot be costed then either the cost-model or legalisation was
>>>     incomplete.
>>>
>>>
>>> [Option 2]
>>>
>>> Leave the question about legalisation to the CostModel, i.e. if the
>>> CostModel says that <operation> for `VF=vscale x N` is Invalid, then avoid
>>> selecting that VF.
>>>   * This has the benefit that we don't need to do work up-front to
>>>     discard scalable VFs, keeping the LV design simpler.
>>>   * This makes gaps in the cost-model more difficult to spot.
>>>
>>> Note that it's not useful to combine Option 1 and Option 2, because having
>>> two ways to choose from takes away the need to do legalisation beforehand,
>>> and so that's basically a choice for Option 2.
>>>
>>> Both approaches lead to the same end-result, but we currently have a few
>>> patches in flight that have taken Option 1, and this led to some questions
>>> about the approach from both Florian and David Green. So we're looking to
>>> reach to a consensus and decision on what way to move forward.
>>>
>>> I've tentatively added this as a topic to the agenda of the upcoming LLVM
>>> SVE/Scalable Vector Sync-up meeting next Tuesday (June 15th, [2]) as an
>>> opportunity to discuss this more freely if we can get enough people who
>>> actively work on the LV together in that meeting (like Florian and David,
>>> although please forward to anyone else who might have input on this).
>>>
>>> Thanks,
>>>
>>> Sander
>>>
>>>
>>> [1] Expanding the vector operation into a scalarisation loop is currently
>>>      not supported. It could be done, but we have done extensive
>>>      experimentation with loops that handle each element of a scalable
>>>      vector sequentially, but this has never proved beneficial, even when
>>>      using special instructions to efficiently increment the predicate
>>>      vector. I doubt this will be any different for other scalable vector
>>>      architectures, because of the loop control overhead. Also the
>>>      insertion/extraction of elements from a scalable vector is unlikely to
>>>      be as cheap as for fixed-width vectors.
>>>
>>> [2] https://docs.google.com/document/d/1UPH2Hzou5RgGT8XfO39OmVXKEibWPfdYLELSaHr3xzo/edit?usp=sharing
>>>
>>> _______________________________________________
>>> LLVM Developers mailing list
>>> llvm-dev at lists.llvm.org
>>> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>> http://bsc.es/disclaimer