[llvm-dev] LoopVectorizer: Should the cost-model be used for legalisation?

Thu Jun 10 23:35:23 PDT 2021

Please correct me if I am wrong, but I thought this discussion was prompted by a temporary workaround in the cost-model, one that works around current codegen limitations that need fixing.
I am asking because Option 1 is what we currently have, and I don't see a reason to depart from this general idea, even if the cost-model can return Invalid because of a workaround that will hopefully disappear soon. That would mean skipping the assert that legalisation and the cost-model are in sync. While that is not ideal, I don't see it as a big problem or as a total departure from Option 1, especially if this is all temporary.

And does this discussion disappear if the codegen issues are fixed? I don't know the scale of the problem/work, but would it not be easier to fix those issues and avoid this cost-model vs. legalisation discussion altogether?
________________________________
From: llvm-dev <llvm-dev-bounces at lists.llvm.org> on behalf of Sander De Smalen via llvm-dev <llvm-dev at lists.llvm.org>
Sent: 10 June 2021 21:50
To: llvm-dev <llvm-dev at lists.llvm.org>
Subject: [llvm-dev] LoopVectorizer: Should the cost-model be used for legalisation?

Hi,

Last year we added the InstructionCost class which adds the ability to
represent that an operation cannot be costed, i.e. operations that cannot
be expanded by the code-generator will have an invalid cost.
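
A minimal sketch of the idea of a cost that can be invalid. This is a toy
stand-in written for illustration only: `ToyCost` and its members are
hypothetical names, not LLVM's actual InstructionCost interface. The
assumption made here is that summing costs yields an invalid result as soon
as any operand is invalid.

```cpp
// Toy model of a cost value that can also be "invalid".
// Hypothetical type for illustration; not LLVM's InstructionCost.
class ToyCost {
  bool Valid;     // false means the operation cannot be costed
  unsigned Value; // only meaningful when Valid is true

  ToyCost(bool IsValid, unsigned V) : Valid(IsValid), Value(V) {}

public:
  ToyCost(unsigned V) : Valid(true), Value(V) {}

  static ToyCost invalid() { return ToyCost(false, 0); }

  bool isValid() const { return Valid; }
  unsigned value() const { return Value; }

  // Assumed behaviour: one invalid operand makes the whole sum invalid.
  ToyCost operator+(const ToyCost &RHS) const {
    if (!Valid || !RHS.Valid)
      return invalid();
    return ToyCost(Value + RHS.Value);
  }
};
```

With this model, the cost of a whole loop body is invalid if any single
operation in it cannot be costed.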

We started using this information in the Loop Vectorizer for scalable
auto-vectorization. The LV has a legality stage and a cost-model stage, which
are conceptually separate and serve different purposes. But now that costs can
be valid or invalid, it is more tempting to use the cost-model as
'legalisation', which leads us to the following question:

   Should we be using the cost-model to do legalisation?

'Legalisation' in this context means asking the question beforehand if the
code-generator can handle the IR emitted from the LV. Examples of
operations that need such legalisation are predicated divides (at least
until we can use the llvm.vp intrinsics), or intrinsic calls that have no
scalable-vector equivalent. For fixed-width vectors this legalisation issue
is mostly moot, since operations on fixed-width vectors can be scalarised.
For scalable vectors this is neither supported nor feasible [1].

This means there's the option to do one of two things:

[Option 1]

Add checks to the LV legalisation to see if scalable-vectorisation is
feasible. If so, assert the cost must be valid. Otherwise discard scalable
VFs as possible candidates.
 * This has the benefit that the compiler can avoid
   calculating/considering VPlans that we know cannot be costed.
 * Legalisation and cost-model keep each other in check. If something
   cannot be costed then either the cost-model or legalisation was
   incomplete.
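
Option 1 can be sketched roughly as follows. This is a simplified model, not
the LV's actual code: `Op`, `isLegalForScalable` and
`considerScalableOption1` are hypothetical names, and each operation simply
carries the answers that the legality stage and the cost-model would give.

```cpp
#include <cassert>
#include <vector>

// Hypothetical description of one operation in the loop body.
struct Op {
  bool LegalForScalable; // answer given by the legality stage
  bool CostValid;        // whether the cost-model can cost it
  unsigned CostValue;    // cost, only meaningful when CostValid is true
};

// Legality stage: scalable vectorisation is feasible only if every
// operation can be handled.
bool isLegalForScalable(const std::vector<Op> &Loop) {
  for (const Op &O : Loop)
    if (!O.LegalForScalable)
      return false;
  return true;
}

// Returns false if scalable VFs are discarded up-front. Otherwise sums the
// cost into Total, asserting that every cost is valid.
bool considerScalableOption1(const std::vector<Op> &Loop, unsigned &Total) {
  if (!isLegalForScalable(Loop))
    return false; // discarded before any costing happens

  Total = 0;
  for (const Op &O : Loop) {
    // A failure here means legality or the cost-model is incomplete.
    assert(O.CostValid && "legalisation and cost-model are out of sync");
    Total += O.CostValue;
  }
  return true;
}
```

An operation that legality accepts but the cost-model cannot cost trips the
assert, which is how the two stages keep each other in check.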

[Option 2]

Leave the question about legalisation to the CostModel, i.e. if the
CostModel says that <operation> for `VF=vscale x N` is Invalid, then avoid
selecting that VF.
 * This has the benefit that we don't need to do work up-front to
   discard scalable VFs, keeping the LV design simpler.
 * This makes gaps in the cost-model more difficult to spot.
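
Option 2 can be sketched in the same simplified style. Again this is an
illustration, not the LV's actual selection code: `Candidate` and
`pickCandidate` are hypothetical names, and the sketch assumes the costs of
the candidate VFs have already been normalised so they can be compared
directly.

```cpp
#include <vector>

// Hypothetical candidate VF with the cost-model's answer attached.
struct Candidate {
  bool CostValid;            // false means the cost-model returned Invalid
  unsigned CostPerIteration; // assumed to be comparable across candidates
};

// Returns the index of the cheapest candidate with a valid cost, or -1 if
// no candidate can be costed. There is no separate legality check.
int pickCandidate(const std::vector<Candidate> &Cands) {
  int Best = -1;
  for (int I = 0; I < static_cast<int>(Cands.size()); ++I) {
    if (!Cands[I].CostValid)
      continue; // an Invalid cost is what rules this VF out
    if (Best < 0 ||
        Cands[I].CostPerIteration < Cands[Best].CostPerIteration)
      Best = I;
  }
  return Best;
}
```

Nothing in this sketch distinguishes "the code-generator cannot handle this"
from "the cost-model has a gap": both silently drop the candidate.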

Note that it's not useful to combine Option 1 and Option 2: once the
cost-model is allowed to reject a VF, there is no longer any need to do
legalisation beforehand, so combining them is effectively a choice for
Option 2.

Both approaches lead to the same end-result, but we currently have a few
patches in flight that have taken Option 1, and this led to some questions
about the approach from both Florian and David Green. So we're looking to
reach a consensus and decide which way to move forward.

I've tentatively added this as a topic to the agenda of the upcoming LLVM
SVE/Scalable Vector Sync-up meeting next Tuesday (June 15th, [2]) as an
opportunity to discuss this more freely if we can get enough people who
actively work on the LV together in that meeting (like Florian and David,
although please forward to anyone else who might have input on this).

Thanks,

Sander

[1] Expanding the vector operation into a scalarisation loop is currently
    not supported. It could be done, but in our extensive experimentation
    with loops that handle each element of a scalable vector sequentially,
    this has never proved beneficial, even when using special instructions
    to efficiently increment the predicate vector. I doubt this will be any
    different for other scalable vector architectures, because of the loop
    control overhead. Also, the insertion/extraction of elements from a
    scalable vector is unlikely to be as cheap as for fixed-width vectors.

[2] https://docs.google.com/document/d/1UPH2Hzou5RgGT8XfO39OmVXKEibWPfdYLELSaHr3xzo/edit?usp=sharing

_______________________________________________
LLVM Developers mailing list
llvm-dev at lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev