[PATCH] D141240: [SVE][Builtins] Add metadata to intrinsic calls for builtins that don't define the result of inactive lanes.

Wed Jan 11 17:30:17 PST 2023

paulwalker-arm added a comment.

In D141240#4035438 <https://reviews.llvm.org/D141240#4035438>, @sdesmalen wrote:

> Using metadata seems sensible, but did you also identify any downsides? I could imagine that we'd need to manually propagate metadata to any nodes after we do a combine (which can't be blindly copied?), e.g. add + mul -> mla, this new intrinsic would also need the metadata.

I don't really see manually propagation as a downside because it's not functionally required but rather advantageous to maximise optimisation opportunities.  The downside is the opposite in that any transformation that wants to rely on the inactive lanes being defined as before this patch will now need to check for the presence of (or rather lack of) the new metadata before blindly reusing the result of an existing SVE intrinsic call.  The transformation can still reuse the call it must just first discard the metadata.

> For intrinsics that don't have a directly corresponding (unpredicated) LLVM IR instruction, is there still a way to use this information in SelectionDAG?

Truth be told I'm not entire sure how this will play out.  I'm not sure whether it's better to use the information within the IR as I'm doing in this patch or whether this should be used solely when lowering IR to DAG.  So it's really an experiment to see what sticks while proving a route to fix some of the issues we've already observed with how we represent the X forms.

Predicated->unpredicted aside another use for encoding undefiness is that it helps with things the FMAs where we can use FMAD if that better suits register allocation much like we do for stock IR.

>> the select instruction itself has strict rules relating to poison that hampered the intent of this change
>
> For my understanding, can you elaborate what these strict rules regarding poison are that hamper such a change, and what it was that you tried?

The LangRef states the transformation "select P, A, undef ==> A" is only valid when you can prove the inactive lanes of "A" do not contain poison. I'm unsure if this is a true blocker or a mere inconvenience because to maintain the maximum amount of information we likely don't want to remove the selects anyway.  I went down this path by creating an SVE undef intrinsic, which nothing knows about and thus will be left alone.  The problem is that it massively polluted the IR and I was worried it'll make it harder to spot/implement the typical combines. For sure the existing combines will need to be changed because they'll not know to look through the new selects.

There is the option to change the clang builtin lowering to provide finer control over which builtins emit these selects, but that just means more changes (updates to existing instcombines) each time we decide a builtin is worth the extra select.

I'll keep experimenting but as I mention within the in code comment, the likely best solution is to have dedicate intrinsics with this being the least intrusive hack.

Perhaps the key word there is "hack" :) I'll investigate the dedicate intrinsics route because perhaps we only require a handful to get the majority of the benefit.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D141240/new/

https://reviews.llvm.org/D141240