[PATCH] D123638: [SLP][AArch64] Implement lookahead operand reordering score of splat loads for AArch64

Vasileios Porpodas via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Apr 21 06:25:38 PDT 2022


vporpo added inline comments.


================
Comment at: llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp:2616
+      if (IsLoad) {
+        assert(isLegalBroadcastLoad(Tp->getElementType(),
+                                    LT.second.getVectorElementCount()) &&
----------------
dmgreen wrote:
> vporpo wrote:
> > dmgreen wrote:
> > > This assert looks like it will fire a lot of the time. Should we use `IsLoad && isLegalBroadcastLoad(...)`?
> > > 
> > > That might make this testable from the cost model too, Even if it's slightly unorthodox to use a vector load for such cases.
> > It is already guarded by `if (IsLoad)` so I guess `IsLoad && ` is not needed.
> > 
> > > That might make this testable from the cost model too, Even if it's slightly unorthodox to use a vector load for such cases.
> > I am not sure I understand. Is this about the assertion?
> I meant remove the assert and turn it into a condition. This function might be checking Types which we do not consider to be legal. https://godbolt.org/z/dv56WMaP7 has an example (apparently :) ).
> 
> We could presumably have a test in llvm/test/Analysis/CostModel/AArch64 that tests `load <vector>, splat-shuffle`, and it will trigger this and be treated as free?
Oops, yes this is totally broken, sorry about that. I had just copied this part of the code from X86 TTI where we have a cost table that checks the types but I skipped the table. This assertion is meant to check that the table and `isLegalBroadcastLoad` are in sync. 
To fix this I can either add a cost table like this:
```
  static const CostTblEntry NeonBroadcastLoadTbl[] = { 
     {TTI::SK_Broadcast, MVT::v8i8, 0 },
     {TTI::SK_Broadcast, MVT::v16i8, 0 },
     {TTI::SK_Broadcast, MVT::v4i16, 0 },
     {TTI::SK_Broadcast, MVT::v8i16, 0 },
     {TTI::SK_Broadcast, MVT::v2i32, 0 },
     {TTI::SK_Broadcast, MVT::v4i32, 0 },
     {TTI::SK_Broadcast, MVT::v2f64, 0},
     {TTI::SK_Broadcast, MVT::v4f32, 0 },
  }; 
```
using a similar logic as in X86TargetTransformInfo.cpp:1558, or rely on an `if (isLegalBroadcastLoad())`. 
I think adding the table makes it a bit more explicit, and I would prefer it. What do you think? 

Btw would you happen to know if a `v2f32` broadcast is handled by `ld1r` in neon? I think this is the only 64-bit entry missing from the table.


================
Comment at: llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp:2703
+    if (Kind == TTI::SK_Broadcast) {
+      bool IsLoad = !Args.empty() && llvm::all_of(Args, [](const Value *V) {
+        return isa<LoadInst>(V);
----------------
dmgreen wrote:
> dmgreen wrote:
> > vporpo wrote:
> > > dmgreen wrote:
> > > > I would expect it to have one Arg for a splat.
> > > > 
> > > > I guess this would not match the canonical representation for a splat load too, except from where it is being created from scalar code. Not sure what to do about that, but it seems unfortunate that the cost will go up after it has been vectorized.
> > > > ```
> > > >   %l = load i8, i8 *%p
> > > >   %i = insertelement <16 x i8> poison, i8 %l, i32 0
> > > >   %s = shufflevector <16 x i8> %i, <16 x i8> poison, <16 x i32> zeroinitializer
> > > > ```
> > > Yes, I think this won't be matched.  `Args` is meant to be used for scalar code, but that's a great point, we could improve this in a future patch.
> > Three instructions patterns are tough to cost-model nicely is llvm at the moment. It needs too much code to check down through uses and up through operands.
> > 
> > The first part of my comment was meaning that Args should only have 1 element for a splat.
> > ```
> >  IsLoad = Args.size() == 1 && isa<LoadInstr>(Args[0]);
> > ```
> > Do we expect it get called when there are multiple loads too?
> Do we need to check more than 1 arg?
Passing the whole vector seems a bit more natural from the SLP point of view. But yes, I agree, a single argument when `Kind == TTI::SK_Broadcast` would probably make more sense from the TTI point of view. This needs changes in the x86 side too, so I will update it in a separate patch.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D123638/new/

https://reviews.llvm.org/D123638



More information about the llvm-commits mailing list