[PATCH] D75580: [llvm][CodeGen][SVE] Implement IR intrinsics for gather prefetch.

Andrzej Warzynski via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Fri Mar 13 04:24:36 PDT 2020


andwar added a comment.

Thanks you for addressing my comments @fpetrogalli !

A one final nice-to-have (not a blocker!). Could this snippet from `legalizeSVEGatherPrefetchOffsVec` be extracted into a separate function:

  // Not an unpacked vector, bail out.
  if (Offset.getValueType().getSimpleVT().SimpleTy != MVT::nxv2i32)
    return SDValue();
  
  // Extend the unpacked offset vector to 64-bit lanes.
  SDLoc DL(N);
  Offset = DAG.getNode(ISD::ANY_EXTEND, DL, MVT::nxv2i64, Offset);

and re-used by `performGatherLoadCombine` and `performScatterStoreCombine`? We can do it in a separate patch too.



================
Comment at: llvm/include/llvm/IR/IntrinsicsAArch64.td:1263
+                  llvm_anyvector_ty, // Offsets
+                  LLVMScalarOrSameVectorWidth<0, llvm_i1_ty>, // Predicate
+                  llvm_i32_ty // Prfop
----------------
sdesmalen wrote:
> fpetrogalli wrote:
> > sdesmalen wrote:
> > > I realise this only now, but having the predicate as the third operand is different from the gather loads. Can you make this to be the first operand, rather than the third, to streamline their definition with the gather loads?
> > Good point. I have redefined the class as 
> > 
> > ```
> > class SVE_gather_prf_scalar_base_vector_offset_scaled
> >     : Intrinsic<[],
> >                 [
> >                   LLVMScalarOrSameVectorWidth<0, llvm_i1_ty>, // Predicate
> >                   llvm_ptr_ty, // Base address
> >                   llvm_anyvector_ty, // Offsets
> >                   llvm_i32_ty // Prfop
> > 		],
> >                 [IntrInaccessibleMemOrArgMemOnly, NoCapture<0>, ImmArg<3>]>;
> > ```
> > 
> > and consequently redefined the intrinsics from, say, `declare void @llvm.aarch64.sve.gather.prfb.scaled.uxtw.nx4vi32(i8* %base, <vscale x 4 x i32> %offset, <vscale x 4 x i1> %Pg, i32 %prfop)`, to `declare void @llvm.aarch64.sve.gather.prfb.scaled.uxtw.nx4vi32(<vscale x 4 x i1> %Pg, i8* %base, <vscale x 4 x i32> %offset, i32 %prfop)`, but I get `llc` runtime failures as the following:
> > 
> > ```
> > Wrong types for attribute: inalloca nest noalias nocapture nonnull readnone readonly signext sret zeroext byval dereferenceable(1) dereferenceable_or_null(1)
> > void (<vscale x 4 x i1>, i8*, <vscale x 4 x i32>, i32)* @llvm.aarch64.sve.gather.prfb.scaled.uxtw.nxv4i32
> > ```
> > 
> > I cannot figure out what I am doing wrong though... Am I right assuming that `llvm_anyvec_ty` is still the overloaded type that needs to be mentioned in the name of the intrinsic in IR? (`nxv4i32` for the example reported here).
> The pointer is now the second argument, and I suspect you forgot to update `NoCapture<0>` to `NoCapture<1>`.
> Am I right assuming that llvm_anyvec_ty is still the overloaded type that needs to be mentioned in the name of the intrinsic in IR? (nxv4i32 for the example reported here).

I think so, but IIUC this (from you example that didn't work):

```
LLVMScalarOrSameVectorWidth<0, llvm_i1_ty>, // Predicate
```

means the following: the **0th operand ** has the same number of elements as the **0th operand**.  Which is why, I think, you hit that problem ^^^.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D75580/new/

https://reviews.llvm.org/D75580





More information about the llvm-commits mailing list