[PATCH] D75580: [llvm][CodeGen][SVE] Implement IR intrinsics for gather prefetch.

Francesco Petrogalli via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Fri Mar 13 08:34:16 PDT 2020


fpetrogalli marked 6 inline comments as done.
fpetrogalli added inline comments.


================
Comment at: llvm/include/llvm/IR/IntrinsicsAArch64.td:1263
+                  llvm_anyvector_ty, // Offsets
+                  LLVMScalarOrSameVectorWidth<0, llvm_i1_ty>, // Predicate
+                  llvm_i32_ty // Prfop
----------------
andwar wrote:
> sdesmalen wrote:
> > fpetrogalli wrote:
> > > sdesmalen wrote:
> > > > I realise this only now, but having the predicate as the third operand is different from the gather loads. Can you make this to be the first operand, rather than the third, to streamline their definition with the gather loads?
> > > Good point. I have redefined the class as 
> > > 
> > > ```
> > > class SVE_gather_prf_scalar_base_vector_offset_scaled
> > >     : Intrinsic<[],
> > >                 [
> > >                   LLVMScalarOrSameVectorWidth<0, llvm_i1_ty>, // Predicate
> > >                   llvm_ptr_ty, // Base address
> > >                   llvm_anyvector_ty, // Offsets
> > >                   llvm_i32_ty // Prfop
> > > 		],
> > >                 [IntrInaccessibleMemOrArgMemOnly, NoCapture<0>, ImmArg<3>]>;
> > > ```
> > > 
> > > and consequently redefined the intrinsics from, say, `declare void @llvm.aarch64.sve.gather.prfb.scaled.uxtw.nx4vi32(i8* %base, <vscale x 4 x i32> %offset, <vscale x 4 x i1> %Pg, i32 %prfop)`, to `declare void @llvm.aarch64.sve.gather.prfb.scaled.uxtw.nx4vi32(<vscale x 4 x i1> %Pg, i8* %base, <vscale x 4 x i32> %offset, i32 %prfop)`, but I get `llc` runtime failures as the following:
> > > 
> > > ```
> > > Wrong types for attribute: inalloca nest noalias nocapture nonnull readnone readonly signext sret zeroext byval dereferenceable(1) dereferenceable_or_null(1)
> > > void (<vscale x 4 x i1>, i8*, <vscale x 4 x i32>, i32)* @llvm.aarch64.sve.gather.prfb.scaled.uxtw.nxv4i32
> > > ```
> > > 
> > > I cannot figure out what I am doing wrong though... Am I right assuming that `llvm_anyvec_ty` is still the overloaded type that needs to be mentioned in the name of the intrinsic in IR? (`nxv4i32` for the example reported here).
> > The pointer is now the second argument, and I suspect you forgot to update `NoCapture<0>` to `NoCapture<1>`.
> > Am I right assuming that llvm_anyvec_ty is still the overloaded type that needs to be mentioned in the name of the intrinsic in IR? (nxv4i32 for the example reported here).
> 
> I think so, but IIUC this (from you example that didn't work):
> 
> ```
> LLVMScalarOrSameVectorWidth<0, llvm_i1_ty>, // Predicate
> ```
> 
> means the following: the **0th operand ** has the same number of elements as the **0th operand**.  Which is why, I think, you hit that problem ^^^.
Thanks! I missed that.


================
Comment at: llvm/include/llvm/IR/IntrinsicsAArch64.td:1263
+                  llvm_anyvector_ty, // Offsets
+                  LLVMScalarOrSameVectorWidth<0, llvm_i1_ty>, // Predicate
+                  llvm_i32_ty // Prfop
----------------
fpetrogalli wrote:
> andwar wrote:
> > sdesmalen wrote:
> > > fpetrogalli wrote:
> > > > sdesmalen wrote:
> > > > > I realise this only now, but having the predicate as the third operand is different from the gather loads. Can you make this to be the first operand, rather than the third, to streamline their definition with the gather loads?
> > > > Good point. I have redefined the class as 
> > > > 
> > > > ```
> > > > class SVE_gather_prf_scalar_base_vector_offset_scaled
> > > >     : Intrinsic<[],
> > > >                 [
> > > >                   LLVMScalarOrSameVectorWidth<0, llvm_i1_ty>, // Predicate
> > > >                   llvm_ptr_ty, // Base address
> > > >                   llvm_anyvector_ty, // Offsets
> > > >                   llvm_i32_ty // Prfop
> > > > 		],
> > > >                 [IntrInaccessibleMemOrArgMemOnly, NoCapture<0>, ImmArg<3>]>;
> > > > ```
> > > > 
> > > > and consequently redefined the intrinsics from, say, `declare void @llvm.aarch64.sve.gather.prfb.scaled.uxtw.nx4vi32(i8* %base, <vscale x 4 x i32> %offset, <vscale x 4 x i1> %Pg, i32 %prfop)`, to `declare void @llvm.aarch64.sve.gather.prfb.scaled.uxtw.nx4vi32(<vscale x 4 x i1> %Pg, i8* %base, <vscale x 4 x i32> %offset, i32 %prfop)`, but I get `llc` runtime failures as the following:
> > > > 
> > > > ```
> > > > Wrong types for attribute: inalloca nest noalias nocapture nonnull readnone readonly signext sret zeroext byval dereferenceable(1) dereferenceable_or_null(1)
> > > > void (<vscale x 4 x i1>, i8*, <vscale x 4 x i32>, i32)* @llvm.aarch64.sve.gather.prfb.scaled.uxtw.nxv4i32
> > > > ```
> > > > 
> > > > I cannot figure out what I am doing wrong though... Am I right assuming that `llvm_anyvec_ty` is still the overloaded type that needs to be mentioned in the name of the intrinsic in IR? (`nxv4i32` for the example reported here).
> > > The pointer is now the second argument, and I suspect you forgot to update `NoCapture<0>` to `NoCapture<1>`.
> > > Am I right assuming that llvm_anyvec_ty is still the overloaded type that needs to be mentioned in the name of the intrinsic in IR? (nxv4i32 for the example reported here).
> > 
> > I think so, but IIUC this (from you example that didn't work):
> > 
> > ```
> > LLVMScalarOrSameVectorWidth<0, llvm_i1_ty>, // Predicate
> > ```
> > 
> > means the following: the **0th operand ** has the same number of elements as the **0th operand**.  Which is why, I think, you hit that problem ^^^.
> Thanks! I missed that.
Just a clarification: the 0-th operand refers to the 0-th overloaded operand, which is still `llvm_anyvector_ty`.

@sdesmalen was right, it's the `NoCapture` that was causing problems.

All good now :).

Thank you both.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D75580/new/

https://reviews.llvm.org/D75580





More information about the llvm-commits mailing list