[PATCH] D87947: [AMDGPU] Make ds fp atomics overloadable

Yaxun Liu via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Sep 23 11:37:13 PDT 2020


yaxunl accepted this revision.
yaxunl added inline comments.
This revision is now accepted and ready to land.


================
Comment at: clang/lib/CodeGen/CGBuiltin.cpp:14772
+    llvm::Type *PTy = FTy->getParamType(0);
+    Src0 = Builder.CreatePointerBitCastOrAddrSpaceCast(Src0, PTy);
+    return Builder.CreateCall(F, { Src0, Src1, Src2, Src3, Src4 });
----------------
rampitec wrote:
> rampitec wrote:
> > arsenm wrote:
> > > rampitec wrote:
> > > > arsenm wrote:
> > > > > I don't think you need a cast here (at least an addrspacecast)
> > > > If removed builtins-amdgcn.cu fails. It is CUDA with LDS pointer passed as flat. I.e. it comes as cast from addrspace(3) to flat. Generic builtin handling below in this file does the same.
> > > I thought these casts would be present in the AST?
> > It comes as a flat pointer. I am just replicating what generic code does.
> Check the code around the line 4440 in the same file. It does even more than that.
There was a TargetInfo hook getCUDABuiltinAddressSpace which was introduced by Matt. The default implementation maps any address space to default address space 0. For some reason, it was not implemented as target specific to map the address space specified by builtin def to real ones. As a result, all builtin functions have generic pointer parameter for CUDA. Therefore the cast is needed here when calling the intrinsic.

We could consider fix that. For this patch, I think we still need the cast.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D87947/new/

https://reviews.llvm.org/D87947



More information about the llvm-commits mailing list