[PATCH] D149938: [AMDGPU][InferAddressSpaces] Only rewrite address-spaces that can be trivially casted to flat for llvm.amdgcn.flat.atomic.{fadd,fmax,fmin} (2/2)

Juan Manuel Martinez CaamaƱo via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed May 10 03:11:36 PDT 2023


jmmartinez marked 3 inline comments as done.
jmmartinez added inline comments.


================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUTargetTransformInfo.cpp:1088-1089
+        static_cast<const GCNTargetMachine &>(getTLI()->getTargetMachine());
+    if (!TM.isNoopAddrSpaceCast(OldAS, NewAS))
+      return nullptr;
     Module *M = II->getParent()->getParent()->getParent();
----------------
arsenm wrote:
> But the not-noop address space casts are the cases we're interested in handling. Do you mean to check for valid casts?
The problem is that currently we cannot lower this builtin for other address spaces other than flat.

I agree that the rewrite of this intrinsic loses almost all interest for this intrinsics. Maybe we shouldn't even rewrite them at all, which would simplify the code.

Handling CONSTANT_ADDRESS_32BIT gets awkward since it's not handled in isFlatGlobalAddrSpace. I modified isFlatGlobalAddrSpace to take it into account but I've got several fails in other tests (I haven't looked at the details yet).


================
Comment at: llvm/test/Transforms/InferAddressSpaces/AMDGPU/flat-fadd-fmin-fmax-intrinsics.ll:4
+
+declare float @llvm.amdgcn.flat.atomic.fadd.f32.p0.f32(ptr %ptr, float %data)
+declare float @llvm.amdgcn.flat.atomic.fmax.f32.p0.f32(ptr %ptr, float %data)
----------------
arsenm wrote:
> Can you precommit the test to show the diff?
I've added https://reviews.llvm.org/D150259 as a parent patch.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D149938/new/

https://reviews.llvm.org/D149938



More information about the llvm-commits mailing list