[PATCH] D129690: [LLVM][AMDGPU] Specialize 32-bit atomic fadd instruction for generic address space
Matt Arsenault via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Wed Jul 13 13:52:58 PDT 2022
arsenm added a comment.
I would expect to have a test in test/Transforms/AtomicExpand/AMDGPU like the others there
================
Comment at: llvm/lib/Target/AMDGPU/SIISelLowering.cpp:12967
+ // atomicrmw.private:
+ // %loaded.private = load atomic float* %addr
+ // %new.val = fadd %loaded_private, %val
----------------
Can cast to private and do a non-atomic load
================
Comment at: llvm/lib/Target/AMDGPU/SIISelLowering.cpp:12969
+ // %new.val = fadd %loaded_private, %val
+ // store float %new_val, float* %addr
+ // br label %atomicrmw.phi
----------------
Same for the store
================
Comment at: llvm/lib/Target/AMDGPU/SIISelLowering.cpp:12973-12974
+ // %cast1 = addrspacecast float* %addr, float addrspace(1)*
+ // %loaded.global = __builtin_amdgcn_global_atomic_fadd_f32(
+ // float addrspace(1)* %cast1, float %val)
+ // br label %atomicrmw.phi
----------------
This is ignoring some of the edge case behavior treatment for the atomic instructions. I would have to look up the details again
================
Comment at: llvm/lib/Target/AMDGPU/SIISelLowering.cpp:13003
+ Value *Val = AI->getValOperand();
+ assert(Addr->getType()->isPointerTy());
+ PointerType *PtrTy = cast<PointerType>(Addr->getType());
----------------
assert is redundant with the cast<>
================
Comment at: llvm/lib/Target/AMDGPU/SIISelLowering.cpp:13009
+ Builder.SetInsertPoint(BB);
+ Value *AddrInt8Ptr = Builder.CreatePointerCast(Addr, Builder.getInt8PtrTy());
+ Builder.CreateBr(StartBB);
----------------
this doesn't look opaque pointer friendly? CreatePointerCast is heavier than you need anyway
================
Comment at: llvm/lib/Target/AMDGPU/SIISelLowering.cpp:13014
+ FunctionCallee AddressShared = M->getOrInsertFunction(
+ "llvm.amdgcn.is.shared", Builder.getInt1Ty(), Builder.getInt8PtrTy());
+ Value *IsShared = Builder.CreateCall(AddressShared, {AddrInt8Ptr});
----------------
Should use getIntrinsic with the enum, not refer to the intrinsic by name (or CreateIntrinsic)
================
Comment at: llvm/lib/Target/AMDGPU/SIISelLowering.cpp:13022-13023
+ PointerType::getWithSamePointeeType(PtrTy, AMDGPUAS::LOCAL_ADDRESS));
+ Constant *ZeroI32 =
+ ConstantInt::getIntegerValue(Type::getInt32Ty(Ctx), APInt(32, 0, true));
+ Constant *ZeroI1 =
----------------
getNullValue
================
Comment at: llvm/lib/Target/AMDGPU/SIISelLowering.cpp:13024-13025
+ ConstantInt::getIntegerValue(Type::getInt32Ty(Ctx), APInt(32, 0, true));
+ Constant *ZeroI1 =
+ ConstantInt::getIntegerValue(Type::getInt1Ty(Ctx), APInt(1, 0));
+ Value *LoadedShared = Builder.CreateIntrinsic(
----------------
getFalse
================
Comment at: llvm/lib/Target/AMDGPU/SIISelLowering.cpp:13033
+ FunctionCallee AddressPrivate = M->getOrInsertFunction(
+ "llvm.amdgcn.is.private", Builder.getInt1Ty(), Builder.getInt8PtrTy());
+ Value *IsPrivate = Builder.CreateCall(AddressPrivate, {AddrInt8Ptr});
----------------
Ditto
================
Comment at: llvm/test/CodeGen/AMDGPU/atomicrmw-expand.ll:1
+; RUN: llc -march=amdgcn -mcpu=gfx908 -verify-machineinstrs < %s | FileCheck %s
+
----------------
Should also make sure to cover gfx908 and 90a
================
Comment at: llvm/test/CodeGen/AMDGPU/atomicrmw-expand.ll:4-8
+ ; CHECK: global_atomic_add_f32
+ ; CHECK: flat_load_dword
+ ; CHECK: v_add_f32_e32
+ ; CHECK: flat_store_dword
+ ; CHECK: ds_add_f32
----------------
This doesn't demonstrate any of the looping structure
================
Comment at: llvm/test/CodeGen/AMDGPU/atomicrmw-expand.ll:16
+
+ attributes #0 = { alwaysinline mustprogress nofree norecurse nounwind "amdgpu-unsafe-fp-atomics"="true" "frame-pointer"="all" "min-legal-vector-width"="0" "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-cpu"="gfx908" "target-features"="+16-bit-insts,+ci-insts,+dl-insts,+dot1-insts,+dot2-insts,+dot3-insts,+dot4-insts,+dot5-insts,+dot6-insts,+dot7-insts,+dpp,+flat-address-space,+gfx8-insts,+gfx9-insts,+mai-insts,+s-memrealtime,+s-memtime-inst" }
----------------
Don't need most of these attributes
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D129690/new/
https://reviews.llvm.org/D129690
More information about the llvm-commits
mailing list