[clang] [llvm] [AMDGPU] Add global_load_tr for GFX12 (PR #77772)
Matt Arsenault via cfe-commits
cfe-commits at lists.llvm.org
Thu Jan 11 19:18:17 PST 2024
================
@@ -18178,6 +18178,51 @@ Value *CodeGenFunction::EmitAMDGPUBuiltinExpr(unsigned BuiltinID,
     llvm::Function *F = CGM.getIntrinsic(IID, {ArgTy});
     return Builder.CreateCall(F, {Addr, Val, ZeroI32, ZeroI32, ZeroI1});
   }
+  case AMDGPU::BI__builtin_amdgcn_global_load_tr_b64_v2i32:
+  case AMDGPU::BI__builtin_amdgcn_global_load_tr_b64_i32:
+  case AMDGPU::BI__builtin_amdgcn_global_load_tr_b128_v8i16:
+  case AMDGPU::BI__builtin_amdgcn_global_load_tr_b128_v8f16:
+  case AMDGPU::BI__builtin_amdgcn_global_load_tr_b128_v4i16:
+  case AMDGPU::BI__builtin_amdgcn_global_load_tr_b128_v4f16: {
+
+    Intrinsic::ID IID;
+    llvm::Type *ArgTy;
+    switch (BuiltinID) {
+    case AMDGPU::BI__builtin_amdgcn_global_load_tr_b64_v2i32:
+      ArgTy = llvm::FixedVectorType::get(
+          llvm::Type::getInt32Ty(getLLVMContext()), 2);
+      IID = Intrinsic::amdgcn_global_load_tr_b64;
----------------
arsenm wrote:
Why is there a different intrinsic for each type? Why not just add one intrinsic, overloaded on the pointer type and value type?
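
To make the question concrete, here is a minimal, self-contained sketch of the idea. It is not the PR's code and does not use the LLVM API. The `Builtin` enum, the `ValueType` struct, and the base name `llvm.amdgcn.global.load.tr` are hypothetical stand-ins. It assumes only that an overloaded intrinsic is distinguished by a type suffix appended to one base name, so the per-builtin switch would need to pick just a value type rather than both a type and an intrinsic ID.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-ins for the six builtin IDs in the hunk above;
// this is not Clang's real enum.
enum class Builtin {
  LoadTrB64V2I32,
  LoadTrB64I32,
  LoadTrB128V8I16,
  LoadTrB128V8F16,
  LoadTrB128V4I16,
  LoadTrB128V4F16
};

// Toy value-type description: element kind ('i' or 'f'), element width in
// bits, and lane count (1 means scalar).
struct ValueType {
  char elem;
  unsigned bits;
  unsigned lanes;
};

// The only per-builtin decision left: which value type is loaded.
ValueType valueTypeFor(Builtin b) {
  switch (b) {
  case Builtin::LoadTrB64V2I32:  return {'i', 32, 2};
  case Builtin::LoadTrB64I32:    return {'i', 32, 1};
  case Builtin::LoadTrB128V8I16: return {'i', 16, 8};
  case Builtin::LoadTrB128V8F16: return {'f', 16, 8};
  case Builtin::LoadTrB128V4I16: return {'i', 16, 4};
  case Builtin::LoadTrB128V4F16: return {'f', 16, 4};
  }
  return {'i', 32, 1}; // unreachable for valid enum values
}

// Build a type suffix in the style of overloaded intrinsic names,
// e.g. "v2i32" for a two-lane i32 vector or "i32" for a scalar.
std::string mangle(const ValueType &t) {
  assert(t.lanes >= 1 && "lane count must be positive");
  std::string s = std::string(1, t.elem) + std::to_string(t.bits);
  if (t.lanes > 1)
    s = "v" + std::to_string(t.lanes) + s;
  return s;
}

// One base name (hypothetical) plus a type suffix replaces a separate
// intrinsic per value type.
std::string overloadedName(Builtin b) {
  return "llvm.amdgcn.global.load.tr." + mangle(valueTypeFor(b));
}
```

With this shape, `overloadedName(Builtin::LoadTrB128V8F16)` yields a name ending in `.v8f16`, and supporting another value type would mean adding one row to `valueTypeFor` rather than a new intrinsic.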
https://github.com/llvm/llvm-project/pull/77772