[clang] [llvm] [mlir] [AMDGPU] add clamp immediate operand to WMMA iu8 intrinsic (PR #171069)
Matt Arsenault via llvm-commits
llvm-commits at lists.llvm.org
Sun Dec 14 00:21:59 PST 2025
================
@@ -4504,6 +4511,21 @@ static Value *upgradeARMIntrinsicCall(StringRef Name, CallBase *CI, Function *F,
//
static Value *upgradeAMDGCNIntrinsicCall(StringRef Name, CallBase *CI,
Function *F, IRBuilder<> &Builder) {
+ if (Name.starts_with("wmma.i32.16x16x64.iu8")) {
+ // Legacy WMMA IU8 intrinsic lacked the optional clamp operand. Append
+ // clamp=false for compatibility.
+ if (CI->arg_size() != 7)
+ return nullptr;
+
+ SmallVector<Value *, 8> Args(CI->args().begin(), CI->args().end());
+ Args.push_back(Builder.getFalse());
+
+ Function *NewDecl = Intrinsic::getOrInsertDeclaration(
+ F->getParent(), Intrinsic::amdgcn_wmma_i32_16x16x64_iu8,
+ {CI->getArgOperand(4)->getType(), CI->getArgOperand(1)->getType()});
+ return Builder.CreateCall(NewDecl, Args);
----------------
arsenm wrote:
Do you really need to create a new call rather than mutating the existing one in place? As written, this loses call-site attributes and metadata.
https://github.com/llvm/llvm-project/pull/171069
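
To make the concern concrete, here is a minimal standalone sketch of the failure mode. It does not use LLVM: `ToyCall`, `upgradeDroppingCallSiteInfo`, and `upgradeKeepingCallSiteInfo` are hypothetical names invented for illustration, and the string-based attributes and metadata are stand-ins for the real IR structures. The first function mirrors the shape of the hunk above (rebuild the call from callee and arguments only); the second carries the call-site state across before appending the clamp operand. In the real upgrade path the analogous step would presumably be copying the attribute list and metadata from `CI` onto the new call (for example via `CallBase::setAttributes` and `Instruction::copyMetadata`), but that is an assumption about how the patch might be revised, not what the PR does.

```cpp
#include <cassert>
#include <map>
#include <optional>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for a call instruction; this is not an LLVM type.
struct ToyCall {
  std::string callee;
  std::vector<std::string> args;
  std::set<std::string> attrs;                 // call-site attributes
  std::map<std::string, std::string> metadata; // e.g. a debug location
};

// Mirrors the hunk as posted: a fresh call is built from the callee and
// the argument list, so attributes and metadata are silently dropped.
std::optional<ToyCall> upgradeDroppingCallSiteInfo(const ToyCall &old) {
  if (old.args.size() != 7)
    return std::nullopt; // not the legacy 7-operand form
  ToyCall upgraded;
  upgraded.callee = old.callee;
  upgraded.args = old.args;
  upgraded.args.push_back("i1 false"); // clamp = false
  return upgraded;
}

// Alternative: start from a copy of the whole call so that attributes and
// metadata survive, then append the new clamp operand.
std::optional<ToyCall> upgradeKeepingCallSiteInfo(const ToyCall &old) {
  if (old.args.size() != 7)
    return std::nullopt; // not the legacy 7-operand form
  ToyCall upgraded = old;              // carries attrs and metadata over
  upgraded.args.push_back("i1 false"); // clamp = false
  assert(upgraded.args.size() == old.args.size() + 1);
  return upgraded;
}
```

Both functions produce the same eight-operand argument list; they differ only in whether the attribute set and metadata map of the original call are still present afterwards, which is the difference the review comment is pointing at.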