[clang] [AMDGPU] const-fold imm operands of amdgcn_update_dpp intrinsic (PR #71139)

Thu Nov 2 23:22:38 PDT 2023

================
@@ -17632,8 +17632,20 @@ Value *CodeGenFunction::EmitAMDGPUBuiltinExpr(unsigned BuiltinID,
   case AMDGPU::BI__builtin_amdgcn_mov_dpp:
   case AMDGPU::BI__builtin_amdgcn_update_dpp: {
     llvm::SmallVector<llvm::Value *, 6> Args;
-    for (unsigned I = 0; I != E->getNumArgs(); ++I)
-      Args.push_back(EmitScalarExpr(E->getArg(I)));
+    for (unsigned I = 0; I != E->getNumArgs(); ++I) {
+      llvm::Value *Arg = EmitScalarExpr(E->getArg(I));
+      // Except first two input operands, all other are imm operands for dpp
+      // intrinsic.
+      if (llvm::is_contained(std::initializer_list<unsigned>{2, 3, 4, 5}, I)) {
----------------
arsenm wrote:

You don't need to hard-code this; see the generic builtin handling here:
https://github.com/llvm/llvm-project/blob/5f5f82af966e6edcc72df02b36fb54401ab76266/clang/lib/CodeGen/CGBuiltin.cpp#L5716

You can just look up which arguments require an immediate. We don't actually have unique needs here; the only reason we probably custom-emit this is to emit the type used to mangle the builtin. Is there a way to let the generic builtin code deal with the argument list?
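
For illustration only, the lookup suggested above can be modeled as a bitmask query instead of a hard-coded index list. This is a standalone sketch, not the patch itself: the names `ICEMask`, `requiresImmediate`, and `immediateArgIndices` are hypothetical, and the mask value 0x3C (bits 2 through 5 set) is an assumption derived from the indices {2, 3, 4, 5} in the diff. In clang, the generic builtin path obtains such a mask from the builtin's declared signature rather than from a literal.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the per-builtin mask: bit I set means
// argument I must be an integer constant expression (an immediate).
using ICEMask = std::uint32_t;

// Returns true if argument `Index` must be emitted as an immediate.
inline bool requiresImmediate(ICEMask Mask, unsigned Index) {
  assert(Index < 32 && "mask only describes the first 32 arguments");
  return (Mask & (ICEMask{1} << Index)) != 0;
}

// Collects the indices of all immediate arguments among the first NumArgs.
inline std::vector<unsigned> immediateArgIndices(ICEMask Mask,
                                                 unsigned NumArgs) {
  std::vector<unsigned> Result;
  for (unsigned I = 0; I != NumArgs; ++I)
    if (requiresImmediate(Mask, I))
      Result.push_back(I);
  return Result;
}
```

With the assumed mask, `immediateArgIndices(0x3C, 6)` yields the same indices the patch hard-codes, so the per-argument check in the loop could become `requiresImmediate(Mask, I)` and would stay correct if the builtin's signature changes.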


https://github.com/llvm/llvm-project/pull/71139