[clang] [llvm] [HLSL][DirectX] Implement HLSL `mul` function and DXIL lowering of `llvm.matrix.multiply` (PR #184882)

Farzon Lotfi via cfe-commits cfe-commits at lists.llvm.org
Thu Mar 5 22:00:57 PST 2026


================
@@ -1054,6 +1055,68 @@ Value *CodeGenFunction::EmitHLSLBuiltinExpr(unsigned BuiltinID,
     Value *Mul = Builder.CreateNUWMul(M, A);
     return Builder.CreateNUWAdd(Mul, B);
   }
+  case Builtin::BI__builtin_hlsl_mul: {
+    Value *Op0 = EmitScalarExpr(E->getArg(0));
+    Value *Op1 = EmitScalarExpr(E->getArg(1));
+    QualType QTy0 = E->getArg(0)->getType();
+    QualType QTy1 = E->getArg(1)->getType();
+
+    bool IsVec0 = QTy0->isVectorType();
+    bool IsVec1 = QTy1->isVectorType();
+    bool IsMat0 = QTy0->isConstantMatrixType();
+    bool IsMat1 = QTy1->isConstantMatrixType();
+
+    if (IsVec0 && IsVec1) {
+      // Case 5: vector * vector -> scalar (dot product)
----------------
farzonl wrote:

It is possible there are performance improvements at higher optimization levels, because clang builtins get ignored by most passes.

That said, codegen will be significantly worse with no optimization, because the helper functions will not have been inlined.
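
To make the trade-off concrete, here is a rough sketch of what a helper-function version of the vector * vector case could look like. This is plain C++ for illustration only, not the PR's code and not HLSL; `mul_vec_vec` and `shade` are hypothetical names.

```cpp
#include <array>
#include <cstddef>

// Hypothetical header-style helper (an assumption, not from the PR):
// vector * vector -> scalar, i.e. a dot product written as ordinary code.
template <typename T, std::size_t N>
T mul_vec_vec(const std::array<T, N> &a, const std::array<T, N> &b) {
  T sum = T(0);
  for (std::size_t i = 0; i < N; ++i)
    sum += a[i] * b[i]; // visible to the optimizer once the helper is inlined
  return sum;
}

// Hypothetical caller. With no optimization, compilers typically leave the
// call to mul_vec_vec out of line; with inlining enabled, the loop body is
// exposed to the usual passes at the call site.
float shade(const std::array<float, 3> &n, const std::array<float, 3> &l) {
  return mul_vec_vec(n, l);
}
```

A builtin, by contrast, is expanded directly in codegen and does not depend on the inliner, which is why the unoptimized output would differ between the two approaches.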

https://github.com/llvm/llvm-project/pull/184882