[clang] [Clang] VectorExprEvaluator::VisitCallExpr / InterpretBuiltin - add MMX/SSE/AVX/AVX512 PMULHRSW intrinsics to be used in constexpr (PR #160636)
Simon Pilgrim via cfe-commits
cfe-commits at lists.llvm.org
Mon Sep 29 02:59:13 PDT 2025
================
@@ -3423,6 +3423,18 @@ bool InterpretBuiltin(InterpState &S, CodePtr OpPC, const CallExpr *Call,
return LHS.isSigned() ? LHS.ssub_sat(RHS) : LHS.usub_sat(RHS);
});
+ case clang::X86::BI__builtin_ia32_pmulhrsw128:
+ case clang::X86::BI__builtin_ia32_pmulhrsw256:
+ case clang::X86::BI__builtin_ia32_pmulhrsw512:
+ return interp__builtin_elementwise_int_binop(
+ S, OpPC, Call, [](const APSInt &LHS, const APSInt &RHS) {
+ unsigned Width = LHS.getBitWidth();
+ APInt Mul = llvm::APIntOps::mulhs(LHS, RHS);
+ Mul = Mul.relativeLShr(14);
+ Mul = Mul + APInt(Width, 1, true);
+ return Mul.relativeLShr(1);
+ });
----------------
RKSimon wrote:
It'd be much better if you kept to an expansion closer to the Intel Intrinsics Guide description:
```
tmp[31:0] := ((SignExtend32(a[i+15:i]) * SignExtend32(b[i+15:i])) >> 14) + 1
dst[i+15:i] := tmp[16:1]
```
which would be something like:
```
return (mulsExtended(LHS, RHS).ashr(14) + 1).extractBits(16, 1);
```
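For reference, the per-lane arithmetic in the Intrinsics Guide pseudocode can be sketched in plain C++ without any LLVM types. This is only an illustration of the formula, not the code proposed for the PR: `pmulhrsw_lane` is a hypothetical name, and the sketch assumes `>>` on a negative `int32_t` is an arithmetic shift (guaranteed from C++20, and what Clang and GCC do in earlier modes). The full 32-bit product is kept because the result is taken from bits [16:1] of the shifted, incremented value.

```cpp
#include <cstdint>

// Hypothetical helper (not part of Clang or LLVM): one 16-bit lane of
// PMULHRSW, following the Intel Intrinsics Guide pseudocode:
//   tmp[31:0]   := ((SignExtend32(a) * SignExtend32(b)) >> 14) + 1
//   dst[15:0]   := tmp[16:1]
constexpr std::int16_t pmulhrsw_lane(std::int16_t a, std::int16_t b) {
  // Sign-extend both operands and form the full 32-bit product.
  std::int32_t product = static_cast<std::int32_t>(a) * static_cast<std::int32_t>(b);
  // Assumes an arithmetic right shift for negative values.
  std::int32_t tmp = (product >> 14) + 1;
  // Bits [16:1] of tmp: drop bit 0, then keep the low 16 bits.
  return static_cast<std::int16_t>(static_cast<std::uint16_t>(tmp >> 1));
}

// 0.5 * 0.5 in Q15 fixed point rounds to 0.25.
static_assert(pmulhrsw_lane(16384, 16384) == 8192, "Q15 0.5 * 0.5");
// The one wrapping case: -1.0 * -1.0 yields 0x8000.
static_assert(pmulhrsw_lane(-32768, -32768) == -32768, "wrapping case");
```

Because the helper is `constexpr`, expected lane values can be checked at compile time in the same way the constexpr intrinsic tests would check them.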
https://github.com/llvm/llvm-project/pull/160636