[clang] [llvm] [NVPTX] Add intrinsics for redux.sync f32 instructions (PR #126664)
Durgadoss R via cfe-commits
cfe-commits at lists.llvm.org
Tue Feb 11 00:53:17 PST 2025
================
@@ -328,6 +328,24 @@ defm REDUX_SYNC_AND : REDUX_SYNC<"and", "b32", int_nvvm_redux_sync_and>;
defm REDUX_SYNC_XOR : REDUX_SYNC<"xor", "b32", int_nvvm_redux_sync_xor>;
defm REDUX_SYNC_OR : REDUX_SYNC<"or", "b32", int_nvvm_redux_sync_or>;
+multiclass REDUX_SYNC_F<string BinOp, string ABS, string NAN, Intrinsic Intrin> {
+ def : NVPTXInst<(outs Float32Regs:$dst),
+ (ins Float32Regs:$src, Int32Regs:$mask),
+ "redux.sync." # !tolower(BinOp) # !subst("_", ".", ABS) # !subst("_", ".", NAN) # ".f32 $dst, $src, $mask;",
----------------
durga4github wrote:
We do not need !tolower here.
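
To illustrate, here is a rough Python sketch (not TableGen, and not code from the PR) of how the asm string on the quoted line is assembled. The function name and the argument values "min", "_abs", and "_NaN" are assumptions made for illustration only; the actual defm arguments are not part of this hunk. If BinOp is already passed in lowercase, lowercasing it again changes nothing:

```python
def redux_sync_f_asm(bin_op: str, abs_suffix: str, nan_suffix: str,
                     use_tolower: bool = True) -> str:
    """Mimic the string concatenation in the quoted NVPTXInst asm string.

    Hypothetical helper: str.lower() stands in for !tolower and
    str.replace("_", ".") stands in for !subst("_", ".", ...).
    """
    op = bin_op.lower() if use_tolower else bin_op
    return (
        "redux.sync."
        + op
        + abs_suffix.replace("_", ".")
        + nan_suffix.replace("_", ".")
        + ".f32 $dst, $src, $mask;"
    )


# Assumed example arguments; the real ones are not shown in the diff.
with_lower = redux_sync_f_asm("min", "_abs", "_NaN", use_tolower=True)
without_lower = redux_sync_f_asm("min", "_abs", "_NaN", use_tolower=False)
print(with_lower)
print(with_lower == without_lower)
```

Under those assumed inputs both calls build the same string, which is why the extra lowercasing step would be redundant.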
https://github.com/llvm/llvm-project/pull/126664