[llvm] [NVPTX] designate fabs and fneg as free (PR #121513)
Alex MacLean via llvm-commits
llvm-commits at lists.llvm.org
Tue Jan 7 15:28:40 PST 2025
================
@@ -261,6 +261,9 @@ class NVPTXTargetLowering : public TargetLowering {
return true;
}
+ bool isFAbsFree(EVT VT) const override { return true; }
+ bool isFNegFree(EVT VT) const override { return true; }
----------------
AlexMaclean wrote:
> My general rule of thumb is that it's likely beneficial to use specific PTX instruction, if there's a matching h/w instruction, but let LLVM handle it if it turns into generic code that LLVM can do itself.
I'm not sure this is the right rule of thumb. I tend to think we should use the most expressive/specific PTX instruction even if we suspect it may just be syntactic sugar. `ptxas` may have specific heuristics/peepholes that apply very unevenly and depend on which instruction is used. Even if doing the expansion in LLVM should not matter in theory, in many cases it can substantially impact perf.
> In this case I'm not convinced that folding some cases into "FADD" with a negated argument counts
This particular example is simple and contrived; it is intended only to highlight that `ptxas` can sometimes fold floating-point abs and neg of an operand into the instruction that uses it. There are many other similar cases. While the folding does not happen in every case, it happens often enough that I think it makes sense to treat these operations as free.
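To make the intuition concrete, here is a minimal host-side C++ sketch (not part of the PR) of why these operations are plausible candidates for folding: on IEEE-754 binary32, fneg only flips the sign bit and fabs only clears it, so either can in principle be applied as a modifier on a source operand of the consuming instruction. The helper names (`fnegBits`, `fabsBits`, `addNegAbs`, `selfCheck`) are hypothetical and chosen for illustration, and nothing here shows what `ptxas` actually emits for any given type.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <limits>

// Assumption: float is IEEE-754 binary32 on the host.
static_assert(std::numeric_limits<float>::is_iec559 && sizeof(float) == 4,
              "sketch assumes IEEE-754 binary32 float");

constexpr std::uint32_t kSignMask = 0x80000000u;

// Bit-cast helpers (memcpy avoids undefined behavior from type punning).
static std::uint32_t toBits(float f) {
  std::uint32_t u;
  std::memcpy(&u, &f, sizeof u);
  return u;
}
static float fromBits(std::uint32_t u) {
  float f;
  std::memcpy(&f, &u, sizeof f);
  return f;
}

// fneg: flip the sign bit, leave exponent and mantissa untouched.
float fnegBits(float x) { return fromBits(toBits(x) ^ kSignMask); }

// fabs: clear the sign bit, leave exponent and mantissa untouched.
float fabsBits(float x) { return fromBits(toBits(x) & ~kSignMask); }

// a - |b| written as fadd(a, fneg(fabs(b))): the shape of pattern where
// neg/abs could be absorbed as operand modifiers on the add.
float addNegAbs(float a, float b) { return a + fnegBits(fabsBits(b)); }

// Small self-check of the helpers above.
void selfCheck() {
  assert(fnegBits(1.5f) == -1.5f);
  assert(fabsBits(-2.0f) == 2.0f);
  assert(addNegAbs(5.0f, -3.0f) == 2.0f);
}
```

If the backend expanded fneg/fabs into explicit integer `xor`/`and` like the helpers above, that sign-bit structure would be hidden from `ptxas`; keeping them as fneg/fabs leaves the choice to it.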
> There's also a question of whether the "free" applies to all types equally. E.g. for bf16x2 the sign folding no longer happens: https://cuda.godbolt.org/z/e15cG6jdM
Yes, it may not apply to all types equally. Still, I think it is simplest and clearest to prefer fabs and fneg in all cases, both for readability and because of the possibility of unknown `ptxas` optimizations.
https://github.com/llvm/llvm-project/pull/121513