[llvm] GlobalISel needs fdiv 1 / sqrt(x) to rsq combine (PR #78673)

Pierre van Houtryve via llvm-commits llvm-commits at lists.llvm.org
Sun Jan 28 22:18:38 PST 2024


================
@@ -334,6 +336,19 @@ bool AMDGPUPostLegalizerCombinerImpl::matchRcpSqrtToRsq(
   return false;
 }
 
+void AMDGPUPostLegalizerCombinerImpl::applyOneFDivSqrtToRsq(
+    MachineInstr &MI, const Register &X) const {
+  // B.setInstrAndDebugLoc(MI);
+
+  Register Dst = MI.getOperand(0).getReg();
+
+  B.buildIntrinsic(Intrinsic::amdgcn_rsq, ArrayRef<Register>({Dst}))
+      .addUse(X)
+      .setMIFlags(MI.getFlags());
----------------
Pierre-vh wrote:

Copying the flag sounds good then :)

Flags are optimization hints that let the compiler make assumptions. If those assumptions are broken, or a flag is added where its assumption hasn't been verified, you can end up with very weird results: https://llvm.org/docs/LangRef.html#fastmath

e.g.: When you have `contract`, you're telling the compiler "in this case, you can treat the result of a fused operation as being the same as the result of two distinct operations". This matters because a distinct multiply + add rounds twice, once after each operation, while a fused multiply-add (fma) rounds only once. If the compiler knows the code doesn't care about that difference, it can create an fma; if it doesn't know that, it needs to emit two instructions instead to ensure the result is correct.
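To make the rounding difference concrete, here is a minimal standalone C++ sketch (not code from this PR; the helper names `mulThenAdd` and `fused` are made up for illustration). It assumes IEEE-754 binary64 `double` with round-to-nearest, and uses a `volatile` intermediate so the compiler cannot contract the separate version on its own.

```cpp
#include <cmath>

// Hypothetical helper: multiply, round to double, then add and round again.
// The volatile store forces the product to be rounded before the add and
// keeps the compiler from fusing the two operations by itself.
double mulThenAdd(double a, double b, double c) {
  volatile double prod = a * b; // first rounding
  return prod + c;              // second rounding
}

// Hypothetical helper: a * b + c computed with a single rounding at the end.
double fused(double a, double b, double c) {
  return std::fma(a, b, c);
}
```

With `a = 1 + 2^-30` and `c = -(1 + 2^-29)`, the exact product `a * a` is `1 + 2^-29 + 2^-60`. The `2^-60` term is lost when the product is rounded to a double, so `mulThenAdd(a, a, c)` gives `0`, while `fused(a, a, c)` keeps it and gives `2^-60`. That difference is what `contract` tells the compiler it is allowed to introduce.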


https://github.com/llvm/llvm-project/pull/78673