[PATCH] D151834: Include math-errno with fast-math

Tue Jun 13 16:04:16 PDT 2023

andrew.w.kaylor added inline comments.

================
Comment at: clang/lib/CodeGen/CGBuiltin.cpp:2249

+  // MSVC takes into account math-errno when compiling at -O2 level in
+  // the presence of a '#pragma float_control precise(on/off)'.
----------------
Since we handle "#pragma float_control" for all targets, we should be consistent. I don't think the reference to MSVC here is useful.

I'm not sure why MSVC thinks it needs to call the sqrtf() function at O1, but I don't necessarily think we should follow that lead. If we generate the intrinsic here, the optimizer will make its own decisions about O1 versus O2, etc. There's no reason that errno should be preserved at O1 if it's been disabled with -fno-math-errno or we're enabled fast-math.

================
Comment at: clang/lib/CodeGen/CGBuiltin.cpp:2263
+  }
+  bool ErrnoWithO2OrFastMath =
+      MathErrnoOverrride && (CGM.getCodeGenOpts().OptimizationLevel == 2 ||
----------------
I'm confused by this condition. If MathErrnoOverride is true, don't we need to set errno regardless of the optimization level or fast-math setting?

================
Comment at: clang/lib/CodeGen/CGBuiltin.cpp:2266
+                             CGM.getLangOpts().FastMath);
+  bool OptNoneWithFastMath =
+      CurFuncDecl->hasAttr<OptimizeNoneAttr>() && CGM.getLangOpts().FastMath;
----------------
Why do we need to check FastMath if the function is marked as optnone?

================
Comment at: clang/lib/CodeGen/CGBuiltin.cpp:2284
+
+  if ((FD->hasAttr<ConstAttr>() && EmitIntrinsic) || EmitIntrinsic ||
+       ((ConstWithoutErrnoAndExceptions || ConstWithoutExceptions) &&
----------------
Based on the comment above, if the function call is marked as const, then it will never set errno, so I don't think we need to check EmitIntrinsic in that case.

================
Comment at: clang/test/CodeGen/math-errno.c:28
+// FAST-LABEL: define dso_local nofpclass(nan inf) float @f1
+// FAST: call fast nofpclass(nan inf) float @sqrtf(float noundef nofpclass(nan inf) {{.*}}) #[[ATTR3_FAST:[0-9]+]]
+//
----------------
Shouldn't the precise pragma remove the nofpcalss(nan inf) attributes? The definition of setFPPreciseEnabled looks like it's trying to do that. Maybe we need another patch to handle that?

================
Comment at: clang/test/CodeGen/math-errno.c:40
+// CHECK-LABEL: define dso_local float @f2
+// CHECK: tail call float @llvm.sqrt.f32(float {{.*}})
+
----------------
Again, this may be beyond the scope of your change, but it seems like '#pragma float_control(precise, off) should lead to fast-math flags being set on this call.

================
Comment at: clang/test/CodeGen/math-errno.c:43
+// FAST-LABEL: define dso_local nofpclass(nan inf) float @f2
+// FAST: call fast nofpclass(nan inf) float @sqrtf(float noundef nofpclass(nan inf) {{.*}}) #[[ATTR3_FAST]]
+
----------------
I expected the intrinsic to be called in this case. Do you know why it isn't? I believe it is without your change.

================
Comment at: clang/test/CodeGen/math-errno.c:46
+// NOOPT-LABEL: define dso_local float @f2
+// NOOPT: tail call float @llvm.sqrt.f32(float {{.*}})
+
----------------
I didn't expect to see the intrinsic in this case. If we use the intrinsic, the x86-backend will generate sqrtss instead of the function call, and I think that is wrong.

================
Comment at: clang/test/CodeGen/math-errno.c:56
+// CHECK-LABEL: define dso_local float @f3
+// CHECK: call float @llvm.sqrt.f32(float {{.*}})
+
----------------
Again, I didn't expect the intrinsic with optnone in any of these cases.

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D151834/new/

https://reviews.llvm.org/D151834