[PATCH] D72841: [RFC] Add support for pragma float_control, to control precision and exception behavior at the source level

Wed Mar 4 13:33:41 PST 2020

mibintc marked 12 inline comments as done.
mibintc added a comment.
Herald added a subscriber: rnkovacs.

some inline replies and comments

================
Comment at: clang/include/clang/AST/Stmt.h:1104
+    static_assert(sizeof(*this) <= 16,
                   "changing bitfields changed sizeof(Stmt)");
     static_assert(sizeof(*this) % alignof(void *) == 0,
----------------
rjmccall wrote:
> What's happening here is exactly what this assertion is supposed to prevent.   If you need more bits in one of these classes (I assume it's `CXXOperatorCallExpr`), you need to either make a field in the actual class or investigate more arcane mechanisms like  trailing storage to reduce the normal impact.  The latter is probably unnecessary for `CXXOperatorCallExpr`.
@rjmccall The reason i changed the assertion is because FPOptions is now wider, so I had to change the assertion.  See line 609 above. Is there something I need to do differently? 

================
Comment at: clang/include/clang/Basic/DiagnosticParseKinds.td:1138
+def err_pragma_float_control_unknown_kind : Error<
+  "unknown kind of pragma float_control">;
 // - #pragma pointers_to_members
----------------
rjmccall wrote:
> Maybe "operation" would be a better user-facing name than "kind"?   Also, this diagnostic is more specific but less helpful than the diagnostic just above.
I got rid of the diagnostic with the unhelpful string and just using the single diagnostic which has full information about how to form the pragma

================
Comment at: clang/include/clang/Basic/LangOptions.def:196
+COMPATIBLE_LANGOPT(AllowRecip        , 1, 0, "Permit Floating Point reciprocal")
+COMPATIBLE_LANGOPT(ApproxFunc        , 1, 0, "Permit Floating Point approximation")

----------------
rjmccall wrote:
> Please align the commas.
> 
> Would it make more sense to just store an `FPOptions` in `LangOptions` instead of breaking all of the bits down separately?
> 
> We may need to reconsider at some point whether any of these are really "compatible" language options.  Headers can contain inline code, and we shouldn't compile that incorrectly just because we reused a module we built under different language settings.  Although... maybe we can figure out a way to store just the ways that an expression's context overrides the default semantics and then merge those semantics into the default set for the translation unit; that would make them actually compatible.  Of course, it would also require more bits in expressions where it matters, and you might need to investigate trailing storage at that point.
I aligned the commas.  I didn't put FPOptions into LangOptions, would you like me to make that change too?  I don't know about trailing storage. I see that term in the code but I didn't see details about what that is/how that works. 

================
Comment at: clang/include/clang/Basic/LangOptions.h:468
+  bool allowReciprocal() const { return allow_reciprocal; }
+  bool approxFunc() const      { return approx_func; }
+
----------------
rjmccall wrote:
> Somewhere in this type, it should be obvious where I can go in order to understand what any of these flags means precisely.  Ideally that would be reinforced by the method names, instead of using non-term-of-art abbreviations like "reassoc".
I put the comments on the field declarations in the private part. I changed the names of the accessor methods to be more descriptive. (Previously I was using the same names as LLVM uses for those fields).

================
Comment at: clang/include/clang/Basic/LangOptions.h:517
+    approx_func = ((I>>13) & 1);
   }

----------------
rjmccall wrote:
> The more conventional method names here would an instance method called something like `getAsOpaqueInt` and then a static method called something like `getFromOpaqueInt`.
I changed the names like you suggested but not using the static method, is this OK?

================
Comment at: clang/lib/CodeGen/CGExprScalar.cpp:4085
     // this should improve codegen just a little.
-    RHS = Visit(E->getRHS());
-    LHS = EmitCheckedLValue(E->getLHS(), CodeGenFunction::TCK_Store);
+    if (E->getType()->isFloatingType()) {
+      //  Preserve the old values, enable FPFeatures for these expressions
----------------
In the previous rendition of this patch, when the Builder.FMF settings were modified at Visit(BinaryExpression), the assign is seen as a binary expression and so the FPFeatures was passed into IRBuilder. I'm not confident this patch is in the right place, I'd really like to put FPFeatures onto the CallExpr node, because if you call a builtin intrinsic function, and the mode is set to float_control(except, on), the call node for the intrinsic doesn't have the FPFeature bits, so it isn't marked as expected. Before I make that change I want @rjmccall to take another look;  If FPFeatures was on CallExpr then I'd remove it here and modify IRBuilder.FMF when visiting CallExpr 

================
Comment at: clang/lib/CodeGen/CGExprScalar.cpp:462
+      return StmtVisitor<ScalarExprEmitter, Value*>::Visit(E);
+    }
     return StmtVisitor<ScalarExprEmitter, Value*>::Visit(E);
----------------
rjmccall wrote:
> You can override `VisitBinOp` and just do this in that case.  But why does it need to be done at this level at all, setting global state on the builder for all emissions, instead of in the leaves where we know we're emitting floating-point operations?  This is adding a lot of overhead in some of the most commonly-exercised code paths in IRGen, but building FP expressions is relatively uncommon.  I would definitely prefer a little bit of repetitive code over burdening the common case this much.  It might also be nice to figure out when this is unnecessary.
> 
> Also, please extract a function to make FastMathFlags from FPOptions; you'll need it elsewhere, e.g. in CGExprComplex.
I removed it from here and pushed this work towards the leaves. I decided that I should put FPFeatures onto UnaryOperator nodes which was left as a FIXME by an earlier author in this area. I added the FastMathFlags function like you suggested but i suppose it needs to be moved out of this file.

================
Comment at: clang/test/CodeGenOpenCL/builtins-r600.cl:5
 // CHECK-LABEL: @test_recipsqrt_ieee_f32
-// CHECK: call float @llvm.r600.recipsqrt.ieee.f32
+// CHECK: call contract float @llvm.r600.recipsqrt.ieee.f32
 void test_recipsqrt_ieee_f32(global float* out, float a)
----------------
OpenCL CompilerInvocation always sets fp_contract to "on"; inside clang I check if either fp_contract==on or fp_contract==fast, that expression is used to set the IRBuilder.FMF.contract bit.

CUDA CompilerInvocation always sets fp_contract to "fast"

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D72841/new/

https://reviews.llvm.org/D72841