<div dir="ltr"><div dir="ltr"><div dir="ltr" class="gmail_attr">On Fri, Mar 13, 2020 at 3:17 AM John McCall <<a href="mailto:rjmccall@gmail.com">rjmccall@gmail.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">On Thu, Mar 12, 2020 at 3:51 PM Blower, Melanie I<br>
<<a href="mailto:melanie.blower@intel.com" target="_blank">melanie.blower@intel.com</a>> wrote:<br>
> Oh I like Option 4. I think I understand what you mean. Clang would create and insert a “floating point region statement” into the code stream whenever we need to modify the floating point state.<br>
<br>
Or just some sort of PragmaStmt within a block, and some sort of<br>
function-scope attribute to capture global pragma state.</blockquote><div><br></div><div>Different blocks inside the same function may have different FP environments, so FP attributes cannot have function scope; we should associate them with CompoundStmt. We might have a bit in the state of CompoundStmt to indicate that the FP state is modified in it. The relevant FP attributes could be kept in some node inside the compound statement. PragmaStmt could be a good choice for the node name, as all state modifications are made by pragmas so far. It could also be used to keep information about other pragmas, if that matters.</div><div><br></div><div>However, we need something similar for initializers and default arguments at least. A statement cannot be used there, so we also need to introduce PragmaDecl. This node may be used instead of PragmaStmt, if wrapped in a DeclStmt. </div><div><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"> However, it would be much<br>
harder for clients like the constant evaluator that might be sensitive<br>
to context-sensitive evaluation settings, to the extent that some of<br>
these pragmas affect formal evaluation rules (by e.g. specifying<br>
static but non-standard rounding) rather than just allowing looser<br>
 optimization. </blockquote><div><br></div><div>I hope that keeping a stack of state pragmas in Sema may alleviate this problem. It may be done by keeping only the top pragma in Sema and saving the previous pragma in RAII objects, as we already do in many cases. CodeGen also needs to maintain such a stack. Some functions, mainly static ones, will require an additional argument, such as Sema, because some AST nodes may not have all the information about the operation they represent. The added complexity seems modest.<br></div><div><br></div><div><div dir="ltr" class="gmail_signature" data-smartmail="gmail_signature">Thanks,<br>--Serge<br></div></div><br></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Fri, Mar 13, 2020 at 3:17 AM John McCall <<a href="mailto:rjmccall@gmail.com">rjmccall@gmail.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">On Thu, Mar 12, 2020 at 3:51 PM Blower, Melanie I<br>
<<a href="mailto:melanie.blower@intel.com" target="_blank">melanie.blower@intel.com</a>> wrote:<br>
> Oh I like Option 4. I think I understand what you mean. Clang would create and insert a “floating point region statement” into the code stream whenever we need to modify the floating point state.<br>
<br>
Or just some sort of PragmaStmt within a block, and some sort of<br>
function-scope attribute to capture global pragma state. That would<br>
certainly be easier for IR generation. However, it would be much<br>
harder for clients like the constant evaluator that might be sensitive<br>
to context-sensitive evaluation settings, to the extent that some of<br>
these pragmas affect formal evaluation rules (by e.g. specifying<br>
static but non-standard rounding) rather than just allowing looser<br>
optimization.<br>
<br>
I want to note that using as few bits as possible is only important to<br>
the extent that you need these bits in common representations. The<br>
narrower the case in which extra storage is required, the more extra<br>
storage it's okay to use. So if you need storage in every binary<br>
operator, you need to be very sparing; if you only need storage in<br>
binary operators on FP types, it's fine to use a little more memory;<br>
if you only need storage in binary operators on FP types when the<br>
pragmas are actually being used, your budget is quite expansive; and<br>
if you can avoid per-expression overhead entirely by doing something<br>
at the scope level — especially if it's only needed when there's an<br>
explicit pragma in that scope — your budget is basically unlimited.<br>
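<br>
As a rough illustration of the scope-level end of that spectrum, here is a minimal sketch, assuming each block stores at most one pointer and that a settings object exists only for blocks that contain a pragma. All names here (FPSettings, Block, effectiveSettings) are hypothetical and are not Clang APIs; the sketch also assumes that each override already holds the fully merged state.<br>

```cpp
// Hypothetical sketch; none of these names exist in Clang.
struct FPSettings {
  unsigned RoundingMode;   // assumed encoding: 0 = to nearest
  bool AllowExceptions;
};

// A block pays for one pointer only; a settings object is needed
// only for blocks that actually contain a pragma.
struct Block {
  const Block *Parent;          // enclosing block, or nullptr at function level
  const FPSettings *Override;   // nullptr unless a pragma appeared in this block
};

// Returns the settings in effect inside B: the nearest enclosing override,
// or the command-line defaults when no enclosing block has a pragma.
// Assumes every override already holds the fully merged state.
const FPSettings *effectiveSettings(const Block *B, const FPSettings *Defaults) {
  for (; B != nullptr; B = B->Parent)
    if (B->Override != nullptr)
      return B->Override;
  return Defaults;
}
```

In this sketch expression nodes carry nothing; a client such as the constant evaluator would need to know the enclosing block to call effectiveSettings, which is the context-sensitivity cost mentioned above.<br>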
<br>
John.<br>
<br>
<br>
<br>
<br>
><br>
><br>
><br>
> Globally there is the floating point state that comes from the command line options. Any pragma that modifies the floating point state must occur at the start of a block; this would get encoded into a FloatingPointRegionStmt. Any blocks nested within this block take the settings of the outer block. The floating point state reverts to the surrounding state when the block is exited. Expression nodes no longer need to carry around FPOptions.<br>
><br>
><br>
><br>
> float f(float a, float b) {<br>
><br>
> #pragma float_control(except, off) // the floating point environment from the command line is merged with the settings from the pragma and inserted at the beginning of the compound statement<br>
><br>
> return a*b + 3;<br>
><br>
> }<br>
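<br>
The inherit-and-revert behavior described above can be sketched as a save/restore guard around each block. This is only a sketch under stated assumptions: FPState, FPContext, FPRegionScope and demoNestedRegions are hypothetical names, not Clang classes, and the rounding-mode encoding is made up for the illustration.<br>

```cpp
// Hypothetical sketch; none of these names are Clang APIs.
struct FPState {
  bool AllowExceptions;    // models float_control(except, on/off)
  unsigned RoundingMode;   // assumed encoding: 0 = to nearest
};

// Stands in for the "current FP state" that a consumer walking the code would track.
struct FPContext {
  FPState Current;
};

// RAII guard: entering a block saves the surrounding state,
// leaving the block restores it.
class FPRegionScope {
  FPContext *Ctx;
  FPState Saved;
public:
  explicit FPRegionScope(FPContext *C) : Ctx(C), Saved(C->Current) {}
  ~FPRegionScope() { Ctx->Current = Saved; }
  FPRegionScope(const FPRegionScope &) = delete;
  FPRegionScope &operator=(const FPRegionScope &) = delete;
};

// Mirrors the shape of the example: an outer block with a pragma and a
// nested block. Returns true if every checkpoint saw the expected state.
bool demoNestedRegions() {
  FPContext Ctx{{true, 0}};                  // state from the command line
  bool Ok = true;
  {
    FPRegionScope Outer(&Ctx);
    Ctx.Current.AllowExceptions = false;     // #pragma float_control(except, off)
    {
      FPRegionScope Inner(&Ctx);             // nested block inherits outer settings
      Ok = Ok && !Ctx.Current.AllowExceptions;
      Ctx.Current.RoundingMode = 1;          // a change local to the inner block
    }
    Ok = Ok && Ctx.Current.RoundingMode == 0;   // inner change reverted
    Ok = Ok && !Ctx.Current.AllowExceptions;    // outer pragma still in effect
  }
  Ok = Ok && Ctx.Current.AllowExceptions;       // back to the command-line state
  return Ok;
}
```

Only the current state lives in the context object; each guard keeps the previous state on the C++ call stack, so no explicit stack container is needed.<br>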
><br>
><br>
><br>
> Using Trailing storage on BinaryOperator is hard because CompoundAssignment derives from BinaryOperator and Trailing storage demands that BinaryOperator be finalized.<br>
><br>
><br>
><br>
> Also I got an email a while ago – I think that floating point state also affects complex floating point literals. Intrinsics are called for these literals and so we’d need the information during codegen for those expressions too.<br>
><br>
><br>
><br>
> From: Serge Pavlov <<a href="mailto:sepavloff@gmail.com" target="_blank">sepavloff@gmail.com</a>><br>
> Sent: Thursday, March 12, 2020 2:24 AM<br>
> To: Clang Dev <<a href="mailto:cfe-dev@lists.llvm.org" target="_blank">cfe-dev@lists.llvm.org</a>><br>
> Cc: <a href="mailto:hokein@google.com" target="_blank">hokein@google.com</a>; Sam McCall <<a href="mailto:sammccall@google.com" target="_blank">sammccall@google.com</a>>; Blower, Melanie I <<a href="mailto:melanie.blower@intel.com" target="_blank">melanie.blower@intel.com</a>>; Theko Lekena <<a href="mailto:mlekena@skidmore.edu" target="_blank">mlekena@skidmore.edu</a>>; Nicolas Lesser <<a href="mailto:blitzrakete@gmail.com" target="_blank">blitzrakete@gmail.com</a>>; Han Shen <<a href="mailto:shenhan@google.com" target="_blank">shenhan@google.com</a>>; Robinson, Paul <<a href="mailto:paul.robinson@sony.com" target="_blank">paul.robinson@sony.com</a>><br>
> Subject: Design of FPOptions<br>
><br>
><br>
><br>
> Hi all,<br>
><br>
><br>
><br>
> There are development efforts aimed at reducing the space occupied by the field FPFeatures (<a href="https://reviews.llvm.org/D72841" rel="noreferrer" target="_blank">https://reviews.llvm.org/D72841</a>). The expected result is a reduction of the FPFeatures size from 8 bits to 7, so it is really a battle for a single bit. I am concerned that such work can hurt the readability and maintainability of the code, and I would like to discuss alternatives.<br>
><br>
> Background<br>
><br>
> Two main things form the background of the problem.<br>
><br>
> First, operations that involve floating point arguments need to specify a floating point environment. The latter comprises bits of hardware state registers (rounding direction, denormal behavior, exception mask) and options and hints provided by the user to the compiler (whether NaNs may occur, whether the sign of zero must be preserved, whether exceptions may be ignored). The list of relevant options was discussed in <a href="http://lists.llvm.org/pipermail/llvm-dev/2020-January/138652.html" rel="noreferrer" target="_blank">http://lists.llvm.org/pipermail/llvm-dev/2020-January/138652.html</a>. It is convenient to collect all aspects of the FP environment into one object; this is what the class FPOptions does. Currently this class contains only a few attributes, but in <a href="https://reviews.llvm.org/D72841" rel="noreferrer" target="_blank">https://reviews.llvm.org/D72841</a> an attempt was made to extend it in the right direction.<br>
><br>
> Second, the Clang internal representation (AST) was implemented to be as compact as possible. The dark side of this principle is inconvenient class organization and diffuse class boundaries: derived classes keep their state in parent class fields, which are fixed in size. The FP environment is now represented by a field in the class BinaryOperatorBitfields and is only 8 bits in size.<br>
><br>
> The problem<br>
><br>
> 8 bits is too few to keep all FP-related state and options. So in <a href="https://reviews.llvm.org/D72841" rel="noreferrer" target="_blank">https://reviews.llvm.org/D72841</a> FPOptions was moved out of BinaryOperatorBitfields and made a field in the classes UnaryOperator, BinaryOperator, CallExpr and CXXOperatorCallExpr. It is a step in the right direction because, for example, UnaryOperator currently has no FPOptions.<br>
><br>
> This change, however, means that every instance of BinaryOperator, CallExpr and the other affected classes now increases in size. To reduce the space, the refactoring of <a href="https://reviews.llvm.org/D72841" rel="noreferrer" target="_blank">https://reviews.llvm.org/D72841</a> was started.<br>
><br>
> Current support for the FP environment in clang is weak; ongoing development in this direction will eventually increase the size required for FP state and options, and increase the number of AST nodes that require knowledge of the FP environment (these could be FloatingLiteral, InitListExpr, CXXDefaultArgExpr and some others). We must work out a solution that allows future extensions without massive changes.<br>
><br>
> Possible solutions<br>
><br>
> 1. Consider memory consumption a necessary evil and add FPOptions to every node where it is needed.<br>
><br>
><br>
> 2. Add FPOptions to subclasses, for example FloatingUnaryOperator and similar. This, however, would swamp the class hierarchy with duplicates and make handling the AST more complex, as we use static polymorphism, not dynamic.<br>
><br>
><br>
> 3. Use trailing objects. While this way is natural when the size of additional information is variable, using trailing objects to keep just some fields of an object is inconvenient.<br>
><br>
><br>
> 4. Keep FPOptions in a special node, say FloatingPointRegionStmt. As the FP environment can currently be modified only at block and function level, the same FPOptions is in effect replicated to all nodes of that block. This FloatingPointRegionStmt would be similar to nodes like ExprWithCleanups, as it would represent a notion rather than a source code construct. We could also embed FPOptions directly into CompoundStmt, but we must also provide a similar facility at least for initializers and default arguments.<br>
><br>
> I am inclined to solution 4, as it reduces memory consumption and at the same time does not limit the implementation of FPOptions.<br>
><br>
><br>
> Thanks,<br>
> --Serge<br>
</blockquote></div></div>