[PATCH] Flag to enable IEEE-754 friendly FP optimizations

Tue May 26 09:53:59 PDT 2015

Hi Sergey —

I don’t feel qualified to review all of these patches, but I will speak to the global direction being an important step forward.  If someone feels qualified to review the low-level details of the patches, I think that these should go in.

Thanks for your persistence!
– Steve

> On May 26, 2015, at 9:43 AM, Sergey Dmitrouk <sdmitrouk at accesssoftek.com> wrote:
> 
> Hi there,
> 
> Nope, I didn't forget about these changes :)  Ping!
> 
> Patches rebased onto ToT can be found in the attachment.
> 
> -- 
> Sergey
> 
> On Fri, Mar 27, 2015 at 09:13:44AM -0700, Mehdi Amini wrote:
>> Hi Sergey,
>> 
>> I’m sorry that in 6 weeks nobody was able to approve your patches despite your pings.
>> 
>> The code LGTM, but I don’t feel qualified enough to approve the global direction where this leads and I rather have someone else reviewing the “conceptual” part.
>> 
>> Add CC Hal, who was involved at the beginning of the thread.
>> 
>> Best,
>> 
>> —
>> Mehdi
>> 
>>> On Mar 26, 2015, at 4:28 AM, Sergey Dmitrouk <sdmitrouk at accesssoftek.com> wrote:
>>> 
>>> Ping.
>>> 
>>> On Mon, Mar 16, 2015 at 09:55:45PM +0200, Sergey Dmitrouk wrote:
>>>> Gentle ping.
>>>> 
>>>> On Thu, Feb 26, 2015 at 01:33:19PM +0200, Sergey Dmitrouk wrote:
>>>>> Ping.
>>>>> 
>>>>> On Thu, Feb 19, 2015 at 05:34:26PM +0200, Sergey Dmitrouk wrote:
>>>>>> Ping.
>>>>>> 
>>>>>> On Tue, Feb 10, 2015 at 08:12:13PM +0200, Sergey Dmitrouk wrote:
>>>>>>> Attached is alternative fix for InstCombine hanging, it disables only a
>>>>>>> couple of conversions of "fadd" into "fsub" to break the loop.
>>>>>>> 
>>>>>>> Modifying dyn_castFNegVal() directly causes some unit tests to fail, so
>>>>>>> absence of checks seems to be intensional to enable more transformations.
>>>>>>> As there exist "fsub" to "fadd" transformations and vice versa, making
>>>>>>> some forms as canonical would disable at least one of them.  So the
>>>>>>> patch disables replacements of "fadd" with "fsub" for "fadd"
>>>>>>> instructions marked with floating-point environment access flags.
>>>>>>> 
>>>>>>> Regards,
>>>>>>> Sergey
>>>>>>> 
>>>>>>> On Thu, Feb 05, 2015 at 07:56:56PM +0200, Sergey Dmitrouk wrote:
>>>>>>>> Hello,
>>>>>>>> 
>>>>>>>> Thanks for reviewing this.
>>>>>>>> 
>>>>>>>> On Wed, Feb 04, 2015 at 10:42:35AM -0800, Mehdi Amini wrote:
>>>>>>>>> Patch 1:
>>>>>>>>> 
>>>>>>>>> The comment for:
>>>>>>>>> +    unsigned AllowFPExceptAccess : 1;
>>>>>>>>> mentions rounding and the comment for:
>>>>>>>>> +    unsigned AllowFPRoundAccess : 1;
>>>>>>>>> mentions exceptions, they are reversed.
>>>>>>>> 
>>>>>>>> Updated.
>>>>>>>> 
>>>>>>>>> Patch 2:
>>>>>>>>> 
>>>>>>>>> Is kexc or kround flag allowed to be present at the same time as fast?
>>>>>>>> 
>>>>>>>> Yes, currently they are.  There is no particular reason for this except
>>>>>>>> for me not being able to decide whether "fast" should reset new flags or
>>>>>>>> not.  Left it unrelated to each other as they are unlikely to be used at
>>>>>>>> the same time, ready to change it if you have any suggestions.
>>>>>>>> 
>>>>>>>>> Patch 3:
>>>>>>>>> 
>>>>>>>>> isSafeToOptimizeFPOp() is not clear to me, especially how TLI->hasFloatingPointExceptions() disable any check, including rounding, while the description of this flag is "Whether the target supports or cares about preserving floating point exception behavior”, which seems orthogonal to getTarget().Options.AllowFPRoundAccess?
>>>>>>>>> I assume it is done this way to keep existing behavior? Is it transitional?
>>>>>>>> 
>>>>>>>> That's a mistake, it was fine until recent change for rounding when I
>>>>>>>> reordered if statements incorrectly.  Thanks.
>>>>>>>> 
>>>>>>>>> Patch 4:
>>>>>>>>> 
>>>>>>>>> isSafeToSpeculativelyExecute(), comment might deserve to be updated , I don’t think it is true anymore with respect to rounding:
>>>>>>>>> "Return true if the instruction does not have any effects besides calculating the result and does not have undefined behavior”.
>>>>>>>> 
>>>>>>>> Updated.
>>>>>>>> 
>>>>>>>>> canTrap(), again preserving the dynamic “rounding” is included in “trap”? Maybe the comment could be updated because it does not seem obvious to me.
>>>>>>>> 
>>>>>>>> Actually, rounding flag wasn't even used and exceptions flag was
>>>>>>>> unnecessary, so I just reverted that part.  It was required with flags
>>>>>>>> being detached from instructions, but now that they are part of
>>>>>>>> fast-math flags checking constant expressions should be no-operation
>>>>>>>> (operations that go against flags must not be converted to constant
>>>>>>>> expressions).
>>>>>>>> 
>>>>>>>>> Patch 10: if you really want to test it, maybe the C++ unittests can do the job?
>>>>>>>> 
>>>>>>>> Thanks, added a unit-test and a correction for "frem" instruction.
>>>>>>>> 
>>>>>>>>> Patch 12: haven’t look into detail where the infinite loop occurs, but it seems like a hammer fix for the issue. The example mentioned in the test ((a + b) => (a - (-b)) => (a + b) => etc) makes me thinks that one of the two forms should be canonical and InstCombine should not make one of these two transforms. What do you think?
>>>>>>>> 
>>>>>>>> I chose this way as it should guarantee to work for all cases, but you
>>>>>>>> might be right in that something is wrong with pattern matching in
>>>>>>>> InstCombine, see below.
>>>>>>>> 
>>>>>>>> There is an implicit four step loop between InstCombiner::visitFAdd() and
>>>>>>>> InstCombiner::visitFSub():
>>>>>>>> 
>>>>>>>> 1. FAdd: Convert A to -A.
>>>>>>>> 2. FSub: Convert -A to A; switch operands.
>>>>>>>> 3. FAdd: Convert B to -B.
>>>>>>>> 4. FSub: Convert -B to B; switch operands.
>>>>>>>> 5. Go to step 1.
>>>>>>>> 
>>>>>>>> Both functions try to do something about negative values, but they
>>>>>>>> actually hang in case of positive arguments as well.  I couldn't find
>>>>>>>> any checks whether operands are negative, it seems to be "assumed".
>>>>>>>> Maybe dyn_castFNegVal() used to perform the check, but it doesn't anymore.
>>>>>>>> 
>>>>>>>> If I add `C->isNegative()` check to dyn_castFNegVal(), the issue
>>>>>>>> disappears, but I'm not sure if it's the right place.  Meaning of the
>>>>>>>> comment there:
>>>>>>>> 
>>>>>>>> // Constants can be considered to be negated values if they can be
>>>>>>>> // folded.
>>>>>>>> if (ConstantFP *C = dyn_cast<ConstantFP>(V))
>>>>>>>>   return ConstantExpr::getFNeg(C);
>>>>>>>> 
>>>>>>>> is unclear to me.  Does it suggest that the function doesn't care about
>>>>>>>> sign of the value?  In this case, checks for negative operands should be
>>>>>>>> performed outside (in visitFAdd() and visitFSub() functions).
>>>>>>>> 
>>>>>>>> Updated patches are attached (0000-recent-changes.diff contains latest
>>>>>>>> changes only).  There is nothing for InstCombine yet and a small new
>>>>>>>> change (not counting test file) for LICM pass.
>>>>>>>> 
>>>>>>>> Thanks,
>>>>>>>> Sergey
> <0001-Add-flags-and-command-line-switches-for-FPEnv.patch><0002-Add-FPEnv-access-flags-to-fast-math-flags.patch><0003-Consider-FPEnv-access-in-SelectionDAG.patch><0004-Skip-constant-folding-to-preserve-FPEnv.patch><0005-Teach-IR-builder-and-folders-about-new-flags.patch><0006-Do-not-fold-constants-on-reading-in-IR-asm-bitcode.patch><0007-Prevent-undesired-folding-by-InstSimplify.patch><0008-Do-not-simplify-expressions-with-FPEnv-access.patch><0009-Make-Strict-flag-available-for-more-clients.patch><0010-Use-Strict-in-IRBuilder.patch><0011-Don-t-convert-fpops-to-constexprs-in-SCCP.patch><0012-Prevent-InstCombine-from-hanging.patch><0013-Don-t-hoist-FP-ops-with-side-effects-in-LICM.patch>