[PATCH] Flag to enable IEEE-754 friendly FP optimizations
Mehdi Amini
mehdi.amini at apple.com
Fri Mar 27 09:13:44 PDT 2015
Hi Sergey,
I’m sorry that in 6 weeks nobody was able to approve your patches despite your pings.
The code LGTM, but I don’t feel qualified enough to approve the global direction where this leads and I rather have someone else reviewing the “conceptual” part.
Add CC Hal, who was involved at the beginning of the thread.
Best,
—
Mehdi
> On Mar 26, 2015, at 4:28 AM, Sergey Dmitrouk <sdmitrouk at accesssoftek.com> wrote:
>
> Ping.
>
> On Mon, Mar 16, 2015 at 09:55:45PM +0200, Sergey Dmitrouk wrote:
>> Gentle ping.
>>
>> On Thu, Feb 26, 2015 at 01:33:19PM +0200, Sergey Dmitrouk wrote:
>>> Ping.
>>>
>>> On Thu, Feb 19, 2015 at 05:34:26PM +0200, Sergey Dmitrouk wrote:
>>>> Ping.
>>>>
>>>> On Tue, Feb 10, 2015 at 08:12:13PM +0200, Sergey Dmitrouk wrote:
>>>>> Attached is alternative fix for InstCombine hanging, it disables only a
>>>>> couple of conversions of "fadd" into "fsub" to break the loop.
>>>>>
>>>>> Modifying dyn_castFNegVal() directly causes some unit tests to fail, so
>>>>> absence of checks seems to be intensional to enable more transformations.
>>>>> As there exist "fsub" to "fadd" transformations and vice versa, making
>>>>> some forms as canonical would disable at least one of them. So the
>>>>> patch disables replacements of "fadd" with "fsub" for "fadd"
>>>>> instructions marked with floating-point environment access flags.
>>>>>
>>>>> Regards,
>>>>> Sergey
>>>>>
>>>>> On Thu, Feb 05, 2015 at 07:56:56PM +0200, Sergey Dmitrouk wrote:
>>>>>> Hello,
>>>>>>
>>>>>> Thanks for reviewing this.
>>>>>>
>>>>>> On Wed, Feb 04, 2015 at 10:42:35AM -0800, Mehdi Amini wrote:
>>>>>>> Patch 1:
>>>>>>>
>>>>>>> The comment for:
>>>>>>> + unsigned AllowFPExceptAccess : 1;
>>>>>>> mentions rounding and the comment for:
>>>>>>> + unsigned AllowFPRoundAccess : 1;
>>>>>>> mentions exceptions, they are reversed.
>>>>>>
>>>>>> Updated.
>>>>>>
>>>>>>> Patch 2:
>>>>>>>
>>>>>>> Is kexc or kround flag allowed to be present at the same time as fast?
>>>>>>
>>>>>> Yes, currently they are. There is no particular reason for this except
>>>>>> for me not being able to decide whether "fast" should reset new flags or
>>>>>> not. Left it unrelated to each other as they are unlikely to be used at
>>>>>> the same time, ready to change it if you have any suggestions.
>>>>>>
>>>>>>> Patch 3:
>>>>>>>
>>>>>>> isSafeToOptimizeFPOp() is not clear to me, especially how TLI->hasFloatingPointExceptions() disable any check, including rounding, while the description of this flag is "Whether the target supports or cares about preserving floating point exception behavior”, which seems orthogonal to getTarget().Options.AllowFPRoundAccess?
>>>>>>> I assume it is done this way to keep existing behavior? Is it transitional?
>>>>>>
>>>>>> That's a mistake, it was fine until recent change for rounding when I
>>>>>> reordered if statements incorrectly. Thanks.
>>>>>>
>>>>>>> Patch 4:
>>>>>>>
>>>>>>> isSafeToSpeculativelyExecute(), comment might deserve to be updated , I don’t think it is true anymore with respect to rounding:
>>>>>>> "Return true if the instruction does not have any effects besides calculating the result and does not have undefined behavior”.
>>>>>>
>>>>>> Updated.
>>>>>>
>>>>>>> canTrap(), again preserving the dynamic “rounding” is included in “trap”? Maybe the comment could be updated because it does not seem obvious to me.
>>>>>>
>>>>>> Actually, rounding flag wasn't even used and exceptions flag was
>>>>>> unnecessary, so I just reverted that part. It was required with flags
>>>>>> being detached from instructions, but now that they are part of
>>>>>> fast-math flags checking constant expressions should be no-operation
>>>>>> (operations that go against flags must not be converted to constant
>>>>>> expressions).
>>>>>>
>>>>>>> Patch 10: if you really want to test it, maybe the C++ unittests can do the job?
>>>>>>
>>>>>> Thanks, added a unit-test and a correction for "frem" instruction.
>>>>>>
>>>>>>> Patch 12: haven’t look into detail where the infinite loop occurs, but it seems like a hammer fix for the issue. The example mentioned in the test ((a + b) => (a - (-b)) => (a + b) => etc) makes me thinks that one of the two forms should be canonical and InstCombine should not make one of these two transforms. What do you think?
>>>>>>
>>>>>> I chose this way as it should guarantee to work for all cases, but you
>>>>>> might be right in that something is wrong with pattern matching in
>>>>>> InstCombine, see below.
>>>>>>
>>>>>> There is an implicit four step loop between InstCombiner::visitFAdd() and
>>>>>> InstCombiner::visitFSub():
>>>>>>
>>>>>> 1. FAdd: Convert A to -A.
>>>>>> 2. FSub: Convert -A to A; switch operands.
>>>>>> 3. FAdd: Convert B to -B.
>>>>>> 4. FSub: Convert -B to B; switch operands.
>>>>>> 5. Go to step 1.
>>>>>>
>>>>>> Both functions try to do something about negative values, but they
>>>>>> actually hang in case of positive arguments as well. I couldn't find
>>>>>> any checks whether operands are negative, it seems to be "assumed".
>>>>>> Maybe dyn_castFNegVal() used to perform the check, but it doesn't anymore.
>>>>>>
>>>>>> If I add `C->isNegative()` check to dyn_castFNegVal(), the issue
>>>>>> disappears, but I'm not sure if it's the right place. Meaning of the
>>>>>> comment there:
>>>>>>
>>>>>> // Constants can be considered to be negated values if they can be
>>>>>> // folded.
>>>>>> if (ConstantFP *C = dyn_cast<ConstantFP>(V))
>>>>>> return ConstantExpr::getFNeg(C);
>>>>>>
>>>>>> is unclear to me. Does it suggest that the function doesn't care about
>>>>>> sign of the value? In this case, checks for negative operands should be
>>>>>> performed outside (in visitFAdd() and visitFSub() functions).
>>>>>>
>>>>>> Updated patches are attached (0000-recent-changes.diff contains latest
>>>>>> changes only). There is nothing for InstCombine yet and a small new
>>>>>> change (not counting test file) for LICM pass.
>>>>>>
>>>>>> Thanks,
>>>>>> Sergey
>>>>>
>>>>
>>
More information about the llvm-commits
mailing list