[PATCH] Returns NaN for sqrt with negative fp argument
James Molloy
james.molloy at arm.com
Wed Jun 11 00:14:46 PDT 2014
Hi all,
Could I please weigh in here? I’d like the behaviour to be NaN too, but for a different reason.
I’ve just found a situation (in an Android benchmark) where we’re significantly behind GCC. GCC does this optimization:
# cat test.c
#include <math.h>

double g(double p) {
  return sqrt(1.0 - sin(p));
}
# gcc -O3
g:
        stp     x29, x30, [sp, -16]!
        add     x29, sp, 0
        bl      sin
        fmov    d1, 1.0e+0
        fsub    d1, d1, d0
        fsqrt   d0, d1
        fcmp    d0, d0
        bne     .L5
.L3:
        ldp     x29, x30, [sp], 16
        ret
.L5:
        fmov    d0, d1
        bl      sqrt
        b       .L3
That is, it optimistically uses the sqrt instruction, then falls back to the real sqrt if something went wrong. Importantly, the way to detect if something went wrong is "check if the result is NaN". This optimization could easily be implemented in LLVM and I am actively looking at doing so (GCC does it at -O1 for at least X86, ARM and AArch64). But it'll only work if llvm.sqrt() returns NaN on out-of-domain inputs!
Does this swing the case?
Cheers,
James
From: llvm-commits-bounces at cs.uiuc.edu [mailto:llvm-commits-bounces at cs.uiuc.edu] On Behalf Of Jiangning Liu
Sent: 11 June 2014 05:45
To: Tim Northover
Cc: llvm-commits at cs.uiuc.edu for LLVM
Subject: Re: [PATCH] Returns NaN for sqrt with negative fp argument
Tim,
2014-06-10 19:01 GMT+08:00 Tim Northover <t.p.northover at gmail.com>:
> I think I understand this a little better now. Following the LLVM IR spec, when
> using -ffast-math, -menable-no-nans and -menable-unsafe-fp-math are enabled, and
> it sounds reasonable to retain defined/stable/safe behavior. Since we can't
> return NaN, 0.0 is a choice.
I think we probably *could* return a NaN, other phases would just be
perfectly entitled to assume we hadn't. Which makes it a not
particularly useful thing to return.
0.0 probably isn't too bad either though; the advantage of "undef" is
that it looks dodgy when scanning through IR so might make tracking
down bugs easier.
If the "undef" you are talking about is the one in LLVM IR, I don't think it is OK, because undef means an undefined value, and for a well-defined program this undef shouldn't affect the semantics. But in a case like sqrt(-2.01), the original C code is trying to use NaN, which is a defined value (if we ignore errno), to indicate that the number is in an invalid state. For example, if we replace an LLVM IR undef with a fixed number (e.g. 0) in a shufflevector mask, the program logic stays the same. But where the C code uses sqrt(-2.01) to produce NaN, we can't replace it with a fixed number, or the program would fail.
Anyway, I see a lot of discussion on the internet saying that fast-math implies no checks for NaN. If users' code logic depends on NaN, it should just not be expected to work with fast-math.
Another interesting finding is that gcc aarch64 and gcc x86 behave differently for -ffast-math. For gcc x86, an fp zero is used, but gcc aarch64 either keeps the sqrt lib call or uses the sqrt instruction. I would say gcc aarch64 behaves unexpectedly, but I couldn't simply say it is incorrect for fast-math. Or maybe the fp zero return for sqrt(-2.01) originally came from the gcc x86 behavior?
Thanks,
-Jiangning
Cheers.
Tim.