[PATCH] D27618: Failure to vectorize __builtin_sqrt/__builtin_sqrtf
Hal Finkel via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Mon Dec 12 01:33:31 PST 2016
hfinkel added inline comments.
================
Comment at: lib/Analysis/ValueTracking.cpp:2526
case LibFunc::sqrtl:
- if (ICS->hasNoNaNs())
- return Intrinsic::sqrt;
- return Intrinsic::not_intrinsic;
+ return Intrinsic::sqrt;
}
----------------
RKSimon wrote:
> fhahn wrote:
> > hfinkel wrote:
> > > RKSimon wrote:
> > > > Do we know the history behind why sqrt was protected by this?
> > > The problem, which has come up a lot, is that our sqrt intrinsic is a special case. Unlike the other intrinsics for math functions, it does not exactly mirror the behavior of the corresponding libm function. As noted in the LangRef, "Unlike sqrt in libm, however, llvm.sqrt has undefined behavior for negative numbers other than -0.0." Thus, unless we get to assume no NaNs, it is possible that the intrinsic might have UB in cases where the libm function does not, and as a result, we can't substitute the intrinsic for the libm call.
> > I think the checks were added in https://reviews.llvm.org/rL265521 and I think the reasoning behind the check is still sound.
> >
> > LibFun::sqrt and Intrinsic::sqrt differ in the way they treat negative elements, the intrinsic has undefined behavior for negative numbers (http://llvm.org/docs/LangRef.html#llvm-sqrt-intrinsic).
> What do you think is the best approach going forward? Is there a way to detect when the target lib func will in fact lower to an instruction (as per SSE sqrtss/sqrtsd) and will handle nan correctly?
>
> Failing that it starting to sound like we should just do a fixup vectorization during x86 lowering of a build_vector of sqrt ops as I suggested in https://llvm.org/bugs/show_bug.cgi?id=27435#c3
> What do you think is the best approach going forward? Is there a way to detect when the target lib func will in fact lower to an instruction (as per SSE sqrtss/sqrtsd) and will handle nan correctly?
We can't have target-dependent semantics for a target-independent IR-level intrinsic. I'd prefer that we somehow eliminate the IR-level special case here, either by adding a new intrinsic to represent sqrt with the regular semantics or add some argument to the existing intrinsic to indicate that all negative numbers, etc. are supported. A second intrinsic is probably easier.
The fact that we have an intrinsic corresponding to the libm functions we handle except for sqrt, for which this is only true in no-NaNs mode, causes all sorts of inconveniences (and, as in this case, missed opportunities).
https://reviews.llvm.org/D27618
More information about the llvm-commits
mailing list