[llvm] r271510 - [X86][SSE] Replace (V)CVTTPS2DQ and VCVTTPD2DQ truncating (round to zero) f32/f64 to i32 with generic IR (llvm)

Eli Friedman via llvm-commits llvm-commits at lists.llvm.org
Wed Jul 6 13:27:58 PDT 2016


On Wed, Jul 6, 2016 at 1:09 PM, Simon Pilgrim <llvm-dev at redking.me.uk>
wrote:

> So that patch was mainly about matching the behaviour of _mm_cvttsd_si32
> (cvttsd2si), which already did the scalar equivalent.
>
> It turns out there is a problem if constant folding of an out-of-range
> value occurs: LangRef says the result should be undefined (although the
> fold actually sets the result to zero), but the cvttsd2si/cvttps2dq/cvttpd2dq
> instructions guarantee that the result is 0x80000000.
>
> I’ll prepare a patch that reverts these changes (including for the
> scalar) and performs a fast-math only combine instead.
>

Okay.

I just looked into it, and it turns out I screwed this up back in 2009
(r72979).  Apparently nobody noticed. :(

-Eli
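
For readers following the thread, the mismatch described above can be sketched in portable C. This is only an illustrative model, not LLVM code: the helper name cvttsd2si_model is hypothetical, and it assumes the 32-bit form of the instruction, which returns 0x80000000 for NaN and for any input whose truncated value does not fit in a signed 32-bit integer. A generic fptosi has no such guarantee for those inputs, so a constant folder is free to produce something else (zero, in the case discussed here).

```c
#include <math.h>
#include <stdint.h>

/* Hypothetical helper (not part of LLVM): models the value that the x86
   cvttsd2si instruction is described as producing for a 32-bit result. */
static int32_t cvttsd2si_model(double x) {
    /* NaN, or a value whose truncation toward zero does not fit in
       int32_t, yields 0x80000000, which is INT32_MIN as a signed value. */
    if (isnan(x) || x >= 2147483648.0 || x <= -2147483649.0) {
        return INT32_MIN;
    }
    /* In range: a C cast truncates toward zero, like the instruction. */
    return (int32_t)x;
}
```

With this model, cvttsd2si_model(3e9) returns INT32_MIN, whereas folding the equivalent out-of-range fptosi to 0 gives a different answer, which is why replacing the intrinsics with generic IR is only safe when the out-of-range result does not matter.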