[PATCH] D11316: [X86] -- Fix fptoui i64 conversions for IA32 (performance and correctness)

Fri Jul 17 16:30:35 PDT 2015

mbodart created this revision.
mbodart added a subscriber: llvm-commits.

Fixes assertion failures when using fptosi or fptoui with f80 and AVX512. The operation actions for these conversions need to be left as Custom, not Legal. Legality for f32/f64 is handled by having FP_TO_INTHelper do nothing for those cases.

Suppresses use of the MSVC ftol2 library function for fptoui i64. That function performs a conversion to *signed* i64, so the results were incorrect for source values >= 2^63. I didn't rip out the now-dead references to ftol2, but that can be done in a small follow-up change set if desired.

Implements an inline sequence for fptoui i64 on 32-bit X86. This is mostly in FP_TO_INTHelper; it replaces the ftol2 usage on Windows and the calls to fixuns{sf,df,xf}di on non-Windows targets. Improves performance by 6X under SSE3, 3X otherwise.
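
For readers unfamiliar with the problem, here is a rough C++ sketch of how an unsigned 64-bit result can be built from a signed-only conversion. It illustrates the general threshold-adjust technique and is not claimed to match the instruction sequence this patch emits. The function name fp_to_u64_via_signed is hypothetical, and the sketch assumes the input lies in [0, 2^64).

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper for illustration only; not taken from the patch.
// Converts a double in [0, 2^64) to uint64_t using only a signed
// double -> int64_t conversion.
uint64_t fp_to_u64_via_signed(double x) {
    const double two63 = 9223372036854775808.0;   // 2^63, exactly representable
    const double two64 = 18446744073709551616.0;  // 2^64, exactly representable

    // Assumed precondition: the value fits in an unsigned 64-bit integer.
    assert(x >= 0.0 && x < two64);

    // Values >= 2^63 do not fit in a signed i64, so bring them into range.
    // The subtraction is exact for doubles in [2^63, 2^64).
    const bool big = x >= two63;
    const double adjusted = big ? x - two63 : x;

    // The signed conversion is now in range: adjusted lies in [0, 2^63).
    const int64_t s = static_cast<int64_t>(adjusted);
    const uint64_t u = static_cast<uint64_t>(s);

    // Restore the 2^63 that was subtracted by setting the top bit.
    return big ? (u ^ 0x8000000000000000ULL) : u;
}
```

A plain signed conversion gives the right answer only on the small path (x < 2^63), which is why a signed routine such as ftol2 cannot be used directly for fptoui i64.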

http://reviews.llvm.org/D11316

Files:
  lib/Target/X86/X86ISelLowering.cpp
  test/CodeGen/X86/pr17631.ll
  test/CodeGen/X86/scalar-fp-to-i64.ll
  test/CodeGen/X86/win_ftol2.ll

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D11316.30048.patch
Type: text/x-patch
Size: 17980 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20150717/486acf41/attachment.bin>