[PATCH] D43441: [X86][AVX512DQ] Use packed instructions for scalar FP<->i64 conversions on 32-bit targets (PR31630)

David Kreitzer via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue May 15 15:36:50 PDT 2018


DavidKreitzer added a comment.

Thanks, Craig.



================
Comment at: lib/Target/X86/X86ISelLowering.cpp:25279
+      SDValue Res = DAG.getNode(ISD::INSERT_VECTOR_ELT, dl, VecInVT,
+                                DAG.getConstantFP(0.0, dl, VecInVT),
+                                Src, ZeroIdx);
----------------
DavidKreitzer wrote:
> RKSimon wrote:
> > craig.topper wrote:
> > > delena wrote:
> > > > Why do you need to insert into a zero vector? Can you insert into undef?
> > > I think so. I asked the same question before I commandeered it. It's probably no worse than the widening with undef we do for v2f32 legalization.
> > In the original patch I was just trying to be very sure there wasn't anything in the other source elements that could cause fp exceptions/overflow flags etc.
> I think the only possible side effect from the other source element being undef is raising "inexact". Do we care?
> 
Forget what I wrote. I was thinking of the INT-->FP case. For the FP-->INT case, why wouldn't we need to worry about raising spurious exceptions?
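
To make the concern concrete, here is a minimal software sketch of a 2-lane truncating double-->i64 conversion. It is only a model, not the patch's lowering and not real hardware: the names FPFlags, convertLane, convertPacked and usageExample are hypothetical, and the two booleans stand in for the MXCSR "invalid" and "inexact" bits. It assumes that a packed conversion reports the OR of the per-lane exceptions, so a NaN or out-of-range value left in the unused lane would raise "invalid" even when the lane we care about converts cleanly.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstdint>
#include <limits>

// Software model only: these flags stand in for the MXCSR exception bits.
struct FPFlags {
  bool Invalid = false;
  bool Inexact = false;
};

struct LaneResult {
  int64_t Value;
  FPFlags Flags;
};

struct PackedResult {
  std::array<int64_t, 2> Values;
  FPFlags Flags;
};

// Models one lane of a truncating double -> signed i64 conversion.
LaneResult convertLane(double X) {
  constexpr double TwoPow63 = 9223372036854775808.0; // 2^63, exact as a double
  // Start from the "integer indefinite" pattern (INT64_MIN).
  LaneResult R{std::numeric_limits<int64_t>::min(), {}};
  if (std::isnan(X) || X < -TwoPow63 || X >= TwoPow63) {
    R.Flags.Invalid = true; // NaN, infinity, or out of range
    return R;
  }
  double T = std::trunc(X);
  R.Value = static_cast<int64_t>(T); // in range, so the cast is well defined
  R.Flags.Inexact = (T != X);        // a fractional part was discarded
  return R;
}

// Models the packed form: every lane is converted, and the reported flags
// are the OR over all lanes, including lanes the caller never reads.
PackedResult convertPacked(const std::array<double, 2> &V) {
  PackedResult P{{{0, 0}}, {}};
  for (std::size_t I = 0; I < V.size(); ++I) {
    LaneResult L = convertLane(V[I]);
    P.Values[I] = L.Value;
    P.Flags.Invalid = P.Flags.Invalid || L.Flags.Invalid;
    P.Flags.Inexact = P.Flags.Inexact || L.Flags.Inexact;
  }
  return P;
}

// Lane 0 holds the scalar source; lane 1 is the padding element.
void usageExample() {
  const double NaN = std::numeric_limits<double>::quiet_NaN();
  std::array<double, 2> ZeroPadded{{42.0, 0.0}};
  std::array<double, 2> GarbagePadded{{42.0, NaN}}; // one possible "undef"
  assert(!convertPacked(ZeroPadded).Flags.Invalid);
  assert(convertPacked(GarbagePadded).Flags.Invalid); // spurious exception
}
```

In this model the zero-padded input reports exactly the flags of the scalar conversion, while the input with arbitrary contents in lane 1 can report "invalid" (or "inexact") even though lane 0 is unaffected.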

Also, I'm probably missing something, but it looks like this code is expected to kick in for both 32-bit and 64-bit and both signed and unsigned FP-->i64. Is that intentional? For the 64-bit signed case, I would think we would prefer CVTTSx2SI.


https://reviews.llvm.org/D43441




