[PATCH] D43441: [X86][AVX512DQ] Use packed instructions for scalar FP<->i64 conversions on 32-bit targets (PR31630)

David Kreitzer via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue May 15 08:37:40 PDT 2018


DavidKreitzer added a comment.

Could you please add full context to the patch, Craig?



================
Comment at: lib/Target/X86/X86ISelLowering.cpp:25279
+      SDValue Res = DAG.getNode(ISD::INSERT_VECTOR_ELT, dl, VecInVT,
+                                DAG.getConstantFP(0.0, dl, VecInVT),
+                                Src, ZeroIdx);
----------------
RKSimon wrote:
> craig.topper wrote:
> > delena wrote:
> > > Why do you need to insert into zero vector? Can you insert to undef?
> > I think so. I asked the same question before I commandeered it. It's probably no worse than the widening with undef we do for v2f32 legalization.
> In the original patch I was just trying to be very sure there wasn't anything in the other source elements that could cause fp exceptions/overflow flags etc.
I think the only possible side effect from the other source element being undef is raising "inexact". Do we care?
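
For anyone following the thread, here is a minimal scalar sketch of the flag being discussed. It is not the patch code, and `laneRaisesInexact` is a hypothetical helper name. It assumes a host (such as x86) whose FP-to-integer conversion instruction sets the inexact flag when it truncates; the C++ standard leaves that behaviour unspecified. A zero lane converts exactly, while an arbitrary lane value with a fractional part would typically set "inexact".

```cpp
#include <cfenv>
#include <cstdint>

// Hypothetical helper: reports whether converting one lane value from
// double to i64 leaves FE_INEXACT set. Assumes the hardware conversion
// raises the flag on truncation; the standard does not guarantee this.
bool laneRaisesInexact(double lane) {
  // volatile keeps the compiler from folding the conversion at compile
  // time or moving it across the floating-point environment calls.
  volatile double in = lane;
  std::feclearexcept(FE_ALL_EXCEPT);
  volatile std::int64_t out = static_cast<std::int64_t>(in);
  (void)out;
  return std::fetestexcept(FE_INEXACT) != 0;
}
```

In this model, a lane holding 0.0 (the zero-vector case) leaves the flag clear, whereas a lane holding something like 0.5 (one possible value of an undef lane) would set it on such hosts.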



================
Comment at: lib/Target/X86/X86ISelLowering.cpp:16048
+
+   if (!Subtarget.hasDQI() || SrcVT != MVT::i64 || Subtarget.is64Bit() ||
+       (VT != MVT::f32 && VT != MVT::f64))
----------------
I suspect we also want to use this vector sequence for unsigned i64 conversions on 64-bit.
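
To make the quoted condition easier to reason about, here is a standalone model of it. `SimpleVT`, `SubtargetModel` and `usesPackedConversion` are hypothetical stand-ins, not LLVM types, and the sketch assumes the quoted `if` guards an early exit from the packed lowering. As written, the guard rejects every 64-bit target, which is the case the suggestion above would revisit for unsigned sources.

```cpp
// Hypothetical stand-ins for the LLVM value types and subtarget queries.
enum class SimpleVT { i32, i64, f32, f64 };

struct SubtargetModel {
  bool hasDQI;   // models Subtarget.hasDQI()
  bool is64Bit;  // models Subtarget.is64Bit()
};

// Returns true when the packed sequence would be used, assuming the
// quoted condition is an early-exit guard (an assumption of this sketch).
bool usesPackedConversion(const SubtargetModel &ST, SimpleVT SrcVT,
                          SimpleVT VT) {
  if (!ST.hasDQI || SrcVT != SimpleVT::i64 || ST.is64Bit ||
      (VT != SimpleVT::f32 && VT != SimpleVT::f64))
    return false;
  return true;
}
```

Only 32-bit AVX512DQ targets converting i64 to f32 or f64 pass this model of the guard.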



https://reviews.llvm.org/D43441




