[PATCH] D43441: [X86][AVX512DQ] Use packed instructions for scalar FP<->i64 conversions on 32-bit targets (PR31630)
David Kreitzer via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Tue May 15 08:37:40 PDT 2018
DavidKreitzer added a comment.
Could you please add full context to the patch, Craig?
================
Comment at: lib/Target/X86/X86ISelLowering.cpp:25279
+ SDValue Res = DAG.getNode(ISD::INSERT_VECTOR_ELT, dl, VecInVT,
+ DAG.getConstantFP(0.0, dl, VecInVT),
+ Src, ZeroIdx);
----------------
RKSimon wrote:
> craig.topper wrote:
> > delena wrote:
> > > Why do you need to insert into zero vector? Can you insert to undef?
> > I think so. I asked the same question before I commandeered it. It's probably no worse than the widening with undef we do for v2f32 legalization.
> In the original patch I was just trying to be very sure there wasn't anything in the other source elements that could cause fp exceptions/overflow flags etc.
I think the only possible side effect from the other source element being undef is raising "inexact". Do we care?
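For reference, here is a minimal sketch of what raising "inexact" looks like. It assumes the scalar i64 -> f64 direction and a host that implements the `<cfenv>` status flags. `conversionRaisesInexact` is a hypothetical helper for illustration, not something from the patch, and it does not model the packed instruction or an undef lane.

```cpp
#include <cassert>
#include <cfenv>

// Hypothetical helper (not part of D43441): reports whether converting
// `value` from a 64-bit integer to double sets the IEEE "inexact" flag.
bool conversionRaisesInexact(long long value) {
  // volatile keeps the compiler from folding the conversion at compile time.
  volatile long long src = value;

  // Start from a clean set of floating-point status flags.
  int rc = std::feclearexcept(FE_ALL_EXCEPT);
  assert(rc == 0);
  (void)rc;

  // The conversion under test; the volatile store forces it to happen here.
  volatile double dst = static_cast<double>(src);
  (void)dst;

  // Inexact is set only if the integer had to be rounded to fit in a double.
  return std::fetestexcept(FE_INEXACT) != 0;
}
```

For example, `conversionRaisesInexact(1)` should report no flag, while `conversionRaisesInexact((1LL << 53) + 1)` should report inexact, because 2^53 + 1 has no exact double representation. The flag is a sticky status bit; by default it does not trap.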
================
Comment at: lib/Target/X86/X86ISelLowering.cpp:16048
+
+ if (!Subtarget.hasDQI() || SrcVT != MVT::i64 || Subtarget.is64Bit() ||
+ (VT != MVT::f32 && VT != MVT::f64))
----------------
I suspect we also want to use this vector sequence for unsigned i64 conversions on 64-bit.
https://reviews.llvm.org/D43441