[PATCH] D23797: [X86][SSE] Improve awareness of (v)cvtpd2ps implicit zeroing of upper 64-bits of xmm result
Michael Kuperstein via llvm-commits
llvm-commits at lists.llvm.org
Mon Aug 29 14:47:30 PDT 2016
mkuper accepted this revision.
mkuper added a comment.
This revision is now accepted and ready to land.
LGTM.
================
Comment at: lib/Target/X86/X86InstrSSE.td:2285
@@ -2284,1 +2284,3 @@
// Match fpround and fpextend for 128/256-bit conversions
+ def : Pat<(v4f32 (bitconvert (X86vzmovl (v2f64 (bitconvert
+ (v4f32 (X86vfpround (v2f64 VR128:$src)))))))),
----------------
RKSimon wrote:
> RKSimon wrote:
> > mkuper wrote:
> > > I've only now realized we represent zeroing the two high lanes of a v4f32 with (v4f32 (bitconvert (X86vzmovl (v2f64 (bitconvert (v4f32 ...)))))) :-\
> > > But there's nothing we can really do about this, right?
> > >
> > >
> > Not much - we use VZEXT_MOVL to zero all but the first vector element.
> >
> > An alternative would be to have VZEXT32_MOVL and VZEXT64_MOVL (or something similar) - it would affect a lot of existing lowering patterns and I'm it sure its worth it. We have a number of similar bitcasting pattern situation.
>
> Sorry that should say "and I'm not sure its worth it."
>
Even if we had VZEXT64_MOVL, it wouldn't help (at least, not with what bothers me); we'd still have the ugly casting back and forth.
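
For readers following the thread, the casting pattern discussed above can be modelled in scalar Python. This is only an illustrative sketch, not LLVM code: the helper names (bitcast_v4f32_to_v2x64, bitcast_v2x64_to_v4f32, vzmovl, zero_upper_v4f32, cvtpd2ps_model) are hypothetical, vectors are modelled as Python lists, and 64-bit lanes are kept as raw integers so the bitcasts stay bit-exact. It shows why zeroing the two high f32 lanes is written as a round trip through a two-element 64-bit vector, and why that wrapper is redundant when the source already zeroes bits 127:64.

```python
import struct


def bitcast_v4f32_to_v2x64(v):
    """Reinterpret four f32 lanes as two raw 64-bit lanes (no value conversion)."""
    return list(struct.unpack("<2Q", struct.pack("<4f", *v)))


def bitcast_v2x64_to_v4f32(lanes):
    """Reinterpret two raw 64-bit lanes as four f32 lanes."""
    return list(struct.unpack("<4f", struct.pack("<2Q", *lanes)))


def vzmovl(lanes):
    """Model of VZEXT_MOVL: keep element 0, zero every other element."""
    return [lanes[0]] + [0] * (len(lanes) - 1)


def zero_upper_v4f32(v):
    """Model of (v4f32 (bitconvert (X86vzmovl (v2f64 (bitconvert v)))))."""
    return bitcast_v2x64_to_v4f32(vzmovl(bitcast_v4f32_to_v2x64(v)))


def cvtpd2ps_model(src):
    """Model of (v)cvtpd2ps on xmm: round two doubles to f32 in lanes 0-1
    and zero the upper 64 bits (lanes 2-3) of the result."""
    lo = [struct.unpack("<f", struct.pack("<f", x))[0] for x in src]
    return lo + [0.0, 0.0]


# Zeroing "all but the first element" of the 64-bit view clears f32 lanes 2-3.
cleared = zero_upper_v4f32([1.0, 2.0, 3.0, 4.0])

# The conversion result already has zero upper lanes, so the wrapper is a no-op.
rounded = cvtpd2ps_model([1.5, 0.1])
wrapper_is_noop = zero_upper_v4f32(rounded) == rounded
```

In this model the wrapper changes nothing when applied to the conversion result, which is the redundancy the new pattern lets instruction selection recognise.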
Repository:
rL LLVM
https://reviews.llvm.org/D23797