[PATCH] D58197: [x86] vectorize more cast ops in lowering to avoid register file transfers
Sanjay Patel via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Thu Feb 14 06:03:06 PST 2019
spatel marked an inline comment as done.
spatel added inline comments.
================
Comment at: llvm/test/CodeGen/X86/known-signbits-vector.ll:79
; X64-NEXT: shrq $32, %rax
; X64-NEXT: vcvtsi2ssl %eax, %xmm1, %xmm0
; X64-NEXT: retq
----------------
RKSimon wrote:
> spatel wrote:
> > RKSimon wrote:
> > > Any idea why this still fails?
> > This is almost the same example as in:
> > https://bugs.llvm.org/show_bug.cgi?id=39975
> >
> > On x86-64 only (because the 64-bit shift isn't legal on i686), we scalarize the shift. So that means we have the shift sitting between the extract and cast, so no match.
> Hmm - is it worth us investigating a trunc(lshr(extract(v2i64 x, i), 32)) -> trunc(extract(v2i64 x, i+1)) combine? (and variants)
Yeah, I thought we already had that transform, but this case isn't matched.
Note that this example is an ashr, not lshr, and the example in PR39975 is probably tougher because it's shift-by-33. We might want to carve out an exception for the scalarization transform for these kinds of cases in general.
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D58197/new/
https://reviews.llvm.org/D58197