[PATCH] D58197: [x86] vectorize more cast ops in lowering to avoid register file transfers
Sanjay Patel via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Fri Feb 22 07:55:23 PST 2019
spatel marked an inline comment as done.
spatel added inline comments.
================
Comment at: llvm/trunk/test/CodeGen/X86/vec_int_to_fp.ll:5874
+; AVX512VLDQ-NEXT: vpermilps {{.*#+}} xmm0 = xmm0[3,1,2,3]
+; AVX512VLDQ-NEXT: vcvtudq2pd %xmm0, %ymm0
+; AVX512VLDQ-NEXT: # kill: def $xmm0 killed $xmm0 killed $ymm0
----------------
spatel wrote:
> efriedma wrote:
> > Why not "vcvtudq2pd %xmm0, %xmm0"?
> We're only matching the generic UINT_TO_FP node, so we go from <4 x i32> to <4 x double>. That's also why the SSE targets don't get the similar SINT_TO_FP test just above this one. I can look into how the SINT_TO_FP example gets narrowed and try to make that happen here too.
>
> There's no documentation saying that the generic nodes can change the number of elements in the vector, so I'm assuming they don't have that ability. Currently, we use X86ISD::CVTSI2P for those patterns, so I think we need to extend the matching logic to handle that case in order to solve this more completely.
rL354675
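For anyone following along, here is a minimal IR sketch of the kind of pattern being discussed (illustrative only; the function name and shuffle mask are made up and this is not the exact test in vec_int_to_fp.ll). The uitofp source elements come out of a wider vector, and because the generic UINT_TO_FP lowering keeps the element count, the checked asm above ends up with a <4 x i32> -> <4 x double> (xmm -> ymm) conversion whose result is only ever used as an xmm:

define <2 x double> @uitofp_shuffled_elts(<4 x i32> %x) {
  ; hypothetical example: pull two lanes out of the wider source vector
  %elts = shufflevector <4 x i32> %x, <4 x i32> undef, <2 x i32> <i32 3, i32 1>
  ; the generic UINT_TO_FP keeps the element count, so on AVX512VLDQ this
  ; lowers to vcvtudq2pd producing a ymm, of which only the low xmm is used
  %cvt = uitofp <2 x i32> %elts to <2 x double>
  ret <2 x double> %cvt
}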
Repository:
rL LLVM
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D58197/new/
https://reviews.llvm.org/D58197