[PATCH] D56864: [x86] vectorize cast ops in lowering to avoid register file transfers
Simon Pilgrim via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Fri Jan 25 00:21:19 PST 2019
RKSimon added inline comments.
================
Comment at: lib/Target/X86/X86ISelLowering.cpp:17412
+ // but if we extract from some other element, it will require shuffling to
+ // get the result into the right place.
+ SDValue Extract = Cast.getOperand(0);
----------------
These are less common, but could you add a TODO about smaller (extended) integer types?
================
Comment at: lib/Target/X86/X86ISelLowering.cpp:17415
+ MVT DestVT = Cast.getSimpleValueType();
+ if (!Extract.hasOneUse() || Extract.getOpcode() != ISD::EXTRACT_VECTOR_ELT ||
+ !isNullConstant(Extract.getOperand(1)))
----------------
Is the one-use check necessary? This combine should replace the scalar conversion with a vector conversion; whether the scalar has other uses isn't necessarily relevant (although it might cost extra instructions if we support shuffles in the future).
================
Comment at: test/CodeGen/X86/vec_int_to_fp.ll:5685-5686
; SSE2: # %bb.0:
; SSE2-NEXT: pshufd {{.*#+}} xmm0 = xmm0[3,1,2,3]
-; SSE2-NEXT: movd %xmm0, %eax
-; SSE2-NEXT: xorps %xmm0, %xmm0
-; SSE2-NEXT: cvtsi2ssl %eax, %xmm0
+; SSE2-NEXT: cvtdq2ps %xmm0, %xmm0
; SSE2-NEXT: retq
----------------
spatel wrote:
> Not sure yet how this case became a "shuffle first and extract from 0 element", but we probably want to do that more generally to enable this transform more often.
SSE2 only supports extractelement from index #0, so the shuffle gets added to move the element there.
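For illustration only, here is a minimal sketch of the two sequences in the SSE2 check lines above, written with SSE2 intrinsics rather than as code from the patch. The function names are hypothetical, and the mapping from intrinsics to instructions is what a compiler would typically emit, not something it guarantees. Both functions first move element 3 into element 0 with a pshufd-style shuffle; the first then goes through a general-purpose register (movd + cvtsi2ss), while the second stays in the vector register (cvtdq2ps) and simply reads lane 0 of the result.

```cpp
#include <emmintrin.h>  // SSE2 intrinsics (also pulls in the SSE header)
#include <cstdint>

// Hypothetical helper: the old sequence. Shuffle element 3 into element 0,
// move it to a general-purpose register, then convert it as a scalar.
inline float convertLane3ViaScalar(__m128i v) {
  // Lane selection [3,1,2,3], as in the check line: result lane 0 = source lane 3.
  __m128i shuffled = _mm_shuffle_epi32(v, _MM_SHUFFLE(3, 2, 1, 3));
  int32_t scalar = _mm_cvtsi128_si32(shuffled);           // typically movd
  __m128 converted = _mm_cvtsi32_ss(_mm_setzero_ps(), scalar);  // typically cvtsi2ss
  return _mm_cvtss_f32(converted);
}

// Hypothetical helper: the new sequence. Same shuffle, but convert all four
// lanes in the vector register and read lane 0 of the result.
inline float convertLane3ViaVector(__m128i v) {
  __m128i shuffled = _mm_shuffle_epi32(v, _MM_SHUFFLE(3, 2, 1, 3));
  __m128 converted = _mm_cvtepi32_ps(shuffled);           // typically cvtdq2ps
  return _mm_cvtss_f32(converted);
}
```

The extra lanes converted by the vector form are discarded, so for the values of lane 3 the two helpers should agree; the vector form just avoids the round trip through a general-purpose register.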
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D56864/new/
https://reviews.llvm.org/D56864