[PATCH] D56864: [x86] vectorize cast ops in lowering to avoid register file transfers

Simon Pilgrim via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Fri Jan 25 00:21:19 PST 2019


RKSimon added inline comments.


================
Comment at: lib/Target/X86/X86ISelLowering.cpp:17412
+  // but if we extract from some other element, it will require shuffling to
+  // get the result into the right place.
+  SDValue Extract = Cast.getOperand(0);
----------------
They're less common, but could you add a TODO about handling smaller (extended) integer types?


================
Comment at: lib/Target/X86/X86ISelLowering.cpp:17415
+  MVT DestVT = Cast.getSimpleValueType();
+  if (!Extract.hasOneUse() || Extract.getOpcode() != ISD::EXTRACT_VECTOR_ELT ||
+      !isNullConstant(Extract.getOperand(1)))
----------------
Is the one-use check necessary? This combine replaces the scalar conversion with a vector conversion, so whether the extracted scalar has other uses isn't necessarily relevant (although it may mean extra instructions if we support shuffles in the future).


================
Comment at: test/CodeGen/X86/vec_int_to_fp.ll:5685-5686
 ; SSE2:       # %bb.0:
 ; SSE2-NEXT:    pshufd {{.*#+}} xmm0 = xmm0[3,1,2,3]
-; SSE2-NEXT:    movd %xmm0, %eax
-; SSE2-NEXT:    xorps %xmm0, %xmm0
-; SSE2-NEXT:    cvtsi2ssl %eax, %xmm0
+; SSE2-NEXT:    cvtdq2ps %xmm0, %xmm0
 ; SSE2-NEXT:    retq
----------------
spatel wrote:
> Not sure yet how this case became a "shuffle first and extract from 0 element", but we probably want to do that more generally to enable this transform more often.
SSE2 only supports extractelement from index #0, so the shuffle gets added to move the element there.
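
For illustration only, here is a rough sketch of the two code shapes in the test diff above, written with SSE2 intrinsics rather than taken from the patch. The function names are hypothetical, and the instruction named in each comment is what these intrinsics typically map to, not something the compiler is guaranteed to emit.

```cpp
#include <emmintrin.h> // SSE2 intrinsics (pulls in the SSE ones as well)

// Scalar path: move lane 0 into a general-purpose register (movd),
// then convert it back in a vector register (cvtsi2ss).
// Hypothetical helper name, for illustration only.
float convertLane0Scalar(__m128i V) {
  int Elt = _mm_cvtsi128_si32(V);
  return _mm_cvtss_f32(_mm_cvtsi32_ss(_mm_setzero_ps(), Elt));
}

// Vector path: convert all four i32 lanes at once (cvtdq2ps) and read
// lane 0, so the value never leaves the vector register file.
float convertLane0Vector(__m128i V) {
  return _mm_cvtss_f32(_mm_cvtepi32_ps(V));
}

// Converting lane 3: shuffle it into lane 0 first (pshufd with the
// [3,1,2,3] mask seen in the test), then reuse the lane-0 vector path.
float convertLane3Vector(__m128i V) {
  __m128i Shuf = _mm_shuffle_epi32(V, _MM_SHUFFLE(3, 2, 1, 3));
  return convertLane0Vector(Shuf);
}
```

Both lane-0 helpers should return the same float for the same input; the vector form just avoids the round trip through a GPR. The lanes other than 0 are converted as well and simply ignored.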


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D56864/new/

https://reviews.llvm.org/D56864




