[PATCH] D68247: [X86] Add a DAG combine to shrink vXi64 gather/scatter indices that are constant with sufficient sign bits to fit in vXi32

Tue Oct 1 14:54:49 PDT 2019

RKSimon accepted this revision.
RKSimon added a comment.
This revision is now accepted and ready to land.

LGTM

================
Comment at: llvm/lib/Target/X86/X86ISelLowering.cpp:42384
+  if (DCI.isBeforeLegalize()) {
+    if (auto *BV = dyn_cast<BuildVectorSDNode>(Index)) {
+      unsigned IndexWidth = Index.getScalarValueSizeInBits();
----------------
craig.topper wrote:
> RKSimon wrote:
> > craig.topper wrote:
> > > RKSimon wrote:
> > > > Does it have to be build vector? General truncations should be pretty good in these cases whatever.
> > > No. I guess you could truncate anytime the index element is wider than the data element and the index type isn't legal and is a type that needs to be split or a type that will be widened to a type that will be split. You wouldn't want to truncate if it won't prevent a split since you'd just be adding more code. Unless it was sext/zext which would be removed, but that's already handled below. Is that what you were thinking?
> > I was thinking more in terms of the high number of sign bits means that we can always cheaply truncate with VTRUNC/PACKSS
> So should we always truncate or only in the cases I said? Pack is going to require at least another shuffle to split the data into 2 sources. And VTRUNC is 2 uops by itself. I think PACKSS goes from 1 cycle latency to 3 cycle latency on icelake.
OK - it becomes a cost issue. We could use isTruncateFree to check but at the moment we don't do anything for vectors. I'm happy if we just leave it as a TODO comment for now.

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68247/new/

https://reviews.llvm.org/D68247