[PATCH] D9804: Optimize scattered vector insert/extract pattern

Thu Sep 24 02:15:12 PDT 2015

hfinkel added a subscriber: hfinkel.
hfinkel added a comment.

At a high level, this transformation seems overly restrictive, and will need cost-modeling work. A couple of thoughts:

1. I don't see why you're restricting this to extracts used by stores (or inserts fed by loads); if the goal is to save on [zs]ext instructions, then this seems profitable regardless of how these are used. Moreover, I don't understand why there's a hasOneUse() check on the [zs]ext instructions.
2. The [zs]ext instructions that you're trying to eliminate might be free, at least in combination with the extract or insert, rendering this a bad idea. Consider the (unfortunately common) case where the target does not actually support a vector extract at all, and so it is lowered by storing the vector on the stack and then doing a scalar load of the requested element. In this case, if the target supports the corresponding scalar extending load, the extension is free. Likewise, for those [zs]ext fed by loads, these might be free if the target supports the corresponding extending load. Worse, the vector [zs]ext you're forming might not be legal at all (this is the most-serious potential problem).

================
Comment at: lib/Transforms/Vectorize/SLPVectorizer.cpp:4215
@@ +4214,3 @@
+  // TODO: There can be multiple interleaving chains
+  //       embeeded in the candidates.
+  //       We need an extra step to break them up.
----------------
embeeded -> embedded

================
Comment at: lib/Transforms/Vectorize/SLPVectorizer.cpp:4223
@@ +4222,3 @@
+
+  if (Size != NumElements)
+    return false;
----------------
Why? This does not seem necessary. It seems as though this could be profitable for any Size >= 2*(number of underlying vector ext instructions).

Repository:
  rL LLVM

http://reviews.llvm.org/D9804