[PATCH] D138203: [AArch64][InstCombine] Simplify repeated complex patterns in dupqlane

Sander de Smalen via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Jan 4 03:21:41 PST 2023


sdesmalen added inline comments.


================
Comment at: llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp:1437
 
+static SmallVector<Value *> SimplifyValuePattern(SmallVector<Value *> Vec) {
+  std::size_t VecSize = Vec.size();
----------------
It's better to take and return an ArrayRef<Value*>, rather than a SmallVector object.


================
Comment at: llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp:1438
+static SmallVector<Value *> SimplifyValuePattern(SmallVector<Value *> Vec) {
+  std::size_t VecSize = Vec.size();
+  if (VecSize < 2 || !isPowerOf2_64(VecSize))
----------------
nit: you can drop the `std::` here.


================
Comment at: llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp:1443
+
+  SmallVector<Value *> Lhs(Vec.begin(), Vec.begin() + HalfVecSize);
+  SmallVector<Value *> Rhs(Vec.begin() + HalfVecSize, Vec.end());
----------------
if the code is not going to modify the content of the array, it's better to use ArrayRef to avoid creating a new object and copying (existing) data.


================
Comment at: llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp:1476-1477
+        dyn_cast<InsertElementInst>(CurrentInsertElt);
+    if (!InsertElt)
+      break;
+    auto Idx = cast<ConstantInt>(InsertElt->getOperand(2));
----------------
When the loop `break`s here, it will leave the other elements in InsertEltVec as `nullptr` from which you seem to infer `poison` or `undef` in SimplifyValuePattern.
That's not always correct, because the IR could be inserting into some other vector, e.g. `<4 x i32> %arg` where these values are defined (it would be good to have a test for this)


================
Comment at: llvm/test/Transforms/InstCombine/AArch64/sve-intrinsic-dupqlane.ll:86
+
+define dso_local <vscale x 8 x half> @dupq_f16_complex_missing_middle(half %a, half %b) {
+; CHECK-LABEL: @dupq_f16_complex_missing_middle(
----------------
MattDevereau wrote:
> sdesmalen wrote:
> > This case could still work right? If the two elements that are missing are both undef, they could be anything including `a, b`.
> The case can definitely work, however when re-implenting this patch as a recursive algorithm I ran into a few headaches when trying to integrate null pointers/poision values into the recursion. It is entirely possible to do this, however if there is minimal gain for the time required this I'd suggest it be done as a separate patch.
> 
> Some cases I ran into issues with were how to handle cases such as:
> `<a, b, c, nullptr, a, b, c, d>`, where nullptr respresents poison elements.  Logically you'd want to pick the right half as a pattern as it has no undefined values, but things start getting complicated with cases such as `<a, b, nullptr, nullptr, nullptr, nullptr, nullptr, d>`, and `<a, nullptr, a, nullptr, nullptr, b, nullptr, b>` It should be possible to simplify these, however I suspect it would be easier to write a separate algorithm from what I've done here to handle poison cases. 
I wouldn't expect the undef/poison case to be that difficult to handle. 

E.g. for `<a, b, _, _, _, _, _, d>` it could compare `<a, b, _, _>` with `<_, _, _, d>`, which would match the most specific values `<a, b, _, d>`.
For `<a, _, a, _, _, b, _, b>` it could compare `<a, _, a, _>` with `<_, b, _, b>` which would match `<a, b, a, b>`, which could then be further broken down to `<a, b>`.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D138203/new/

https://reviews.llvm.org/D138203



More information about the llvm-commits mailing list