[PATCH] D138203: [AArch64][InstCombine] Simplify repeated complex patterns in dupqlane
Sander de Smalen via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Wed Jan 4 03:21:41 PST 2023
sdesmalen added inline comments.
================
Comment at: llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp:1437
+static SmallVector<Value *> SimplifyValuePattern(SmallVector<Value *> Vec) {
+ std::size_t VecSize = Vec.size();
----------------
It's better to take and return an ArrayRef<Value*>, rather than a SmallVector object.
================
Comment at: llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp:1438
+static SmallVector<Value *> SimplifyValuePattern(SmallVector<Value *> Vec) {
+ std::size_t VecSize = Vec.size();
+ if (VecSize < 2 || !isPowerOf2_64(VecSize))
----------------
nit: you can drop the `std::` here.
================
Comment at: llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp:1443
+
+ SmallVector<Value *> Lhs(Vec.begin(), Vec.begin() + HalfVecSize);
+ SmallVector<Value *> Rhs(Vec.begin() + HalfVecSize, Vec.end());
----------------
if the code is not going to modify the content of the array, it's better to use ArrayRef to avoid creating a new object and copying (existing) data.
================
Comment at: llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp:1476-1477
+ dyn_cast<InsertElementInst>(CurrentInsertElt);
+ if (!InsertElt)
+ break;
+ auto Idx = cast<ConstantInt>(InsertElt->getOperand(2));
----------------
When the loop `break`s here, it will leave the other elements in InsertEltVec as `nullptr` from which you seem to infer `poison` or `undef` in SimplifyValuePattern.
That's not always correct, because the IR could be inserting into some other vector, e.g. `<4 x i32> %arg` where these values are defined (it would be good to have a test for this)
================
Comment at: llvm/test/Transforms/InstCombine/AArch64/sve-intrinsic-dupqlane.ll:86
+
+define dso_local <vscale x 8 x half> @dupq_f16_complex_missing_middle(half %a, half %b) {
+; CHECK-LABEL: @dupq_f16_complex_missing_middle(
----------------
MattDevereau wrote:
> sdesmalen wrote:
> > This case could still work right? If the two elements that are missing are both undef, they could be anything including `a, b`.
> The case can definitely work, however when re-implenting this patch as a recursive algorithm I ran into a few headaches when trying to integrate null pointers/poision values into the recursion. It is entirely possible to do this, however if there is minimal gain for the time required this I'd suggest it be done as a separate patch.
>
> Some cases I ran into issues with were how to handle cases such as:
> `<a, b, c, nullptr, a, b, c, d>`, where nullptr respresents poison elements. Logically you'd want to pick the right half as a pattern as it has no undefined values, but things start getting complicated with cases such as `<a, b, nullptr, nullptr, nullptr, nullptr, nullptr, d>`, and `<a, nullptr, a, nullptr, nullptr, b, nullptr, b>` It should be possible to simplify these, however I suspect it would be easier to write a separate algorithm from what I've done here to handle poison cases.
I wouldn't expect the undef/poison case to be that difficult to handle.
E.g. for `<a, b, _, _, _, _, _, d>` it could compare `<a, b, _, _>` with `<_, _, _, d>`, which would match the most specific values `<a, b, _, d>`.
For `<a, _, a, _, _, b, _, b>` it could compare `<a, _, a, _>` with `<_, b, _, b>` which would match `<a, b, a, b>`, which could then be further broken down to `<a, b>`.
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D138203/new/
https://reviews.llvm.org/D138203
More information about the llvm-commits
mailing list