[PATCH] D140069: [DAGCombiner] Scalarize vectorized loads that are splatted
Dave Green via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Tue Dec 20 08:42:14 PST 2022
dmgreen added a comment.
Thanks. The remaining Arm and AArch64 test cases look OK to me.
================
Comment at: llvm/test/CodeGen/AArch64/arm64-vmul.ll:1227-1230
+; CHECK-NEXT: add x8, x1, #4
+; CHECK-NEXT: ldr d1, [x0]
+; CHECK-NEXT: ld1r.2s { v0 }, [x8]
+; CHECK-NEXT: fmulx.2s v0, v1, v0
----------------
luke wrote:
> @dmgreen This looks like a regression to me but I'm not familiar enough with aarch64 to really know for certain. I presume the cost of the additional add instruction outweighs any gains from a smaller load, is that correct?
>
> (Hope I'm not bombarding you with too many questions, let me know if there's someone else I can ask!)
Yeah, it's the same as the other cases. I have heard that it can be a little worse than when the dup can be folded into the mul/fmulx/etc, but that is a separate issue from this patch. In practice many cases will already be a splat of a scalar value, so they will already run into the same issue.
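For reference, a rough C++ model of what the patch title describes: a splat of one lane of a vector load becomes a scalar load at that lane's byte offset, followed by a splat. This is only a sketch, not the DAGCombiner code. The helper names are made up for illustration, and lane 1 of a 2 x float vector is an assumption inferred from the `#4` offset in the CHECK lines above.

```cpp
#include <array>
#include <cstddef>
#include <cstring>

// Hypothetical model of a 2 x float vector (the ".2s" arrangement).
using V2F = std::array<float, 2>;

// Byte offset of the splatted element from the vector's base address.
std::size_t splatByteOffset(std::size_t lane, std::size_t elemBytes) {
  return lane * elemBytes;
}

// Before: load the whole vector, then splat one lane of it.
V2F splatOfVectorLoad(const float *p, std::size_t lane) {
  V2F v;
  std::memcpy(v.data(), p, sizeof(v)); // full 8-byte vector load
  return {v[lane], v[lane]};           // splat of the chosen lane
}

// After: load only the splatted element, then splat it.
// Roughly the shape of "add x8, x1, #4" + "ld1r.2s { v0 }, [x8]".
V2F splatOfScalarLoad(const float *p, std::size_t lane) {
  const unsigned char *base = reinterpret_cast<const unsigned char *>(p);
  float s;
  std::memcpy(&s, base + splatByteOffset(lane, sizeof(float)), sizeof(s));
  return {s, s};
}
```

Both helpers return the same value for any in-range lane, and `splatByteOffset(1, sizeof(float))` is 4, which matches the immediate on the extra add. The add is needed whenever the lane is not lane 0.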
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D140069/new/
https://reviews.llvm.org/D140069