[PATCH] D93132: [SVE][CodeGen] Vector + immediate addressing mode for masked gather/scatter
David Sherwood via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Mon Dec 14 01:20:03 PST 2020
david-arm added a comment.
Hi @kmclaughlin, I've got just one function left to review, but your patch looks good so far, with lots of thorough testing! I only had a few minor comments.
================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:3981
+ SDValue Ops[] = {Chain, Mask, BasePtr, Index, InputVT, PassThru};
+ if (ResNeedsSignExtend)
+ Opcode = getSignExtendedGatherOpcode(Opcode);
----------------
Could this be hoisted out to just below the definition of 'Opcode' on line 3976? That way you might also be able to avoid duplicating the code below on line 3986.
================
Comment at: llvm/test/CodeGen/AArch64/sve-masked-gather.ll:4
+
+define <vscale x 2 x i64> @masked_gather_nxv2i8(<vscale x 2 x i8*> %ptrs, i64 %offset, <vscale x 2 x i1> %mask) {
+; CHECK-LABEL: masked_gather_nxv2i8:
----------------
In this function (and others in the file) it doesn't look like %offset is actually used - can it be removed?
================
Comment at: llvm/test/CodeGen/AArch64/sve-masked-gather.ll:43
+
+define <vscale x 2 x half> @masked_gather_nxv2f16(<vscale x 2 x half*> %ptrs, i64 %offset, <vscale x 2 x i1> %mask) {
+; CHECK-LABEL: masked_gather_nxv2f16:
----------------
I don't know if it matters or not, but the scatter equivalents also have tests for bfloat - should we add the same here?
================
Comment at: llvm/test/CodeGen/AArch64/sve-masked-scatter-legalise.ll:15
+; Code generate store of an illegal datatype via promotion.
+define void @masked_scatter_nxv2i16(<vscale x 2 x i16> %data, i16* %base, <vscale x 2 x i16*> %ptrs, <vscale x 2 x i1> %mask) {
+; CHECK-LABEL: masked_scatter_nxv2i16:
----------------
Do we need to worry about stuff like masked_scatter_nxv2half or masked_scatter_nxv2float?
================
Comment at: llvm/test/CodeGen/AArch64/sve-masked-scatter-vec-plus-imm.ll:79
+; CHECK: // %bb.0:
+; CHECK-NEXT: mov x8, xzr
+; CHECK-NEXT: add z1.d, z1.d, #32 // =0x20
----------------
I think this looks fine, but I was expecting something like:
mov x8, #32
st1b { z0.d }, p0, [x8, z1.d]
Perhaps there is a technical reason why we add to the z1 register instead? Anyway, it's only a minor codegen issue and we can look at it later!
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D93132/new/
https://reviews.llvm.org/D93132