[PATCH] D146218: [AArch64][CodeGen] Lower (de)interleave2 intrinsics to ld2/st2

Tue Jun 13 10:30:18 PDT 2023

paulwalker-arm added inline comments.

================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:15214
+
+  Value *BaseAddr = Builder.CreateBitCast(LI->getPointerOperand(), PtrTy);
+  Value *Result;
----------------
I thought opaque pointers meant we didn't need to bitcast pointers anymore? I mainly mention it because I wasn't sure what type the following `CreateGEP` call returns and so figured we might either not need any bitcasting or we might need a little more :)

================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:15227
+        Offset = Builder.CreateVScale(cast<Constant>(Offset));
+      BaseAddr = Builder.CreateGEP(LdTy->getElementType(), BaseAddr, {Offset});
+      Value *LdN = nullptr;
----------------
Is this correct?  `VTy` is the result type for each of the two results from the deinterleave intrinsic. `LdTy` is this divided by the number of of calls to `ld2` that you'll need, which represents the result type for each of the two results from the `ld2` intrinsic.  However `ld2` reads `2 * sizeof(LdTy)` bytes. So I think `Offset` is half the amount it needs to be.

I wonder if you can simplify the addressing logic by just using `LdTy` directly.  That way the offset can just be `I * Factor` for both fixed and scalable vectors, thus no need for `vscale`.  What do you think?

I've not looked but I'm assuming the store function has the same problem.

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D146218/new/

https://reviews.llvm.org/D146218