[PATCH] D104471: [llvm][sve] Lowering for VLS truncating stores

Mon Jun 28 06:29:30 PDT 2021

DavidTruby added a comment.

In D104471#2825191 <https://reviews.llvm.org/D104471#2825191>, @efriedma wrote:

> This patch is making way too many seemingly unrelated code changes without much explanation.  Is all of this really necessary for the initial patch?

Hi, thanks for the review! I think maybe I didn't explain that well in the summary why the extra changes are needed. I've hopefully added better explanations below.

================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:1225
+      for (MVT InnerVT : MVT::integer_fixedlen_vector_valuetypes()) {
+        setTruncStoreAction(VT, InnerVT, Custom);
+      }
----------------
efriedma wrote:
> Can we change this to make sure we only mark stores that we can actually lower "custom"?  (i.e. the value type is a legal integer vector, and the memory type has a legal element type.)  That should make the rest of the patch simpler, I think.
I'm not sure what doing this would allow us to remove for the rest of the patch, could you point me at it?

================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:15005

+static SDValue foldTruncStoreOfExt(SelectionDAG &DAG, SDNode *N) {
+  auto OpCode = N->getOpcode();
----------------
This unrelated-appearing addition is necessary to prevent a regression in some code generation we already had.
In essence, sometimes we were generating an instruction, extending the result, truncating that and storing; we were relying on a later fold of an extend and truncate but since the fold into the truncating store happens first we were left with an extra extend then a truncating store.
This function removes that extra extend that we were generating in those cases.

I realise it's somewhat unrelated appearing to the rest of the patch but I didn't want to introduce a regression and change all the tests and then fix the regression in a separate patch and change the tests back.

================
Comment at: llvm/test/CodeGen/AArch64/sve-fixed-length-trunc-stores.ll:183
+; VBITS_EQ_256-DAG: uzp1 [[Z1]].h, [[Z1]].h, [[Z1]].h
+; VBITS_EQ_256-DAG: splice [[Z1]].h, [[PG]], [[Z1]].h, [[Z0]].h
+; VBITS_EQ_256-DAG: ptrue [[PG]].h, vl16
----------------
efriedma wrote:
> Can we "uzp1 z0, z0, z1" or something like that in store_trunc_v16i32i16?
The code generation of these legalized types is poor and needs fixing in a separate patch, as these should really be using truncating stores as well. These tests just check that legalised code is //correct// rather than good at the moment.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D104471/new/

https://reviews.llvm.org/D104471