[PATCH] D107057: [llvm][sve] Lowering for VLS extending loads

Eli Friedman via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Aug 3 11:58:16 PDT 2021


efriedma added inline comments.


================
Comment at: llvm/test/CodeGen/AArch64/sve-fixed-length-ext-loads.ll:50-63
+
+  ; Ensure sensible type legalisation
+  ; VBITS_EQ_256-DAG: ptrue [[PG:p[0-9]+]].h, vl16
+  ; VBITS_EQ_256-DAG: ld1h { [[Z0:z[0-9]+]].h }, [[PG]]/z, [x0]
+  ; VBITS_EQ_256-DAG: mov x9, sp
+  ; VBITS_EQ_256-DAG: st1h { [[Z0]].h }, [[PG]], [x9]
+  ; VBITS_EQ_256-DAG: ldp q[[R0:[0-9]+]], q[[R1:[0-9]+]], [sp]
----------------
bsmith wrote:
> The codegen in the type legalisation cases seems a bit odd, why is this not using SVE to do the extending load?
The fact that legalization goes through the stack is obviously just a missed optimization.

The way type legalization works, it sees that `<16 x i16>` is legal, so we do a `<16 x i16>` load.  Then we have an extend of that load to an illegal type.  This gets split into two parts: extract/extend the low half, then extract/extend the high half.  If we optimized that correctly, it would come out to three instructions: ld1h, followed by uunpklo/uunpkhi.
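
To make the split concrete, here is a rough scalar model of the three-instruction sequence for the 256-bit case, assuming a zero-extend from `<16 x i16>` to `<16 x i32>`.  The Python helper names are hypothetical and only mimic the instruction semantics on lists; this is not what the backend emits, just a sketch of the data flow.

```python
# Scalar model of "ld1h, then uunpklo/uunpkhi" with 256-bit SVE registers.
# All helpers are illustrative stand-ins, not LLVM or ACLE APIs.

def ld1h(memory, lanes=16):
    """Model the legal <16 x i16> load: read 16 halfword bit patterns."""
    return [value & 0xFFFF for value in memory[:lanes]]

def uunpklo(vec):
    """Zero-extend the low half of the elements to twice their width."""
    return [value & 0xFFFF for value in vec[: len(vec) // 2]]

def uunpkhi(vec):
    """Zero-extend the high half of the elements to twice their width."""
    return [value & 0xFFFF for value in vec[len(vec) // 2 :]]

def zext_load_split(memory):
    """One full-width load, then one unpack per legal <8 x i32> result."""
    z0 = ld1h(memory)
    return uunpklo(z0), uunpkhi(z0)

# -1 stands for the i16 bit pattern 0xFFFF, so zero-extension is visible.
memory = [-1, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, -2]
lo, hi = zext_load_split(memory)
```

The illegal `<16 x i32>` result never exists as one value here: it only appears as the two `<8 x i32>` halves, which is what the split in type legalization produces.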

Whether that's the best approach probably depends on the target and the types involved.  If extending vector loads are reasonably fast, maybe we just want to generate more of them.
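
For comparison, the alternative of generating more extending loads can be sketched the same way, assuming two `ld1h` loads into 32-bit elements, each covering eight halfwords.  Again, the helper name and the list-based memory are hypothetical; the sketch only shows that both halves come straight from memory with no unpack step.

```python
# Scalar model of two extending loads instead of load + unpack.
# The helper is an illustrative stand-in, not an LLVM or ACLE API.

def ld1h_to_s(memory, base, lanes=8):
    """Model an extending load: read `lanes` halfwords starting at element
    index `base` and zero-extend each one into a 32-bit lane."""
    return [memory[base + i] & 0xFFFF for i in range(lanes)]

def zext_load_two_loads(memory):
    """Produce both legal <8 x i32> halves directly from memory."""
    lo = ld1h_to_s(memory, 0)
    hi = ld1h_to_s(memory, 8)
    return lo, hi

# -1 stands for the i16 bit pattern 0xFFFF, so zero-extension is visible.
memory = [-1, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, -2]
lo, hi = zext_load_two_loads(memory)
```

This trades the unpacks for a second memory access, which is why the better choice likely depends on how fast extending loads are on the target.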


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D107057/new/

https://reviews.llvm.org/D107057


