[PATCH] D107057: [llvm][sve] Lowering for VLS extending loads
Eli Friedman via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Tue Aug 3 11:58:16 PDT 2021
efriedma added inline comments.
================
Comment at: llvm/test/CodeGen/AArch64/sve-fixed-length-ext-loads.ll:50-63
+
+ ; Ensure sensible type legalistaion
+ ; VBITS_EQ_256-DAG: ptrue [[PG:p[0-9]+]].h, vl16
+ ; VBITS_EQ_256-DAG: ld1h { [[Z0:z[0-9]+]].h }, [[PG]]/z, [x0]
+ ; VBITS_EQ_256-DAG: mov x9, sp
+ ; VBITS_EQ_256-DAG: st1h { [[Z0]].h }, [[PG]], [x9]
+ ; VBITS_EQ_256-DAG: ldp q[[R0:[0-9]+]], q[[R1:[0-9]+]], [sp]
----------------
bsmith wrote:
> The codegen in the type legalisation cases seems a bit odd, why is this not using SVE to do the extending load?
The fact that legalization goes through the stack is obviously just a missed optimization.
The way type legalization works, it will see that `<16 x i16>` is legal, so we do a `<16 x i16>` load. Then we have an extend of that load to an illegal type. This gets split into two parts: extract/extend the low half, then extract/extend the high half. If we optimized that correctly, it would come out to three instructions: `ld1h`, followed by `uunpklo`/`uunpkhi`.
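To make the split concrete, here is a rough Python model of the data flow described above. It is only a sketch: the function names are made up for illustration, the destination element type is assumed to be i32 with zero extension, and nothing here corresponds to actual LLVM code.

```python
import struct

# Hypothetical model of the legalization split; these names are
# illustrative and are not LLVM APIs.

def load_v16i16(mem):
    """The one legal load: 16 unsigned 16-bit lanes, little-endian."""
    return list(struct.unpack_from("<16H", mem))

def extract_and_zext_half(lanes, high):
    """Extract one 8-lane half and zero-extend each lane to 32 bits.

    This is the job uunpklo (high=False) or uunpkhi (high=True) would do.
    Zero extension leaves the value unchanged; only the lane width grows.
    """
    half = lanes[8:] if high else lanes[:8]
    return [lane & 0xFFFFFFFF for lane in half]

def split_extending_load(mem):
    """One load, then extract/extend the low half and the high half."""
    v = load_v16i16(mem)
    return extract_and_zext_half(v, False), extract_and_zext_half(v, True)
```

The result is two 8-lane values (assumed here to be `<8 x i32>` each), with no round trip through a stack slot.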
Whether that's the best approach probably depends on the target and the types involved. If extending vector loads are reasonably fast, maybe we just want to generate more of them.
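One reading of "generate more of them" is to issue one extending load per half rather than a single load plus unpacks. The Python sketch below models that reading; it is an interpretation, not something the patch does, and the names, the i32 destination type, and the byte offsets are assumptions.

```python
import struct

# Hypothetical model of the alternative: two extending loads,
# each widening 8 x i16 to 8 x i32 as it reads memory.

def extending_load_v8i16(mem, byte_offset):
    """Read 8 unsigned 16-bit lanes at byte_offset, zero-extended to 32 bits.

    Plays the role of an extending load such as ld1h into .s elements.
    """
    return list(struct.unpack_from("<8H", mem, byte_offset))

def two_extending_loads(mem):
    """Low half at offset 0, high half at offset 16 bytes (8 lanes * 2 bytes)."""
    return extending_load_v8i16(mem, 0), extending_load_v8i16(mem, 16)
```

Both strategies compute the same two halves; they differ only in whether the widening happens in the load or in a separate unpack, which is the trade-off that depends on the target.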
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D107057/new/
https://reviews.llvm.org/D107057