[PATCH] D158487: [PowerPC][altivec] Optimize codegen of vec_promote
Nemanja Ivanovic via Phabricator via cfe-commits
cfe-commits at lists.llvm.org
Wed Aug 23 14:04:17 PDT 2023
nemanjai accepted this revision.
nemanjai added a comment.
This revision is now accepted and ready to land.
LGTM. This is a good idea, and we should go ahead with it for anyone who uses `vec_promote`. It might also be worth improving codegen for the insert itself, which may be the more common case.
================
Comment at: llvm/test/CodeGen/PowerPC/vec-promote.ll:43
+
+define noundef <4 x float> @vec_promote_float_zeroed(ptr nocapture noundef readonly %p) {
+; CHECK-BE-LABEL: vec_promote_float_zeroed:
----------------
This code is absolutely terrible. Not only is the `lfs` much slower than the `lfiwzx`/`lxsiwzx` that we actually want, but the two conversions and three permutes are slow as well.
I think changing `altivec.h` to produce better code for a case like this is a good thing, but I wonder whether the same pattern might come up in other contexts.
At least on Power9 and up, we can do much better than this. We don't do particularly well regardless of whether we're using a zero vector input or an arbitrary vector: https://godbolt.org/z/79fx8nsdP
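For reference, the two source patterns I have in mind look roughly like the sketch below. It uses the generic GCC/Clang vector extension rather than `altivec.h`, so it builds on any target. The function names, the choice of element 0 for the zeroed case, and the index masking are assumptions for illustration, not the actual test inputs from this patch.

```c
/* Generic 4 x float vector type (GCC/Clang vector extension). */
typedef float v4f __attribute__((vector_size(16)));

/* Hypothetical "zeroed" case: load a scalar and place it in element 0
   of an all-zero vector. */
static v4f promote_float_zeroed(const float *p) {
    v4f res = {0.0f, 0.0f, 0.0f, 0.0f};
    res[0] = *p;
    return res;
}

/* Hypothetical "arbitrary vector" case: load a scalar and insert it into
   an existing vector at a (masked) element index. */
static v4f insert_float(const float *p, v4f v, int idx) {
    v[idx & 3] = *p;
    return v;
}
```

Both functions are a scalar load followed by an element insert, which is the shape I would expect to show up outside of `vec_promote` as well.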
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D158487/new/
https://reviews.llvm.org/D158487