[PATCH] D78486: [SystemZ] Expand vector zero-extend into a shuffle.
Ulrich Weigand via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Thu Apr 30 11:47:19 PDT 2020
uweigand added a comment.
In D78486#2012867 <https://reviews.llvm.org/D78486#2012867>, @jonpa wrote:
> It certainly seems to be an improvement on two benchmarks to do just either one unpack or else a vperm (and not multiple unpacks). In fact, i505.mcf actually regressed (2.5%) when doing two unpacks instead of a vperm (with -ffp-contract=fast). So the initial idea of reducing the number of vperms seems to have been proven wrong - it is better to have a single vperm on the critical path rather than multiple unpacks.
Huh, interesting. That's good to know, and certainly makes the code simpler as well.
I'm wondering: unless I'm missing something, there's still one specific case where you generate a vperm followed by an unpack (the case where you already had a permute as source). Wouldn't it be preferable to just use a single vperm there as well?
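To illustrate why a single vperm should suffice in that case, here is a minimal Python sketch. It is not code from the patch. It models a vperm as a plain byte select over the concatenation of two 16-byte operands, big-endian as on SystemZ, and all function names are hypothetical. The idea is that a shuffle followed by a zero-extend can be folded into one permute mask whose padding bytes come from an all-zero second operand.

```python
ZERO = [0] * 16  # all-zero vector used as the second permute operand


def vperm(a, b, mask):
    """Simplified model: each mask byte selects a byte from a || b (32 bytes)."""
    src = list(a) + list(b)
    return [src[m] for m in mask]


def zext_as_vperm_mask(elts, from_bytes, to_bytes):
    """Build one mask that picks source elements `elts` and zero-extends them.

    Hypothetical helper: `elts` are the element indices chosen by the earlier
    shuffle; index 16 refers to the first byte of the zero operand.
    """
    mask = []
    for e in elts:
        mask += [16] * (to_bytes - from_bytes)  # high bytes: zeros
        mask += [e * from_bytes + k for k in range(from_bytes)]  # low bytes
    assert len(mask) == 16, "result must fill exactly one 16-byte vector"
    return mask


def shuffle_then_zext(src, elts, from_bytes, to_bytes):
    """Reference result of the two-step form: shuffle first, then zero-extend."""
    out = []
    for e in elts:
        out += [0] * (to_bytes - from_bytes)
        out += src[e * from_bytes:(e + 1) * from_bytes]
    return out


src = list(range(1, 17))          # bytes 1..16
elts = [7, 5, 3, 1, 6, 4, 2, 0]   # an arbitrary earlier shuffle of i8 elements
mask = zext_as_vperm_mask(elts, from_bytes=1, to_bytes=2)
combined = vperm(src, ZERO, mask)
two_step = shuffle_then_zext(src, elts, 1, 2)
```

In this model `combined` and `two_step` are identical, so the permute-then-unpack pair collapses into one permute. Whether that is also a win on real hardware is what the benchmark numbers would have to show.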
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D78486/new/
https://reviews.llvm.org/D78486