[PATCH] D18593: [PowerPC] Front end improvements for vec_splat

Nemanja Ivanovic via llvm-commits llvm-commits at lists.llvm.org
Mon Apr 4 10:00:14 PDT 2016


nemanjai added a comment.

In http://reviews.llvm.org/D18593#391192, @amehsan wrote:

> @nemanjai I was just thinking of something else :)
>
> If we have a runtime value for the second parameter, do we know that this code is an improvement over the previous implementation? (Especially if the values of the runtime parameter occur with more or less equal frequency, which could result in a lot of mispredicted branches.)
>
> Something else to consider (I have not checked it yet): if we specialize the vec_perm implementation for vec_splat with runtime values, can we improve it? If so, how would that compare to the current implementation?


I don't see any cases in which the old implementation leads to better code than what this patch suggests. I'm focusing on LE here.
Old implementation (noopt - both const and non-const index): 16 stb's, a mess of both vector and scalar ops, vector loads and stores, etc.
Old implementation (-O1 - const index): vspltw (plus we lose the information about the shuffle in some cases - which prompted this patch)
Old implementation (-O1 - non-const index): vspltisb, some rlwimi's, 16 stb's, lxvd2x along with the requisite swap, xxlor and vperm (the bulk of the work is for building the mask)
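
To make the mask-building cost concrete, here is a portable scalar model of the shape implied by the instruction lists above: write 16 mask bytes at run time, then do a byte permute. It is only a sketch. The names vec4, perm_bytes and splat_via_mask are hypothetical and are not taken from altivec.h, bytes are numbered in memory order (so the LE swaps are not modelled), and reducing the index modulo 4 is an assumption.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a 4 x 32-bit vector; not an altivec.h type. */
typedef struct { uint32_t w[4]; } vec4;

/* Scalar model of a byte permute with both inputs equal to src.
   Each mask byte selects one source byte, numbered in memory order. */
static void perm_bytes(uint8_t out[16], const uint8_t src[16],
                       const uint8_t mask[16]) {
  for (int i = 0; i < 16; i++)
    out[i] = src[mask[i] & 0x0F];
}

/* Splat word idx by building a 16-byte mask at run time, then permuting. */
static vec4 splat_via_mask(vec4 a, unsigned idx) {
  uint8_t src[16], mask[16], out[16];
  uint8_t base = (uint8_t)((idx & 3u) * 4u); /* first byte of chosen word */
  vec4 r;

  memcpy(src, a.w, sizeof src);
  for (int i = 0; i < 16; i++)               /* models the 16 byte stores */
    mask[i] = (uint8_t)(base + (i & 3));
  perm_bytes(out, src, mask);
  memcpy(r.w, out, sizeof out);
  return r;
}
```

Every call pays for the mask construction, whether or not the index is known at compile time.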

This patch (noopt - both const and non-const index): cmplwi, bgt, mtctr, bctr, xxspltw (along with the necessary swaps of the vectors and storing the arguments)
This patch (-O1 - const index): xxspltw (and we always retain the information about this being a shufflevector since we are using that builtin)
This patch (-O1 - non-const index): rlwinm, then in the worst case 3 cmplwi and 3 beq/bne
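
The compare-and-branch sequences listed above are what one would expect from dispatching on the index to a splat with a compile-time-constant lane. A minimal scalar sketch of that shape follows. It assumes the index is reduced modulo 4 (suggested by the rlwinm, but an assumption), and the names vec4, splat_lane and splat_via_switch are hypothetical, not the code in the patch.

```c
#include <stdint.h>

/* Hypothetical stand-in for a 4 x 32-bit vector; not an altivec.h type. */
typedef struct { uint32_t w[4]; } vec4;

/* Splat one lane. In each call below the lane is a literal constant,
   which is what lets a real backend pick a single splat instruction. */
static vec4 splat_lane(vec4 a, int lane) {
  vec4 r = {{a.w[lane], a.w[lane], a.w[lane], a.w[lane]}};
  return r;
}

/* Dispatch on a run-time index; every arm uses a constant lane. */
static vec4 splat_via_switch(vec4 a, unsigned idx) {
  switch (idx & 3u) {
  case 0:  return splat_lane(a, 0);
  case 1:  return splat_lane(a, 1);
  case 2:  return splat_lane(a, 2);
  default: return splat_lane(a, 3);
  }
}
```

With a constant idx the switch folds away and only one arm remains. With a run-time idx the cost is the dispatch plus one splat, and no mask has to be built.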

Although the noopt code emitted even with this patch has some SPR operations that are likely expensive, I don't think it is ever worse than 16 stores and a load (along with other instructions necessary to create a mask vector). And I am not sure how much merit there is in discussing performance characteristics of code generated at noopt.


Repository:
  rL LLVM

http://reviews.llvm.org/D18593




