[PATCH] D18593: [PowerPC] Front end improvements for vec_splat

Eric Christopher via llvm-commits llvm-commits at lists.llvm.org
Mon Apr 4 11:05:16 PDT 2016


I'll get to the rest of it, but for this Kit's question about O2: It's
preferred to test such things in the back end.

On Mon, Apr 4, 2016 at 10:51 AM Kit Barton <kbarton at ca.ibm.com> wrote:

> kbarton requested changes to this revision.
> This revision now requires changes to proceed.
>
> ================
> Comment at: lib/Headers/altivec.h:7726-7736
> @@ -7725,7 +7725,13 @@
>  vec_splat(vector signed int __a, unsigned const int __b) {
> -  unsigned char b0 = (__b & 0x03) * 4;
> -  unsigned char b1 = b0 + 1, b2 = b0 + 2, b3 = b0 + 3;
> -  return vec_perm(__a, __a,
> -                  (vector unsigned char)(b0, b1, b2, b3, b0, b1, b2, b3,
> b0, b1,
> -                                         b2, b3, b0, b1, b2, b3));
> +  const unsigned __elem = __b & 0x3;
> +  switch(__elem) {
> +  case 0:
> +    return __builtin_shufflevector(__a, __a, 0, 0, 0, 0);
> +  case 1:
> +    return __builtin_shufflevector(__a, __a, 1, 1, 1, 1);
> +  case 2:
> +    return __builtin_shufflevector(__a, __a, 2, 2, 2, 2);
> +  case 3:
> +    return __builtin_shufflevector(__a, __a, 3, 3, 3, 3);
> +  }
>  }
> ----------------
> nemanjai wrote:
> > nemanjai wrote:
> > > amehsan wrote:
> > > > amehsan wrote:
> > > > > Why do we need a switch case here (and in the ones below) instead
> of just returning __builtin_shufflevector(__a, __a, __elem, __elem, __elem,
> __elem)?
> > > > Apparently phabricator has a problem with underlines....
> > > The arguments to __builtin_shufflevector must be compile-time
> constants. These vec_splat functions will typically be called with
> compile-time constants anyway so at anything other than -O0, the
> unnecessary branches will be eliminated. And if they're called with
> non-constant arguments for the element index, the sequence is probably
> still better than building a mask vector and using vperm.
> > >
> > > Basically, at -O2 this form when the argument is variable will have a
> shift, compare, branch and xxspltw. The previous form has 16 stores (of
> each byte), a vector load (which includes a swap on LE) and a vperm.
> > Yeah, when entering code in phabricator comments, you can do one of two
> things:
> > 1. Put them in a code block:
> > ```
> > __builtin_shufflevector(__a, __a, 3, 3, 3, 3);
> > ```
> > 2. Use the monospaced tag around the text: `__builtin_shufflevector(__a,
> __a, 3, 3, 3, 3);`
> Could we add a test case (or extend an existing test case) to ensure that
> these extra branches are cleaned up at -O2 and above?
> The branches will hinder instruction scheduling (among other things) so we
> should ensure they are cleaned up.
> I agree the extra branches at -O0 are not a concern.
>
>
> Repository:
>   rL LLVM
>
> http://reviews.llvm.org/D18593
>
>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20160404/3a65164b/attachment.html>


More information about the llvm-commits mailing list