[llvm-dev] Rotates, once again

Mon Jul 2 12:50:38 PDT 2018

> On Jul 2, 2018, at 3:41 PM, Krzysztof Parzyszek via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> 
> On 7/2/2018 11:27 AM, Sanjay Patel via llvm-dev wrote:
>> Let's settle on the intrinsic definition(s).
>> 1. There was a suggestion to generalize rotate to a "valign" or "double shift" (that's what x86 means with its poorly worded "double precision shift"). How does that work with vector types? The options are a full vector-size shift or a per-element shift. If it's the full vector, do we still want/need a specialized rotate intrinsic for per-element? If it's per-element, do we still want/need the other form for a full vector?
> 
> The scalar rotate moves bits, and such an operation doesn't make much sense for moving data across lanes in vectors. I voted for the valign variant originally, but I think that a per-element rotate would be the natural vector version of the scalar operation.
> 
> It could still do the "double shift", since that is more general; it just shouldn't be called valign. A per-byte cross-lane vector rotate (an actual vector valign) can be implemented using shuffle, so I don’t think that an intrinsic for that is necessary.

Agreed. The per-element definition is the correct one.
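
To make the per-element reading concrete, here is a rough C++ sketch of what a per-element "double shift" could look like, with rotate as the special case where both inputs are the same. The names (fshl32, V4, fshl_v4, rotl_v4, lane_rotate) are made up for illustration, and taking the shift amount modulo the element width is an assumption, not something settled in this thread. The last function shows the cross-lane case as a plain permutation, here on whole elements rather than bytes.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical scalar "double shift" (funnel shift left): conceptually
// concatenate hi:lo into 64 bits, shift left, and keep the high 32 bits.
// Assumption: the shift amount is taken modulo the element width.
inline std::uint32_t fshl32(std::uint32_t hi, std::uint32_t lo, std::uint32_t amt) {
    amt &= 31;
    if (amt == 0) return hi;  // avoids the undefined shift by 32 below
    return (hi << amt) | (lo >> (32 - amt));
}

using V4 = std::array<std::uint32_t, 4>;  // stand-in for a <4 x i32> vector

// Per-element vector form: each lane is shifted on its own; no bits cross lanes.
inline V4 fshl_v4(const V4& hi, const V4& lo, const V4& amt) {
    V4 r{};
    for (std::size_t i = 0; i < r.size(); ++i) r[i] = fshl32(hi[i], lo[i], amt[i]);
    return r;
}

// Rotate left is the double shift with both inputs equal.
inline V4 rotl_v4(const V4& x, const V4& amt) { return fshl_v4(x, x, amt); }

// Cross-lane rotate by n whole elements: a lane permutation, i.e. a shuffle.
inline V4 lane_rotate(const V4& x, std::size_t n) {
    V4 r{};
    for (std::size_t i = 0; i < r.size(); ++i) r[i] = x[(i + n) % r.size()];
    return r;
}

// Small self-check showing the intended behaviour.
inline void self_check() {
    assert(fshl32(0x12345678u, 0x9ABCDEF0u, 8) == 0x3456789Au);
    assert((rotl_v4(V4{1u, 0x80000000u, 0xF0000000u, 0x12345678u},
                    V4{0u, 1u, 4u, 32u}) ==
            V4{1u, 1u, 0x0000000Fu, 0x12345678u}));
    assert((lane_rotate(V4{1u, 2u, 3u, 4u}, 1) == V4{2u, 3u, 4u, 1u}));
}
```

Only the first two vector functions would need an intrinsic under this reading; lane_rotate is expressible as an ordinary shuffle.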

– Steve