[LLVMdev] [llvm-commits] rotate

Arnaud A. de Grandmaison arnaud.allarddegrandmaison at parrot.com
Tue Jul 31 10:22:29 PDT 2012

On 07/31/2012 06:17 PM, Eli Friedman wrote:
> On Tue, Jul 31, 2012 at 8:42 AM, Cameron McInally
> <cameron.mcinally at nyu.edu> wrote:
>> Andy,
>> Here is the left circular shift operator patch. I apologize to the reviewer
>> in advance. The patch has a good bit of fine detail. Any
>> comments/criticisms?
>> Some caveats...
>> 1) This is just the bare minimum needed to make the left circular shift
>> operator work (e.g. no instruction combining).
>> 2) I tried my best to select operator names in the existing style; please
>> feel free to change them as appropriate.
> We intentionally haven't included a rotate instruction in LLVM in the
> past; the justification is that it's generally straightforward for the
> backend to form rotate operations, and making the optimizer
> effectively handle the new rotation instruction adds a substantial
> amount of complexity.  You're going to need to make a strong argument
> that the current approach is insufficient if you want to commit a
> patch like this.

I believe something is currently broken with respect to forming rotate
instructions.

For example, using a recent clang/llvm on linux/x86_64:

uint32_t ror32(uint32_t input, size_t rot_bits) {
  return (input >> rot_bits) | (input << ((sizeof(input) << 3) - rot_bits));
}

uint32_t rol32(uint32_t input, size_t rot_bits) {
  return (input << rot_bits) | (input >> ((sizeof(input) << 3) - rot_bits));
}

gives the expected ror and rol instructions, but their 16-bit
counterparts:

uint16_t ror16(uint16_t input, size_t rot_bits) {
  return (input >> rot_bits) | (input << ((sizeof(input) << 3) - rot_bits));
}

uint16_t rol16(uint16_t input, size_t rot_bits) {
  return (input << rot_bits) | (input >> ((sizeof(input) << 3) - rot_bits));
}

fail miserably; no rotate instruction is emitted, only a shift/or sequence:

        .globl  ror16
        .align  16, 0x90
        .type   ror16,@function
ror16:                                  # @ror16
# BB#0:                                 # %entry
        movb    %sil, %cl
        movl    %edi, %eax
        shrl    %cl, %eax
        movl    $16, %ecx
        subl    %esi, %ecx
                                        # kill: CL<def> CL<kill> ECX<kill>
        shll    %cl, %edi
        orl     %eax, %edi
        movzwl  %di, %eax
        .size   ror16, .Ltmp2-ror16

        .globl  rol16
        .align  16, 0x90
        .type   rol16,@function
rol16:                                  # @rol16
# BB#0:                                 # %entry
        movb    %sil, %cl
        movl    %edi, %eax
        shll    %cl, %eax
        movl    $16, %ecx
        subl    %esi, %ecx
                                        # kill: CL<def> CL<kill> ECX<kill>
        shrl    %cl, %edi
        orl     %eax, %edi
        movzwl  %di, %eax
        .size   rol16, .Ltmp3-rol16

At a quick first glance, this seems to be related to the values being
promoted from i16 to i32 in the IR optimization passes, but this may not
be the only reason.
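
To illustrate the promotion mentioned above, here is a minimal sketch of
what the C semantics require for the 16-bit case. It is not an analysis of
what LLVM's rotate matching actually sees. The helper names
rol16_wide_intermediate and rol16_explicit are hypothetical and only serve
the illustration; the sketch assumes a 32-bit int (as on x86_64) and
rotation counts in the range 0..15.

```c
#include <stddef.h>
#include <stdint.h>

/* Same expression as rol16 in the message. The uint16_t operand is
   promoted to int before each shift, so the shifts and the OR are
   performed on 32-bit values. */
uint16_t rol16(uint16_t input, size_t rot_bits) {
  return (input << rot_bits) | (input >> ((sizeof(input) << 3) - rot_bits));
}

/* Hypothetical helper: returns the 32-bit value of the OR before it is
   truncated to 16 bits, to show that bits 16 and above can be set. */
uint32_t rol16_wide_intermediate(uint16_t input, size_t rot_bits) {
  uint32_t wide = input;                     /* zero-extend 16 -> 32 bits */
  return (wide << rot_bits) | (wide >> (16 - rot_bits));
}

/* Hypothetical helper: the same computation as rol16 with the widening
   and the final truncation written out explicitly. */
uint16_t rol16_explicit(uint16_t input, size_t rot_bits) {
  uint32_t wide = input;                     /* zero-extend 16 -> 32 bits */
  uint32_t left = wide << rot_bits;          /* may spill into bits 16+ */
  uint32_t right = wide >> (16 - rot_bits);  /* bits that wrap around */
  return (uint16_t)(left | right);           /* truncate back to 16 bits */
}
```

For input 0x8001 and a rotation of 1, the wide intermediate is 0x10003 and
only the final truncation yields the rotated value 0x0003. The 32-bit
shl/shr/or followed by a truncation is therefore neither a plain 32-bit
rotate nor, as written, a 16-bit one, which is consistent with the
shll/shrl/orl/movzwl sequence in the assembly above.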

Arnaud de Grandmaison
