[LLVMdev] [llvm-commits] rotate
Arnaud A. de Grandmaison
arnaud.allarddegrandmaison at parrot.com
Tue Jul 31 10:22:29 PDT 2012
On 07/31/2012 06:17 PM, Eli Friedman wrote:
> On Tue, Jul 31, 2012 at 8:42 AM, Cameron McInally
> <cameron.mcinally at nyu.edu> wrote:
>> Andy,
>>
>> Here is the left circular shift operator patch. I apologize to the reviewer
>> in advance. The patch has a good bit of fine detail. Any
>> comments/criticisms?
>>
>> Some caveats...
>>
>> 1) This is just the bare minimum needed to make the left circular shift
>> operator work (e.g. no instruction combining).
>>
>> 2) I tried my best to select operator names in the existing style; please
>> feel free to change them as appropriate.
> We intentionally haven't included a rotate instruction in LLVM in the
> past; the justification is that it's generally straightforward for the
> backend to form rotate operations, and making the optimizer
> effectively handle the new rotation instruction adds a substantial
> amount of complexity. You're going to need to make a strong argument
> that the current approach is insufficient if you want to commit a
> patch like this.
>
Well,
I believe something is currently broken with respect to forming rotate
instructions.
For example, using a recent clang/llvm on linux/x86_64:
uint32_t ror32(uint32_t input, size_t rot_bits) {
    return (input >> rot_bits) | (input << ((sizeof(input) << 3) - rot_bits));
}
uint32_t rol32(uint32_t input, size_t rot_bits) {
    return (input << rot_bits) | (input >> ((sizeof(input) << 3) - rot_bits));
}
gives the expected ror and rol instructions, but their 16-bit
counterparts:
uint16_t ror16(uint16_t input, size_t rot_bits) {
    return (input >> rot_bits) | (input << ((sizeof(input) << 3) - rot_bits));
}
uint16_t rol16(uint16_t input, size_t rot_bits) {
    return (input << rot_bits) | (input >> ((sizeof(input) << 3) - rot_bits));
}
fail miserably; no rotate is formed, only a shift/shift/or sequence:
        .globl  ror16
        .align  16, 0x90
        .type   ror16,@function
ror16:                                  # @ror16
        .cfi_startproc
# BB#0:                                 # %entry
        movb    %sil, %cl
        movl    %edi, %eax
        shrl    %cl, %eax
        movl    $16, %ecx
        subl    %esi, %ecx
        # kill: CL<def> CL<kill> ECX<kill>
        shll    %cl, %edi
        orl     %eax, %edi
        movzwl  %di, %eax
        ret
.Ltmp2:
        .size   ror16, .Ltmp2-ror16
        .cfi_endproc

        .globl  rol16
        .align  16, 0x90
        .type   rol16,@function
rol16:                                  # @rol16
        .cfi_startproc
# BB#0:                                 # %entry
        movb    %sil, %cl
        movl    %edi, %eax
        shll    %cl, %eax
        movl    $16, %ecx
        subl    %esi, %ecx
        # kill: CL<def> CL<kill> ECX<kill>
        shrl    %cl, %edi
        orl     %eax, %edi
        movzwl  %di, %eax
        ret
.Ltmp3:
        .size   rol16, .Ltmp3-rol16
        .cfi_endproc
At a quick first glance, this seems to be related to the values being
promoted from i16 to i32 in the IR optimization passes, but this may not
be the only reason.
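
To illustrate the promotion at the C level (a rough sketch only, assuming a
32-bit int; the function names below are made up for this example and it says
nothing about what the optimizer actually matches): the uint16_t operand is
promoted to int before either shift, so the left shift carries bits above
bit 15, and they only disappear when the result is truncated on return.

```c
#include <assert.h>
#include <stdint.h>

/* Same expression as rol16 above, but returning the promoted 32-bit value
   so the bits above bit 15 stay visible. Assumes a 32-bit int and
   rot_bits in [0, 15]. */
uint32_t rol16_promoted(uint16_t input, unsigned rot_bits) {
    /* input is promoted to int for both shifts */
    return (uint32_t)((input << rot_bits) | (input >> (16 - rot_bits)));
}

/* Rotate left with the count masked and the result explicitly truncated
   back to 16 bits. */
uint16_t rol16_masked(uint16_t input, unsigned rot_bits) {
    uint32_t wide = input;
    rot_bits &= 15;                 /* keep the count in [0, 15] */
    return (uint16_t)((wide << rot_bits) | (wide >> ((16 - rot_bits) & 15)));
}

/* Rotate right, written the same way. */
uint16_t ror16_masked(uint16_t input, unsigned rot_bits) {
    uint32_t wide = input;
    rot_bits &= 15;
    return (uint16_t)((wide >> rot_bits) | (wide << ((16 - rot_bits) & 15)));
}

/* A few spot checks on the functions above. */
void check_rotate16(void) {
    assert(rol16_promoted(0x8001, 1) == 0x10003u); /* bit 16 is set */
    assert(rol16_masked(0x8001, 1) == 0x0003);     /* truncated to 16 bits */
    assert(ror16_masked(0x1234, 4) == 0x4123);
    assert(rol16_masked(0x1234, 0) == 0x1234);     /* count of 0 is a no-op */
}
```

Masking the count also sidesteps the shift by the full width that the
unmasked expressions perform when rot_bits is 0.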
--
Arnaud de Grandmaison