[LLVMdev] Folding an insertelt chain

Ivan Llopard ivanllopard at gmail.com
Fri Feb 17 05:59:42 PST 2012


Hello Chris, Duncan,

Here is the small test case I used to check the patch:

define <2 x i16> @iveltfold(<2 x i16> %s1) nounwind {
entry:
   %0 = extractelement <2 x i16> %s1, i32 0
   %conv = sext i16 %0 to i32
   %mul = mul nsw i32 %conv, 5793
   %conv1 = trunc i32 %mul to i16
   %1 = insertelement <2 x i16> %s1, i16 %conv1, i32 0
   %2 = extractelement <2 x i16> %1, i32 1
   %conv2 = sext i16 %2 to i32
   %mul3 = mul nsw i32 %conv2, 5793
   %conv4 = trunc i32 %mul3 to i16
   %3 = insertelement <2 x i16> %1, i16 %conv4, i32 1
   ret <2 x i16> %3
}

With the patch, the insertelement chain is replaced by a single
build_vector node. I have a custom backend, but I also checked it with
the x86 one and got the following results:

$ llc -march=x86 iveltfold.ll -o -

Before patch

     .globl    iveltfold
     .align    16, 0x90
     .type    iveltfold,@function
iveltfold:                              # @iveltfold
# BB#0:                                 # %entry
     movd    %xmm0, %eax
     imull    $5793, %eax, %ecx       # imm = 0x16A1
     pextrw    $4, %xmm0, %eax
     pinsrd    $0, %ecx, %xmm0
     imull    $5793, %eax, %eax       # imm = 0x16A1
     pinsrd    $2, %eax, %xmm0
     ret
.Ltmp0:
     .size    iveltfold, .Ltmp0-iveltfold


     .section    ".note.GNU-stack","",@progbits

After patch

     .globl    iveltfold
     .align    16, 0x90
     .type    iveltfold,@function
iveltfold:                              # @iveltfold
# BB#0:                                 # %entry
     pextrw    $4, %xmm0, %eax
     imull    $5793, %eax, %eax       # imm = 0x16A1
     movd    %xmm0, %ecx
     imull    $5793, %ecx, %ecx       # imm = 0x16A1
     movd    %ecx, %xmm0
     pinsrd    $2, %eax, %xmm0
     ret
.Ltmp0:
     .size    iveltfold, .Ltmp0-iveltfold


     .section    ".note.GNU-stack","",@progbits
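
To make the idea concrete, here is a minimal standalone sketch of the fold in C++. It is not the code from the patch and does not use the SelectionDAG API: the `Node` struct and the `foldInsertChain` function are hypothetical names made up for illustration, and the sketch assumes every insert uses a constant lane index. It walks the chain from the last insert back to the source vector and produces the build_vector operands only if every lane has been overwritten.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <string>
#include <vector>

// Hypothetical stand-in for a DAG node: either the opaque source vector
// (input == nullptr) or an insert of `scalar` into lane `lane` of `input`.
struct Node {
    const Node *input = nullptr;  // previous node in the chain
    std::string scalar;           // name of the inserted scalar value
    std::size_t lane = 0;         // constant lane index being written
};

// Walk the chain from the last insert back toward the source vector.
// The latest write to a lane wins, so a lane is recorded only the first
// time it is seen. Returns the build_vector operands if every lane was
// written, or std::nullopt if some lane still comes from the source.
std::optional<std::vector<std::string>>
foldInsertChain(const Node *last, std::size_t numLanes) {
    std::vector<std::string> lanes(numLanes);
    std::vector<bool> seen(numLanes, false);
    std::size_t covered = 0;

    for (const Node *n = last; n && n->input; n = n->input) {
        assert(n->lane < numLanes && "lane index out of range");
        if (!seen[n->lane]) {
            seen[n->lane] = true;
            lanes[n->lane] = n->scalar;
            ++covered;
        }
    }

    if (covered != numLanes)
        return std::nullopt;  // source vector is still partly live
    return lanes;
}
```

For the test case above, the chain is an insert of %conv1 into lane 0 followed by an insert of %conv4 into lane 1, so both lanes of the <2 x i16> are covered and the sketch yields the operands (conv1, conv4). With only the first insert, lane 1 still comes from %s1 and no fold happens.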


Ivan

On 17/02/2012 10:19, Chris Lattner wrote:
> On Feb 17, 2012, at 12:50 AM, Ivan Llopard wrote:
>
>> Hello,
>>
>> I've added a small combine to DAGCombiner that folds a chain of insertelt nodes when the chain is proved to fully overwrite the original source vector. In that case, I assume a build_vector is better. It seems to be safe, but I don't know whether it is correctly implemented or whether it is already done somewhere else. Please find the patch attached.
> Hi Ivan,
>
> This needs a testcase.
>
> -Chris


