[PATCH] Adjust the cost of vectorized SHL/SRL/SRA
Wei Mi
wmi at google.com
Fri May 22 10:03:47 PDT 2015
In http://reviews.llvm.org/D9923#177012, @aschwaighofer wrote:
> I share Simon's concerns. Please make sure that we still get a good estimate for kernels like (these are from the rdar mentioned in the commit).
>
> #define TYPE char
> #define OP >>
> #define SIZE 1024
> #define TYPE_ALIGN __attribute__((aligned(16)))
>
> TYPE A1[SIZE] TYPE_ALIGN;
> TYPE B1[SIZE] TYPE_ALIGN;
> TYPE C1[SIZE] TYPE_ALIGN;
>
> void kernel1() {
>   for (int i = 0; i < SIZE; ++i) {
>     A1[i] = B1[i] OP C1[i];
>   }
> }
>
> or:
>
> for(k=0, r=0; k<pos; k++)
> r += (MAX_UNSIGNED) 1 << k;
Thanks for sharing the testcase. For the first testcase:
Without the patch, the generated code for the kernel loop is:
.LBB0_1:                                # %for.body
                                        # =>This Inner Loop Header: Depth=1
        movsbl  B1+1024(%rax), %edx
        movb    C1+1024(%rax), %cl
        sarl    %cl, %edx
        movb    %dl, A1+1024(%rax)
        incq    %rax
        jne     .LBB0_1
With the patch, the generated code for the kernel loop is:
.LBB0_1:                                # %vector.body
                                        # =>This Inner Loop Header: Depth=1
        movd    B1+1024(%rax), %xmm1    # xmm1 = mem[0],zero,zero,zero
        punpcklbw %xmm1, %xmm1          # xmm1 = xmm1[0,0,1,1,2,2,3,3,4,4,5,5,6,6,7,7]
        punpcklwd %xmm1, %xmm1          # xmm1 = xmm1[0,0,1,1,2,2,3,3]
        psrad   $24, %xmm1
        movd    C1+1024(%rax), %xmm2    # xmm2 = mem[0],zero,zero,zero
        punpcklbw %xmm2, %xmm2          # xmm2 = xmm2[0,0,1,1,2,2,3,3,4,4,5,5,6,6,7,7]
        punpcklwd %xmm2, %xmm2          # xmm2 = xmm2[0,0,1,1,2,2,3,3]
        psrad   $24, %xmm2
        psrad   %xmm2, %xmm1
        pand    %xmm0, %xmm1
        packuswb %xmm1, %xmm1
        packuswb %xmm1, %xmm1
        movd    %xmm1, A1+1024(%rax)
        addq    $4, %rax
        jne     .LBB0_1
The vectorized version is slightly better than the scalar version, but the cost estimate used to choose VF is not very good: it reports a cost of 8 for VF==1 and a cost of 2 for VF==4. The estimated costs of the vectorized sext and trunc are too low and do not match their real costs.
Another problem is that the vectorizer doesn't know that the char->int type promotion here is unnecessary.
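To put those two numbers side by side, here is a minimal sketch of how a vectorization factor might be chosen from such estimates, assuming the vectorizer compares the estimated cost divided by VF (that is, cost per scalar iteration). The helper `pick_vf` and its arguments are hypothetical and are not LLVM's API. With the estimates above, the per-iteration cost is 8/1 = 8 for VF==1 and 2/4 = 0.5 for VF==4, a predicted 16x gain, which is far more than the "slightly better" seen in the generated code.

```c
#include <stddef.h>

/* Hypothetical helper, not LLVM's API: return the vectorization factor (VF)
 * with the lowest estimated cost per scalar iteration.
 * costs[i] is the estimated cost of one loop iteration at VF == vfs[i]. */
unsigned pick_vf(const unsigned *vfs, const unsigned *costs, size_t n) {
    unsigned best_vf = 1;          /* fall back to scalar if no candidates */
    double best_per_lane = -1.0;   /* negative means "nothing chosen yet" */

    for (size_t i = 0; i < n; ++i) {
        /* Normalize by VF so costs at different widths are comparable. */
        double per_lane = (double)costs[i] / (double)vfs[i];
        if (best_per_lane < 0.0 || per_lane < best_per_lane) {
            best_per_lane = per_lane;
            best_vf = vfs[i];
        }
    }
    return best_vf;
}
```

Calling `pick_vf` with VFs {1, 4} and costs {8, 2} selects VF 4. If the sext and trunc costs were raised to something closer to the real instruction count, the per-iteration figure for VF 4 would move toward the scalar one and the choice would be much closer.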
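As an illustration of why the promotion is not needed for the result, the sketch below checks exhaustively that shifting the promoted int and truncating gives the same byte as an arithmetic shift carried out within 8 bits. It assumes shift counts in the range 0..7 and a compiler where `>>` on a negative int is an arithmetic shift and narrowing to `int8_t` wraps (both are implementation-defined in C, and hold for GCC and Clang). The function names are made up for this example.

```c
#include <stdint.h>

/* Arithmetic right shift whose intermediate results all fit in a byte;
 * a stand-in for what a per-byte vector lane would compute.
 * Assumes 0 <= c <= 7. */
static int8_t sar8(int8_t b, unsigned c) {
    uint8_t u = (uint8_t)b;
    uint8_t r = (uint8_t)(u >> c);           /* logical shift within the byte */
    if (u & 0x80u)                           /* negative input: refill top c bits */
        r |= (uint8_t)(0xFFu << (8u - c));
    return (int8_t)r;
}

/* What the C source in the kernel means: promote to int, shift, truncate. */
static int8_t sar_promoted(int8_t b, unsigned c) {
    return (int8_t)((int)b >> c);
}

/* Count the (value, shift) pairs on which the two versions disagree. */
int count_mismatches(void) {
    int mismatches = 0;
    for (int b = -128; b <= 127; ++b)
        for (unsigned c = 0; c < 8; ++c)
            if (sar8((int8_t)b, c) != sar_promoted((int8_t)b, c))
                ++mismatches;
    return mismatches;
}
```

Under the stated assumptions `count_mismatches()` returns 0. Shift counts of 8 or more are outside what this sketch covers; there the narrow version would have to fill the whole byte with the sign bit to keep matching.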
Can you give me the whole version of the second testcase? I am not sure my tweaked version is the right one.
REPOSITORY
rL LLVM
http://reviews.llvm.org/D9923