[llvm-dev] Opportunity to split store of shuffled vector.

Thu Sep 26 02:53:41 PDT 2019

Hi there,

I've noticed that LLVM always seems to generate vector instructions for
vector operations in C, even when they are just simple element stores:

void foo(vector int* c) {
  (*c)[0] = 1;
  (*c)[1] = 2;
}

%0 = load <4 x i32>, <4 x i32>* %c, align 16
%vecins1 = shufflevector <4 x i32> <i32 1, i32 2, i32 undef, i32 undef>,
                         <4 x i32> %0, <4 x i32> <i32 0, i32 1, i32 6, i32 7>
store <4 x i32> %vecins1, <4 x i32>* %c, align 16

GCC, by contrast, generates two scalar stores directly to the element
addresses, just as it would for an array, which should be better on
PowerPC. (Some other platforms would benefit as well.) So we could
transform the IR above into:

%0 = getelementptr inbounds <4 x i32>, <4 x i32>* %c, i64 0, i64 0
store i32 1, i32* %0, align 4
%1 = getelementptr inbounds <4 x i32>, <4 x i32>* %c, i64 0, i64 1
store i32 2, i32* %1, align 4
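
As a sanity check that the two forms leave the same bytes in memory, here
is a minimal C sketch. It assumes the GCC/Clang vector_size extension as a
portable stand-in for AltiVec's `vector int`; the names v4i32,
store_via_blend, store_via_scalars and same_result are made up for
illustration and do not come from LLVM or GCC.

```c
#include <string.h>

/* Assumption: GCC/Clang vector extension standing in for `vector int`. */
typedef int v4i32 __attribute__((vector_size(16)));

/* Mirrors the load + shufflevector + store sequence: read the whole
   vector, replace lanes 0 and 1, keep lanes 2 and 3, write it all back. */
void store_via_blend(v4i32 *c) {
    v4i32 old = *c;
    v4i32 blended = {1, 2, old[2], old[3]};
    *c = blended;
}

/* Mirrors the proposed IR: two 4-byte stores at element offsets 0 and 1.
   memcpy is used so the sketch does not rely on aliasing rules. */
void store_via_scalars(v4i32 *c) {
    const int one = 1, two = 2;
    char *base = (char *)c;
    memcpy(base + 0 * sizeof(int), &one, sizeof one);
    memcpy(base + 1 * sizeof(int), &two, sizeof two);
}

/* Returns 1 when both variants produce identical memory contents. */
int same_result(const int init[4]) {
    v4i32 a, b;
    memcpy(&a, init, sizeof a);
    memcpy(&b, init, sizeof b);
    store_via_blend(&a);
    store_via_scalars(&b);
    return memcmp(&a, &b, sizeof a) == 0;
}
```

Calling same_result with any four initial values should return 1, since
lanes 2 and 3 are written back unchanged in the first variant and never
touched in the second.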

This looks like an optimization opportunity, and I guess we could do it
in InstCombine. But I'm not sure whether there is a better place for it,
since the transform is essentially the inverse of vectorization. There
may also be other concerns I haven't noticed.
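
To make the idea a bit more concrete, here is a rough sketch of the kind
of mask check such a transform might start from. It is plain C that only
models the shuffle mask, not the LLVM API; lanes_to_store and NUM_LANES
are hypothetical names. It assumes the second shuffle operand is the
vector just loaded from the store's destination, and it conservatively
gives up on anything other than "new value" or "lane unchanged".

```c
#include <stdbool.h>

#define NUM_LANES 4

/* mask[i] follows shufflevector indexing for two 4-lane operands:
   0..3 selects from the first operand (the new values), 4..7 selects
   from the second operand (assumed: the vector loaded from the
   destination), and a negative value stands for undef.
   Returns true and fills store_lane[] when every lane is either a new
   value or left as it already is in memory; returns false otherwise. */
bool lanes_to_store(const int mask[NUM_LANES], bool store_lane[NUM_LANES]) {
    for (int i = 0; i < NUM_LANES; ++i) {
        int m = mask[i];
        if (m < 0 || m == i + NUM_LANES) {
            /* undef, or the loaded lane in its own position: no store */
            store_lane[i] = false;
        } else if (m < NUM_LANES) {
            /* lane i receives element m of the first operand */
            store_lane[i] = true;
        } else {
            /* loaded lane moved to another position, or index out of
               range: this sketch does not handle it */
            return false;
        }
    }
    return true;
}
```

For the mask in the example above, <0, 1, 6, 7>, this marks lanes 0 and 1
for scalar stores and leaves lanes 2 and 3 alone. Whether splitting is
actually profitable would presumably depend on how many lanes are marked
and on the target.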

Looking forward to any comments. Thanks.

Regards,
Qiu Chaofan