[PATCH] D20315: [LV] For some induction variables, use vector phis instead of widening the scalar in the loop body
Michael Kuperstein via llvm-commits
llvm-commits at lists.llvm.org
Mon May 16 17:56:57 PDT 2016
mkuper created this revision.
mkuper added reviewers: delena, jmolloy, danielcdh.
mkuper added subscribers: llvm-commits, wmi, Ayal, davidxl.
Herald added a subscriber: mzolotukhin.
This changes the way we treat widening of induction variables.
In the existing code, whenever we need a widened IV, we widen the scalar IV on the fly, by splatting it and adding the step vector.
Instead, we can create a real vector IV, which tends to save a couple of instructions per iteration. This patch only changes the behavior in the most basic case: integer primary IVs with a constant step. If this looks sensible, I'll try to follow up with the other cases.
It seems to be more or less performance neutral, but for basic cases the code looks better, so I have the feeling this is a step in the right direction.
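To make the difference between the two strategies concrete, here is a rough scalar model in C. It is only a sketch, not the LoopVectorize implementation: the function names `widen_on_the_fly` and `vector_phi` and the `VF` macro are invented for illustration, each "vector" is modelled as a small array, and it assumes a step of 1 and ignores the remainder loop.

```c
#include <stddef.h>
#include <stdint.h>

#define VF 4 /* hypothetical vectorization factor, as in vectorize_width(4) */

/* Existing behavior (modelled): rebuild the widened IV from the scalar IV
 * on every iteration by splatting it and adding the step vector. */
static void widen_on_the_fly(uint32_t *a, size_t n) {
    const uint32_t step_vec[VF] = {0, 1, 2, 3};
    for (size_t i = 0; i + VF <= n; i += VF) {
        uint32_t widened[VF];
        for (int lane = 0; lane < VF; ++lane)
            widened[lane] = (uint32_t)i + step_vec[lane]; /* splat(i) + step vector */
        for (int lane = 0; lane < VF; ++lane)
            a[i + lane] = widened[lane];
    }
}

/* Behavior with a vector IV (modelled): the widened IV is carried across
 * iterations, like a vector phi, and bumped by a splat of VF * step. */
static void vector_phi(uint32_t *a, size_t n) {
    uint32_t vec_ind[VF] = {0, 1, 2, 3}; /* start value of the vector IV */
    for (size_t i = 0; i + VF <= n; i += VF) {
        for (int lane = 0; lane < VF; ++lane)
            a[i + lane] = vec_ind[lane];
        for (int lane = 0; lane < VF; ++lane)
            vec_ind[lane] += VF; /* add splat(VF), since step == 1 */
    }
}
```

Both functions fill `a` with the same values; the second one just avoids recomputing the widened IV from the scalar one inside the loop body.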
To take the most trivial example:
```
void vec(unsigned int *a, unsigned int k) {
#pragma clang loop vectorize_width(4) interleave_count(1)
#pragma nounroll
  for (unsigned int i = 0; i < k; ++i)
    a[i] = i;
}
```
For AVX, without this patch, we get:
```
# BB#5:
        xorl    %ecx, %ecx
        vmovdqa .LCPI0_0(%rip), %xmm0   # xmm0 = [0,1,2,3]
        .p2align        4, 0x90
.LBB0_6:                                # =>This Inner Loop Header: Depth=1
        vmovd   %ecx, %xmm1
        vpshufd $0, %xmm1, %xmm1        # xmm1 = xmm1[0,0,0,0]
        vpaddd  %xmm0, %xmm1, %xmm1
        vmovdqu %xmm1, (%rdi,%rcx,4)
        addq    $4, %rcx
        cmpq    %rcx, %rdx
        jne     .LBB0_6
```
And with this patch:
```
# BB#5:                                 # %vector.body.preheader
        vmovdqa .LCPI0_0(%rip), %xmm1   # xmm1 = [0,1,2,3]
        vmovdqa .LCPI0_1(%rip), %xmm0   # xmm0 = [4,4,4,4]
        movq    %rdi, %rcx
        movq    %r8, %rdx
        .p2align        4, 0x90
.LBB0_6:                                # %vector.body
                                        # =>This Inner Loop Header: Depth=1
        vmovdqu %xmm1, (%rcx)
        vpaddd  %xmm0, %xmm1, %xmm1
        addq    $16, %rcx
        addq    $-4, %rdx
        jne     .LBB0_6
```
As this example shows, when we actually need the scalar IV as well (e.g. for a scalar GEP), InstCombine seems to clean things up nicely, so it doesn't look like LV needs to treat that case specially.
Other views (especially on when this may be a bad thing) are welcome.
http://reviews.llvm.org/D20315
Files:
lib/Transforms/Vectorize/LoopVectorize.cpp
test/Transforms/LoopVectorize/PowerPC/vsx-tsvc-s173.ll
test/Transforms/LoopVectorize/X86/gather_scatter.ll
test/Transforms/LoopVectorize/cast-induction.ll
test/Transforms/LoopVectorize/gcc-examples.ll
test/Transforms/LoopVectorize/gep_with_bitcast.ll
test/Transforms/LoopVectorize/global_alias.ll
test/Transforms/LoopVectorize/induction_plus.ll
-------------- next part --------------
A non-text attachment was scrubbed...
Name: D20315.57415.patch
Type: text/x-patch
Size: 12414 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20160517/c4ce8d43/attachment.bin>