[PATCH] D46283: [AArch64] Set vectorizer-maximize-bandwidth as default true
Adhemerval Zanella via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Mon Apr 30 13:13:05 PDT 2018
zatrazz created this revision.
zatrazz added reviewers: fhahn, rengolin, javed.absar, huntergr, SjoerdMeijer, t.p.northover, echristo, evandro.
Herald added a subscriber: kristof.beyls.
This shows virtually no gain in speccpu2006 on A72:
Benchmark        Diff
400.perlbench    +1.55
401.bzip2        -1.22
403.gcc          +0.73
429.mcf          +3.00
445.gobmk        -0.39
456.hmmer        -0.90
458.sjeng        -0.41
462.libquantum   -1.91
464.h264ref       0.00
471.omnetpp      -0.64
473.astar        -0.38
483.xalancbmk     0.90
geomean:          0.04
It does show good improvements in generic loop code where each
element is truncated to a narrower type. For instance, take the vector
body for the following code:
void store_i32_to_i8 (const int *src, int width, unsigned char *dst)
{
  for (int i = 0; i < width; i++) {
    *dst++ = *src++;
  }
}
It is currently compiled to:
---
.LBB0_4: // %vector.body
// =>This Inner Loop Header: Depth=1
ldp w14, w15, [x11, #-4]
add x11, x11, #8 // =8
subs x13, x13, #2 // =2
sturb w14, [x12, #-1]
strb w15, [x12], #2
b.ne .LBB0_4
---
With this patch it is compiled to:
---
.LBB0_4: // %vector.body
// =>This Inner Loop Header: Depth=1
ldp q0, q1, [x11, #-64]
ldp q2, q3, [x11, #-32]
ldp q4, q5, [x11]
ldp q6, q7, [x11, #32]
xtn v0.4h, v0.4s
xtn v2.4h, v2.4s
xtn2 v2.8h, v3.4s
xtn2 v0.8h, v1.4s
xtn v6.4h, v6.4s
xtn v4.4h, v4.4s
xtn v0.8b, v0.8h
xtn2 v0.16b, v2.8h
xtn2 v6.8h, v7.4s
xtn2 v4.8h, v5.4s
xtn v1.8b, v4.8h
xtn2 v1.16b, v6.8h
add x11, x11, #128 // =128
subs x13, x13, #32 // =32
stp q0, q1, [x12, #-16]
add x12, x12, #32 // =32
b.ne .LBB0_4
---
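The patched vector body narrows in two stages: xtn/xtn2 first pack 32-bit lanes into 16-bit lanes (.4h/.8h), then pack those into 8-bit lanes (.8b/.16b). The following is a minimal scalar sketch of why that matches the original `*dst++ = *src++` store, and of the per-iteration element count implied by the listing (8 q registers loaded, `subs x13, x13, #32`). The helper names are hypothetical and only model the semantics; they are not taken from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Scalar reference: what the source loop stores for one element. */
static uint8_t trunc_direct(int32_t v)
{
    return (uint8_t)v;              /* keeps the low 8 bits */
}

/* Model of the vector body: narrow 32 -> 16 bits, then 16 -> 8 bits. */
static uint8_t trunc_two_step(int32_t v)
{
    uint16_t half = (uint16_t)v;    /* models xtn/xtn2 into .4h/.8h lanes */
    return (uint8_t)half;           /* models xtn/xtn2 into .8b/.16b lanes */
}

/* Each 128-bit q register holds 128 / 32 = 4 i32 elements. */
static size_t elements_per_iteration(size_t q_registers_loaded)
{
    return q_registers_loaded * (128 / 32);
}

/* Hypothetical self-check over a few representative inputs. */
static void check_equivalence(void)
{
    const int32_t samples[] = { 0, 1, 255, 256, 65535, 65536,
                                -1, -256, INT32_MAX, INT32_MIN };
    for (size_t i = 0; i < sizeof samples / sizeof samples[0]; i++)
        assert(trunc_direct(samples[i]) == trunc_two_step(samples[i]));
    assert(elements_per_iteration(8) == 32);
}
```

Both narrowing steps only discard high bits, so the order and grouping of the xtn/xtn2 instructions does not affect the stored bytes.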
This is an increase of about 12% in throughput in a micro-benchmark with an
array of 16777216 elements.
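The micro-benchmark itself is not included in this mail. A harness along the following lines could be used to reproduce such a measurement. It is only a sketch: the function `time_store_i32_to_i8`, the repetition count, and the use of `clock()` are assumptions, not the setup used for the number above.

```c
#include <assert.h>
#include <stdlib.h>
#include <time.h>

/* Loop under test, as in the example above. */
void store_i32_to_i8(const int *src, int width, unsigned char *dst)
{
    for (int i = 0; i < width; i++) {
        *dst++ = *src++;
    }
}

/* Hypothetical harness: returns average CPU seconds per call,
   or -1.0 if the buffers cannot be allocated. */
double time_store_i32_to_i8(int width, int reps)
{
    assert(width > 0 && reps > 0);

    int *src = malloc((size_t)width * sizeof *src);
    unsigned char *dst = malloc((size_t)width);
    if (!src || !dst) {
        free(src);
        free(dst);
        return -1.0;
    }
    for (int i = 0; i < width; i++)
        src[i] = i;

    /* Reading one output byte per repetition keeps the stores live. */
    volatile unsigned char sink = 0;
    clock_t start = clock();
    for (int r = 0; r < reps; r++) {
        store_i32_to_i8(src, width, dst);
        sink ^= dst[(size_t)r % (size_t)width];
    }
    clock_t end = clock();

    free(src);
    free(dst);
    return (double)(end - start) / CLOCKS_PER_SEC / reps;
}
```

Calling `time_store_i32_to_i8(16777216, 10)` from a `main` built once with and once without the patch gives two per-call times to compare. The absolute values depend on the machine and on the compiler flags.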
Repository:
rL LLVM
https://reviews.llvm.org/D46283
Files:
lib/Target/AArch64/AArch64TargetTransformInfo.h
test/Transforms/LoopVectorize/AArch64/aarch64-trunc-vec.ll
test/Transforms/LoopVectorize/AArch64/loop-vectorization-factors.ll
test/Transforms/LoopVectorize/AArch64/reduction-small-size.ll
-------------- next part --------------
A non-text attachment was scrubbed...
Name: D46283.144621.patch
Type: text/x-patch
Size: 6238 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20180430/48015f2f/attachment.bin>