[PATCH] D8943: Calculate vectorization factor using the narrowest type instead of widest type
Cong Hou via llvm-commits
llvm-commits at lists.llvm.org
Tue Oct 13 14:48:08 PDT 2015
congh added a comment.
In http://reviews.llvm.org/D8943#260101, @spatel wrote:
> F909822: vecfactor - perf.csv <http://reviews.llvm.org/F909822>
>
> I applied this patch on top of r248957 and ran the benchmarking subset of test-suite on an AMD Jaguar 1.5 GHz + Ubuntu 14.04 test system. The baseline is -O3 -march=btver2 while the comparison run added -mllvm -vectorizer-maximize-bandwidth (data attached).
>
> I see very little performance difference on any test: almost everything is +/- 2% which is within the noise for most tests.
>
> Cong, I would be interested to know if you saw any large diffs on these tests on your test system or if the bigger wins/losses all occurred on the non-benchmarking tests in test-suite?
Thank you for the performance test! I think there are two reasons why we do not see a big performance difference in the LLVM test suite:
1. No hotspot contains a loop that mixes types of different sizes, which is the case this patch optimizes.
2. There are some problems with the cost model in LLVM. Even when we could choose a larger VF, the cost model reports that the larger VF has a higher cost. I will deal with this issue later.
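To make the difference between the two strategies concrete, here is a minimal sketch of the arithmetic for a loop that mixes 32-bit and 8-bit elements. The helper names (lanesFor, vfFromWidestType, vfFromNarrowestType) are hypothetical and are not LLVM's API, and the 128-bit register width is only an assumption for illustration. The real vectorizer also consults the cost model before settling on a VF.

```cpp
#include <algorithm>
#include <initializer_list>

// Hypothetical helper (not part of LLVM): number of elements of the given
// width that fit in one vector register.
inline unsigned lanesFor(unsigned registerBits, unsigned elementBits) {
  return registerBits / elementBits;
}

// Upper bound on VF when it is derived from the widest element type in the loop.
inline unsigned vfFromWidestType(unsigned registerBits,
                                 std::initializer_list<unsigned> elementBits) {
  return lanesFor(registerBits, std::max(elementBits));
}

// Upper bound on VF when it is derived from the narrowest element type instead.
inline unsigned vfFromNarrowestType(unsigned registerBits,
                                    std::initializer_list<unsigned> elementBits) {
  return lanesFor(registerBits, std::min(elementBits));
}
```

Under these assumptions, vfFromWidestType(128, {32, 8}) gives 4 and vfFromNarrowestType(128, {32, 8}) gives 16. With VF = 16 the 8-bit operations fill one 128-bit register per vector iteration, while the 32-bit operations are spread over four registers.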
I don't have a test in my codebase that benefits from this patch, but it is quite easy to synthesize one:
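  const int N = 1024 * 32;
  int a[N];
  char b[N];

  int main() {
    // Outer loop only repeats the kernel so the run is long enough to time.
    for (int iter = 0; iter < N; ++iter) {
      // Inner loop mixes a 32-bit type (int) and an 8-bit type (char).
      for (int i = 0; i < N; ++i) {
        a[i]++;
        b[i]++;
      }
    }
  }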
For the code shown above, the running time is ~0.35s originally and drops to ~0.228s with this patch (roughly a 1.5x speedup).
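For anyone who wants to reproduce the measurement, here is a minimal timing sketch of the same kernel. The function names incrementAll and timeKernel are hypothetical and not from the patch, and wall-clock timing via std::chrono is my assumption about how one might measure it. The numbers above were not necessarily obtained this way.

```cpp
#include <chrono>

constexpr int N = 1024 * 32;
int a[N];
char b[N];

// One pass over both arrays: int and char elements are updated in the same
// loop, which is the mixed-width case the patch targets.
void incrementAll() {
  for (int i = 0; i < N; ++i) {
    a[i]++;
    b[i]++;
  }
}

// Repeat the pass `repeats` times and return the elapsed wall-clock seconds.
double timeKernel(int repeats) {
  const auto start = std::chrono::steady_clock::now();
  for (int r = 0; r < repeats; ++r)
    incrementAll();
  const auto stop = std::chrono::steady_clock::now();
  return std::chrono::duration<double>(stop - start).count();
}
```

Calling timeKernel(N) from main matches the repeat count of the example. One way to compare is to build once with -O3 -march=btver2 and once with -mllvm -vectorizer-maximize-bandwidth added, as in the quoted benchmark run; results will vary by target.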
http://reviews.llvm.org/D8943