[PATCH] D43256: [MBP] Move a latch block with conditional exit and multi predecessors to top of loop
Evgeniy via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Tue Jul 23 05:04:25 PDT 2019
ebrevnov added a comment.
Here is a C++ equivalent of my original code (which is actually a Java application) that you can use to reproduce the problem.
> clang++ -c -O2 floatmin.cpp -march=skylake
extern float a[];
extern float b[];
extern float c[];

bool foo(int M, bool flag) {
  for (int i = 0; i < M; i++) {
    float x = a[i];
    float y = b[i];
    float min;
    if (x != x) {
      min = x; // a is NaN
    }
    else if (y == 0.0f) {
      goto fail;
    }
    else {
      min = (x <= y) ? x : y;
    }
    c[i] = min;
  }
  return true;
fail:
  return false;
}
With the C++ reproducer I can measure only about a 9% slowdown. In this case CPI is identical (for the original test case I still don't know the root cause of the CPI difference), and all of the slowdown comes from the increased path length due to one extra jump.
With this reproducer in hand you can gather profile data if needed, but that's a separate story. I don't think we can afford such a regression when a profile is not available.
You could probably assume the worst case when a profile is not available, but I believe that won't help in this case: the root cause is that the heuristic simply doesn't take the extra jump into account.
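For anyone who wants to time the reproducer, a minimal driver could look like the sketch below. It is not the harness behind the 9% figure: the array size, the input values, and the names kN, Kernel, fill_inputs and seconds_per_call are all assumptions made for illustration. The inputs are chosen so that neither the NaN branch nor the fail exit is taken, which keeps every iteration on the min path.

```cpp
#include <chrono>

// Assumed problem size; the report above does not state one.
constexpr int kN = 1 << 12;

// Definitions for the arrays that floatmin.cpp declares as extern.
float a[kN];
float b[kN];
float c[kN];

// Same signature as foo() in the reproducer.
using Kernel = bool (*)(int, bool);

// Fill the inputs with finite, non-zero values (an assumption), so the loop
// never sees a NaN in a[] and never hits the y == 0.0f exit.
void fill_inputs() {
  for (int i = 0; i < kN; i++) {
    a[i] = static_cast<float>(i % 7) + 1.0f;
    b[i] = static_cast<float>(i % 5) + 1.0f;
  }
}

// Average wall-clock seconds per call over `reps` calls.
// Returns -1.0 if any call returned false (the fail exit was taken).
double seconds_per_call(Kernel kernel, int reps) {
  bool all_ok = true;
  auto start = std::chrono::steady_clock::now();
  for (int r = 0; r < reps; r++) {
    all_ok = kernel(kN, false) && all_ok;
  }
  auto stop = std::chrono::steady_clock::now();
  std::chrono::duration<double> elapsed = stop - start;
  return all_ok ? elapsed.count() / reps : -1.0;
}
```

To use it, declare bool foo(int M, bool flag); in the driver, add a main() that calls fill_inputs() and then seconds_per_call(foo, reps), and link against floatmin.o built once with and once without the patch. Keeping foo in its own translation unit prevents it from being inlined into the timing loop.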
Repository:
rL LLVM
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D43256/new/
https://reviews.llvm.org/D43256