[PATCH] D102748: [LoopUnroll] Don't unroll before vectorisation
Florian Hahn via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Wed May 19 04:22:54 PDT 2021
fhahn added a comment.
In D102748#2768113 <https://reviews.llvm.org/D102748#2768113>, @SjoerdMeijer wrote:
> All with the same result. So in a way this is an advertisement for skipping the full unroller early. But like I said, I understand the point, and it was not my intention to skip full unrolling, I just wanted it after the loop vectoriser.
> Also, if this were a terrible idea, I would have expected it to be flagged up by SPEC, as it contains some different codes; but fair enough, I have run only SPEC and the embedded benchmarks.
Fair enough, this is one of the simple cases where the backend picks up the slack from the middle-end (as @nikic mentioned), but I think we should focus on the IR we hand off to the backend, because the backend won't be able to optimize slightly more complex variations.
With a few small tweaks to the example, the backend is no longer able to pick up the slack (at least on AArch64):
  #include <string.h>

  void use(char *);

  void foo(int x) {
    char Ptr[16];
    memset(Ptr, 0, 16);
    if (x == 20)
      Ptr[5] = 10;
    for (unsigned i = 0; i < 16; i++)
      Ptr[i] = i + 1;
    use(&Ptr[0]);
  }
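To make the concern concrete, here is a rough source-level sketch of what early full unrolling exposes for this example. It is only an illustration written in C, not the IR the pipeline actually produces. The names foo_loop and foo_unrolled are hypothetical, and the external use() is replaced by a copy into an output buffer so that the sketch is self-contained. Once all 16 stores are explicit, both the memset and the conditional store are visibly overwritten, which is the kind of fact DSE needs.

```c
#include <string.h>

/* Loop form, mirroring the example above. The call to use() is replaced by a
   copy into Out (an assumption made only so the sketch links on its own). */
void foo_loop(char *Out, int x) {
  char Ptr[16];
  memset(Ptr, 0, 16);
  if (x == 20)
    Ptr[5] = 10;
  for (unsigned i = 0; i < 16; i++)
    Ptr[i] = i + 1;
  memcpy(Out, Ptr, 16);
}

/* Hand-unrolled form: every byte of Ptr is written unconditionally, so the
   memset and the x == 20 store are dead and have been dropped here. */
void foo_unrolled(char *Out) {
  char Ptr[16];
  Ptr[0] = 1;   Ptr[1] = 2;   Ptr[2] = 3;   Ptr[3] = 4;
  Ptr[4] = 5;   Ptr[5] = 6;   Ptr[6] = 7;   Ptr[7] = 8;
  Ptr[8] = 9;   Ptr[9] = 10;  Ptr[10] = 11; Ptr[11] = 12;
  Ptr[12] = 13; Ptr[13] = 14; Ptr[14] = 15; Ptr[15] = 16;
  memcpy(Out, Ptr, 16);
}
```

Both functions leave the same 16 bytes in Out regardless of x; the second form is simply the one in which the dead stores are easy to see.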
Another, different example that should also generate worse assembly:
  #include <string.h>

  void f(char*);
  void bar();

  void test(char *Ptr, int x) {
    char Foo[16];
    memset(Foo, 0, 16);
    for (unsigned i = 0; i < 16; i++) {
      Foo[i] = i + 1;
      bar();
    }
    for (unsigned i = 0; i < 16; i++) {
      Ptr[i] = Foo[i] + 2;
    }
  }
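As a sketch of what one would hope the middle-end can reach for this second example (again illustrative C, not actual compiler output): Foo never escapes, so bar() cannot observe it, and once both loops are fully unrolled each load of Foo[i] can be forwarded from the matching store. The names test_loop and test_folded are hypothetical, and bar() is given a hypothetical counting stub so the sketch is self-contained.

```c
#include <string.h>

/* Hypothetical stand-in for the opaque bar(): it only counts calls. */
static int bar_calls = 0;
static void bar(void) { bar_calls++; }

/* Loop form, mirroring the example above (the unused x parameter is dropped). */
void test_loop(char *Ptr) {
  char Foo[16];
  memset(Foo, 0, 16);
  for (unsigned i = 0; i < 16; i++) {
    Foo[i] = i + 1;
    bar();
  }
  for (unsigned i = 0; i < 16; i++)
    Ptr[i] = Foo[i] + 2;
}

/* Folded form: Foo, the memset and all loads from Foo are gone. Only the 16
   calls to bar() and the constant stores Ptr[i] = (i + 1) + 2 remain. */
void test_folded(char *Ptr) {
  for (unsigned i = 0; i < 16; i++)
    bar();
  for (unsigned i = 0; i < 16; i++)
    Ptr[i] = i + 3;
}
```

The two functions perform the same number of bar() calls and write the same bytes to Ptr; whether the optimizer gets from the first to the second depends on the stores and loads being visible to passes like DSE and GVN early enough.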
Those are just a few variations focused on DSE. I'd expect that similar issues exist for other passes, like GVN, InstCombine & co.
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D102748/new/
https://reviews.llvm.org/D102748