[PATCH] D34311: [InstCombine] Don't replace allocas with globals if we can't prove that it's large enough.
Vitaly Buka via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Tue Jun 20 14:11:22 PDT 2017
vitalybuka marked 2 inline comments as done.
vitalybuka added a comment.
In https://reviews.llvm.org/D34311#785745, @efriedma wrote:
> Hmm, actually, thinking about it a bit more, I'm not sure this patch is right.
>
> If you were to assume that out-of-bounds loads return undef rather than have undefined behavior, this patch would be unnecessary. There's only one memcpy that writes to the alloca. That memcpy writes N known bytes to the alloca. Any other instruction which accesses the alloca is either reading one of those N bytes, or returns undef, so the returned values are consistent with reading directly from the global.
We have problem with code like this which just crashes even without sanitizers:
const char KKK[] = {1, 1, 2};
char bbb[100000];
void test2() {
char cc[sizeof(bbb)];
memcpy(cc, KKK , sizeof(KKK));
memcpy(bbb, cc, sizeof(bbb));
}
> Of course, that isn't the world we live in; we can't speculate out-of-bounds loads. Therefore, the safety check needs to make sure we aren't introducing any such loads. There are a couple of ways to do that:
>
> 1. Prove that the constant pointer points to at least `sizeof(alloca type)` bytes of memory. The loads in the rewritten code are out-of-bounds only if they were out-of-bounds in the original code.
Yes, but seems this requires more work, and I will have no time to do that any time soon.
> 2. Ensure that every load is dominated by the initializing memcpy, and the memcpy copies at least `sizeof(alloca type)` bytes of memory. If the memcpy is copying too many bytes, it's immediately undefined behavior. Any load dominated by the memcpy is one of the following: in-bounds in the original code, out-of-bounds in the original code, or dominated by undefined behavior.
>
> This patch implements (2)... but only partially: it's missing the dominance check. Testcase something like this:
>
> ``` const char x[1] = { 3 }; char z[10]; int f(bool this_is_false) { char y[10]; if (this_is_false) memcpy(y, x, 10); memcpy(z, y, 10); } ```
This is probably separate issue. And I am not sure if understand the problem. If we get to "memcpy(z, y, 10);" without "memcpy(y, x, 10);" I'd expect we don't care if "y" is uninitialized bytes or global constant. We will have no buffer overflow which I am trying to fix.
https://reviews.llvm.org/D34311
More information about the llvm-commits
mailing list