[PATCH] D34311: [InstCombine] Don't replace allocas with globals if we can't prove that it's large enough.

Vitaly Buka via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Jun 20 14:11:22 PDT 2017


vitalybuka marked 2 inline comments as done.
vitalybuka added a comment.

In https://reviews.llvm.org/D34311#785745, @efriedma wrote:

> Hmm, actually, thinking about it a bit more, I'm not sure this patch is right.
>
> If you were to assume that out-of-bounds loads return undef rather than have undefined behavior, this patch would be unnecessary.  There's only one memcpy that writes to the alloca.  That memcpy writes N known bytes to the alloca.  Any other instruction which accesses the alloca is either reading one of those N bytes, or returns undef, so the returned values are consistent with reading directly from the global.


We have problem with code like this which just crashes even without sanitizers:

  const char KKK[] = {1, 1, 2};
  char bbb[100000];
  
  void test2() {
    char cc[sizeof(bbb)];
    memcpy(cc, KKK , sizeof(KKK));
    memcpy(bbb, cc, sizeof(bbb));
  }

> Of course, that isn't the world we live in; we can't speculate out-of-bounds loads.  Therefore, the safety check needs to make sure we aren't introducing any such loads.  There are a couple of ways to do that:
> 
> 1. Prove that the constant pointer points to at least `sizeof(alloca type)` bytes of memory.  The loads in the rewritten code are out-of-bounds only if they were out-of-bounds in the original code.

Yes, but seems this requires more work, and I will have no time to do that any time soon.

> 2. Ensure that every load is dominated by the initializing memcpy, and the memcpy copies at least `sizeof(alloca type)` bytes of memory.  If the memcpy is copying too many bytes, it's immediately undefined behavior.  Any load dominated by the memcpy is one of the following: in-bounds in the original code, out-of-bounds in the original code, or dominated by undefined behavior.
> 
>   This patch implements (2)... but only partially: it's missing the dominance check.  Testcase something like this:
> 
>   ``` const char x[1] = { 3 }; char z[10]; int f(bool this_is_false) { char y[10]; if (this_is_false) memcpy(y, x, 10); memcpy(z, y, 10); } ```

This is probably separate issue. And I am not sure if understand the problem. If we get to "memcpy(z, y, 10);" without "memcpy(y, x, 10);"  I'd expect we don't care if "y" is uninitialized bytes or global constant. We will have no buffer overflow which I am trying to fix.


https://reviews.llvm.org/D34311





More information about the llvm-commits mailing list