[PATCH] D81765: Don't inline dynamic allocas that simplify to huge static allocas.

Tue Jun 23 21:03:33 PDT 2020

jfb added inline comments.

================
Comment at: llvm/lib/Analysis/InlineCost.cpp:845
+      // a threshold.
+      if (AllocSize->getLimitedValue() >
+          InlineConstants::MaxSimplifiedDynamicAllocaToMove) {
----------------
mtrofin wrote:
> aemerson wrote:
> > mtrofin wrote:
> > > I think that the decision here needs to be mindful of profiling information (if present). The hypothesis in the patch is that the call site may never be executed. What if we know (from profiling) it is always executed?
> > > 
> > > Also, could you provide more insight into the scenario generating this - thanks!
> > > What if we know (from profiling) it is always executed?
> > IMO then that function is simply allocating a huge amount on the stack. I see that in a similar way to TotalAllocaSizeRecursiveCaller being triggered, in that case we don't want to inline either, no matter what profiling information says, right?
> > 
> > > Also, could you provide more insight into the scenario generating this - thanks!
> > This is code generated as a result of the `-ftrivial-auto-var-init=pattern` option, which combined with user code that assumes that VLAs will not be speculatively created with -1 (2^32) size in the entry block. If you run the test case in this patch you will see the resulting alloca which is guaranteed to crash on any reasonable platform.
> If profiling indicates the callsite is hot, then we know that a) there's no runtime problem, and b) we would want to inline. Yes, that can then surface the problem because of the behavior in InlineFunction, my point is to illustrate that it's not clear cut we wouldn't want to inline.
I think profiling is important to take into account, but I think you're wrong on one point: say we have this function:
```
void f(int a, int b) {
  if (a < 1000)
    g();
  if (b < 1000)
    h();
}
```
Imagine profiling shows both are hot. Imagine `g` and `h` have very large stacks. We'll now inline them, hoist them to the top of `f`, and in many cases fail to color the `alloca`s separately.

What you said is: "profiling shows there's no runtime problem". That's only true without inlining! With inlining, if the optimizer falls on its face, there's a problem.

The bug I describe isn't exactly the one @aemerson is fixing, but it's connected, and one @MadCoder and friends run into *very frequently*.

If we back up: Amara is trying to fix one instance of a quite unfortunate bug. That won't fix the entire problem with LLVM inlining `alloca` and then making a mess of it. But I hope it helps. Let's focus on avoiding what it might regress, and as a follow-up I think we all agree there's much more to be done.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D81765/new/

https://reviews.llvm.org/D81765