[PATCH] D76248: Fix a bug in the legacy inliner that causes subseqeunt double inlining.

Mon Mar 16 13:07:54 PDT 2020

hoyFB created this revision.
Herald added subscribers: llvm-commits, hiraditya.
Herald added a project: LLVM.
hoyFB edited the summary of this revision.
hoyFB added a reviewer: wenlei.
hoyFB edited the summary of this revision.

A recent change in the instruction simplifier enables a call to a function that just returns one of its parameter to be simplified as simply loading the parameter. This exposes a bug in the legacy inliner where double inlining may be involved which in turn may cause compiler ICE when an already-inlined callsite is reused for further inlining.
To put it simply, in the following-like C program, when the function call second(t) is inlined, its code t = third(t) will be reduced to just loading the return value of the callsite first(). This causes the inliner internal data structure to register the first() callsite for the call edge representing the third() call, therefore incurs a double inlining when both call edges are considered an inline candidate. I'm making a fix to break the inliner from reusing a callsite for new call edges.

  void top()
  {
      int t = first();
      second(t);
  }

  void second(int t)
  {
     t = third(t);
     fourth(t);
  }

  void third(int t)
  {
     return t;
  }

The actual failing case is much trickier than the example here and is only reproducible with the legacy inliner. The way the legacy inliner works is to process each SCC in a bottom-up order. That means in reality function first may be already inlined into top, or function third is either inlined to second or is folded into nothing. To repro the failure seen from building a large application, we need to figure out a way to confuse the inliner so that the bottom-up inlining is not fulfilled. I'm doing this by making the second call indirect so that the alias analyzer fails to figure out the right call graph edge from top to second and top can be processed before second during the bottom-up.  We also need to tweak the test code so that when the inlining of top happens, the function body of second is not that optimized, by delaying the pass of function attribute deducer (i.e, which tells function third has no side effect and just returns its parameter). Since the CGSCC pass is iterative, additional calls are added to top to postpone the inlining of second to the second round right after the first function attribute deducing pass is done.

Note that this fix could introduce a side effect that blocks the simplification of inlined code, specifically for a call site that can be folded to another call site. I hope this can probably be complemented by subsequent inlining or folding, as shown in the attached unit test. The ideal fix should be to separate the use of VMap. However, in reality this failing pattern shouldn't happen often. And even if it happens, there should be a good chance that the non-folded call site will be refolded by iterative inlining or subsequent simplification.

Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D76248

Files:
  llvm/lib/Transforms/Utils/CloneFunction.cpp
  llvm/test/Transforms/Inline/inline_call.ll

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D76248.250616.patch
Type: text/x-patch
Size: 3695 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20200316/ae8233dd/attachment.bin>