fix for Ruby build fails due to inlining functions with block label address taken

Gerolf Hoflehner ghoflehner at apple.com
Tue Jul 1 11:49:12 PDT 2014


Ok. 

I re-enabled the test for 64b and local arrays yesterday, but the commit message is with a moderator since it was my first commit to clang.

On Jun 30, 2014, at 10:04 PM, Bob Wilson <bob.wilson at apple.com> wrote:

> I disabled the test in clang r212091.
> 
> Gerolf, please don’t ever leave tests failing like that.
> 
>> On Jun 30, 2014, at 6:07 PM, Eric Christopher <echristo at gmail.com> wrote:
>> 
>> On Mon, Jun 30, 2014 at 6:00 PM, Bob Wilson <bob.wilson at apple.com> wrote:
>>> Thanks. You also need to commit a change to clang’s indirect-goto test,
>>> right? Otherwise, it’s going to start failing now.
>> 
>> s/start failing/continue to fail :)
>> 
>> -eric
>> 
>>> 
>>> On Jun 30, 2014, at 5:41 PM, Gerolf Hoflehner <ghoflehner at apple.com> wrote:
>>> 
>>> Thanks, Bob.  Committed revision 212077. Please see below.
>>> 
>>> On Jun 30, 2014, at 3:18 PM, Bob Wilson <bob.wilson at apple.com> wrote:
>>> 
>>> 
>>> On Jun 27, 2014, at 4:26 PM, Gerolf Hoflehner <ghoflehner at apple.com> wrote:
>>> 
>>> I revised the comments and commit message of the Ruby fix.
>>> 
>>> 
>>> <0001-Suppress-inlining-when-the-block-address-is-taken.patch>
>>> 
>>> 
>>> Thanks. This looks close. Please go ahead and commit with fixes for the
>>> following:
>>> 
>>> @@ -1097,6 +1094,14 @@ bool CallAnalyzer::analyzeCall(CallSite CS) {
>>>    BasicBlock *BB = BBWorklist[Idx];
>>>    if (BB->empty())
>>>      continue;
>>> +    // Disallow inlining of functions when the address of a label is taken.
>>> +    // Usually when a block address is taken there is also an indirect
>>> branch
>>> +    // which effectively suppresses inlining.  But when we partially inline
>>> an\
>>> d
>>> +    // only take a subset of the blocks into account there may not be an
>>> +    // indirect branch. Such code does not make much sense since a block
>>> +    // address label is allowed to be a jump target only within a function.
>>> +    // Worse, in one instance it caused a segmentation fault in a Ruby
>>> build.
>>> +    if (BB->hasAddressTaken()) return false;
>>> 
>>>    // Analyze the cost of this block. If we blow through the threshold,
>>> this
>>>    // returns false, and we can bail on out.
>>> 
>>> 
>>> You made some other changes since the last revision. Please put the return
>>> on a separate line, like you had it before.
>>> 
>>> Yikes, when I forced the 80 character alignment it moved the return too…
>>> 
>>> 
>>> I’m still not happy with the comment. The last part does not provide enough
>>> information to understand _why_ it caused the Ruby segfault, so it’s not
>>> really adding any value. It’s also not clear what you mean here by
>>> “partially inline”. Would you be OK to replace it with this?
>>> 
>>> “Disallow inlining a blockaddress. A blockaddress only has defined behavior
>>> for an indirect branch in the same function, and we do not currently support
>>> inlining indirect branches. But, the inliner may not see an indirect branch
>>> that ends up being dead code at a particular call site. If the blockaddress
>>> escapes the function, e.g., via a global variable, inlining may lead to an
>>> invalid cross-function reference.”
>>> 
>>> Happy to steel that treasure of a comment for the benefit of the community
>>> :-)
>>> 
>>> 
>>> @@ -1278,8 +1283,15 @@ bool InlineCostAnalysis::isInlineViable(Function &F)
>>> {
>>>    F.getAttributes().hasAttribute(AttributeSet::FunctionIndex,
>>>                                   Attribute::ReturnsTwice);
>>>  for (Function::iterator BI = F.begin(), BE = F.end(); BI != BE; ++BI) {
>>> -    // Disallow inlining of functions which contain an indirect branch.
>>> -    if (isa<IndirectBrInst>(BI->getTerminator()))
>>> +    // Disallow inlining of functions which contain an indirect branch.
>>> Also
>>> +    // disallow inlining of functions when the address of a label is taken.
>>> +    // Usually when a block address is taken there is also an indirect
>>> branch
>>> +    // which effectively suppresses inlining.  But when there is no
>>> indirect
>>> +    // branch the user wrote a function that takes the label address but
>>> does
>>> +    // not jump to it within the function. This does not make much sense
>>> and
>>> +    // could possibly be harmful (e.g. when the address is assigned a
>>> global
>>> +    // variable which is used in a global (cross-funtion) goto).
>>> +    if (isa<IndirectBrInst>(BI->getTerminator()) || BI->hasAddressTaken())
>>>      return false;
>>> 
>>>    for (BasicBlock::iterator II = BI->begin(), IE = BI->end(); II != IE;
>>> 
>>> 
>>> I still find this comment to be less than helpful. The problem you describe
>>> here has nothing to do with whether inlining is viable. You’re saying that
>>> we should not inline functions with invalid code. Why does inlining make it
>>> any worse? I’m not opposed to adding the check for hasAddressTaken(), but I
>>> would prefer to just have the comment say: "Disallow inlining of functions
>>> which contain indirect branches or block addresses.”
>>> 
>>> I like the brevity of your suggestion and went with it.
>>> 
>>> 
>>> 
>>> I continue to work on the indirect-goto.c case. The fix that would make it
>>> happen is inverting the
>>> order of invocation of global const prop and global optimizer:
>>> 
>>> Index: lib/Transforms/IPO/PassManagerBuilder.cpp
>>> ===================================================================
>>> --- lib/Transforms/IPO/PassManagerBuilder.cpp (revision 211729)
>>> +++ lib/Transforms/IPO/PassManagerBuilder.cpp (working copy)
>>> @@ -156,9 +156,9 @@
>>>  if (!DisableUnitAtATime) {
>>>    addExtensionsToPM(EP_ModuleOptimizerEarly, MPM);
>>> 
>>> -    MPM.add(createGlobalOptimizerPass());     // Optimize out global vars
>>> 
>>>    MPM.add(createIPSCCPPass());              // IP SCCP
>>> +    MPM.add(createGlobalOptimizerPass());     // Optimize out global vars
>>>    MPM.add(createDeadArgEliminationPass());  // Dead argument elimination
>>> 
>>>    MPM.add(createInstructionCombiningPass());// Clean up after IPCP & DAE
>>> 
>>> 
>>> When the global optimizer is invoked first, the indirect branch using the
>>> global array has not
>>> been eliminated. So the global can’t be eliminated. That is pretty straight
>>> forward and the pass swap
>>> looks reasonable but requires more scrutiny and careful testing. As
>>> additional benefit the global
>>> array would be removed from the IR.
>>> 
>>> 
>>> -Gerolf
>>> 
>>> On Jun 25, 2014, at 6:01 PM, Gerolf Hoflehner <ghoflehner at apple.com> wrote:
>>> 
>>> Bob and I discussed this a bit more. Interprocedural const prop eliminates
>>> the indirect branch. Then the labels. So when the
>>> inliner kicks in it does not “see" any of the issues that could block it.
>>> 
>>> However, with the patch there must still be some information about block
>>> label address taken kept around when a
>>> global  (static) array contains the label addresses. I verified that the
>>> test case would work when foo2
>>> also declared just a local array  const void *addrs[] = { &&L1, &&L2, &&L3,
>>> &&L4, &&L5 }.
>>> Optimistically it seems like we might catch the global case also without
>>> escape analysis,
>>> but I have not looked into that code yet.
>>> 
>>> -Gerolf
>>> 
>>> 
>>> 
>>> On Jun 25, 2014, at 4:00 PM, Bob Wilson <bob.wilson at apple.com> wrote:
>>> 
>>> Oh! It’s a front-end test. I didn’t even think to look there….
>>> 
>>> The test predates the introduction of indirect branches and has only had
>>> superficial changes since then. We used to translate computed gotos into
>>> switch statements. I’m still trying to figure out how this ever worked in
>>> the past.
>>> 
>>> On Jun 25, 2014, at 3:50 PM, Gerolf Hoflehner <ghoflehner at apple.com> wrote:
>>> 
>>> Looking at the history it looks like Chris L. committed the test case in
>>> tools/clang/test/CodeGen/indirect-goto.c to test
>>> block address labels.
>>> 
>>> I’m fine with the train of reasoning:
>>> there is an indirect branch in the source and we
>>> never attempt to inline such functions. Furthermore for inlining functions
>>> with block
>>> label address taken we would need escape analysis. We don’t have it. So
>>> let’s not inline.
>>> 
>>> It looks like everyone is signing up for
>>> “Expect no optimization when block address labels are used”.
>>> 
>>> Cheers
>>> Gerolf
>>> 
>>> 
>>> 
>>> 
>>> On Jun 25, 2014, at 2:49 PM, Eric Christopher <echristo at gmail.com> wrote:
>>> 
>>> On Wed, Jun 25, 2014 at 2:46 PM, Bob Wilson <bob.wilson at apple.com> wrote:
>>> 
>>> On Jun 25, 2014, at 2:30 PM, Gerolf Hoflehner <ghoflehner at apple.com> wrote:
>>> 
>>> I noticed one side-effect of this patch: we no longer optimize fully cases
>>> like indirect-goto.c test in CodeGen below.
>>> Since foo and foo2 are no longer inlined constant folding of the result in
>>> main no longer happens.
>>> 
>>> Does anyone have an issue with this?
>>> 
>>> 
>>> I don’t. We’ve never attempted to inline functions containing indirect
>>> branches. I’m surprised that your testcase works now, but to really do it
>>> correctly, we would need to check that there were no remaining uses of the
>>> blockaddress that might escape the function. I’m not aware of any code that
>>> does that check, and without it, it’s not a safe optimization.
>>> 
>>> 
>>> 
>>> Oh, wait, I see what the testcase is doing now. I'd thought it was
>>> inlining into something that had blockaddr rather than the other way
>>> around.
>>> 
>>> What does the commit for the testcase say about it?
>>> 
>>> Assuming it doesn't say anything enlightening then no, no issue here either.
>>> 
>>> -eric
>>> 
>>> Thanks
>>> Gerolf
>>> 
>>> // RUN: %clang_cc1 -triple i386-unknown-unknown -O3 -emit-llvm -o - %s |
>>> grep "ret i32 2520"
>>> 
>>> static int foo(unsigned i) {
>>> void *addrs[] = { &&L1, &&L2, &&L3, &&L4, &&L5 };
>>> int res = 1;
>>> 
>>> goto *addrs[i];
>>> L5: res *= 11;
>>> L4: res *= 7;
>>> L3: res *= 5;
>>> L2: res *= 3;
>>> L1: res *= 2;
>>> return res;
>>> }
>>> 
>>> static int foo2(unsigned i) {
>>> static const void *addrs[] = { &&L1, &&L2, &&L3, &&L4, &&L5 };
>>> int res = 1;
>>> 
>>> goto *addrs[i];
>>> L5: res *= 11;
>>> L4: res *= 7;
>>> L3: res *= 5;
>>> L2: res *= 3;
>>> L1: res *= 2;
>>> return res;
>>> }
>>> 
>>> int main() {
>>> return foo(3)+foo2(4);
>>> }
>>> 
>>> 
>>> IR before:
>>> 
>>> ; Function Attrs: nounwind readnone
>>> define i32 @main() #0 {
>>> entry:
>>> ret i32 2520
>>> }
>>> 
>>> 
>>> IR now:
>>> 
>>> ; Function Attrs: nounwind readnone
>>> define i32 @main() #0 {
>>> entry:
>>> %call = tail call fastcc i32 @foo(i32 3)
>>> %call1 = tail call fastcc i32 @foo2(i32 4)
>>> %add = add nsw i32 %call1, %call
>>> ret i32 %add
>>> }
>>> 
>>> ; Function Attrs: nounwind readnone
>>> define internal fastcc i32 @foo(i32 %i) #0 {
>>> entry:
>>> br label %L1
>>> 
>>> L1:                                               ; preds = %entry
>>> ret i32 210
>>> }
>>> 
>>> ; Function Attrs: nounwind readnone
>>> define internal fastcc i32 @foo2(i32 %i) #0 {
>>> entry:
>>> br label %L1
>>> 
>>> L1:                                               ; preds = %entry
>>> ret i32 2310
>>> }
>>> 
>>> 
>>> On Jun 25, 2014, at 11:32 AM, Bob Wilson <bob.wilson at apple.com> wrote:
>>> 
>>> 
>>> On Jun 25, 2014, at 10:47 AM, Gerolf Hoflehner <ghoflehner at apple.com> wrote:
>>> 
>>> Hi
>>> 
>>> there have been reports on Ruby segfaults triggered by inlining functions
>>> with block label address taken like
>>> 
>>> 
>>> //Ruby code snippet
>>> vm_exec_core() {
>>> finish_insn_seq_0 = &&INSN_LABEL_finish;
>>> INSN_LABEL_finish:
>>> ;
>>> }
>>> 
>>> This kind of scenario can also happen when LLVM picks a subset of blocks for
>>> inlining,
>>> which is the case with the actual code in the Ruby environment.
>>> 
>>> Background:
>>> LLVM suppresses inlining for such functions when there is an indirect
>>> branch. The attached
>>> patch does so even when there is no indirect branch. Note that user code
>>> like above would not
>>> make much sense: using the global for jumping across function boundaries
>>> would be illegal.
>>> 
>>> Why is there a segfault:
>>> 
>>> In the snipped above the block with the label is recognized as dead. So it
>>> is eliminated. Instead
>>> of a block address the cloner stores a constant (sic!) into the global
>>> resulting in the segfault (when
>>> the global is used in a goto).
>>> 
>>> Why did it work in the past then:
>>> 
>>> By luck. In older versions vm_exec_core was also inlined but the label
>>> address used
>>> was the block label address in vm_exec_core. So the global jump ended up in
>>> the original function
>>> rather than in the caller which accidentally happened to work.
>>> 
>>> 
>>> -Gerolf
>>> 
>>> <0001-Suppress-inlining-when-the-block-address-is-taken.patch>
>>> 
>>> 
>>> The change is fine, but I still have more comments about comments….
>>> 
>>> Inlining function with block label address taken can cause
>>> many problem and requires a rich infrastructure to support
>>> including escape analysis. At this point the safest approach
>>> to address these problems is by blocking inlining from
>>> happening.
>>> 
>>> 
>>> Can you please put more of the context into the commit message? You’ve got
>>> some good background in the email above, and it would be useful to have that
>>> in the commit.
>>> 
>>> 
>>> rdar://17245966
>>> ---
>>> lib/Analysis/IPA/InlineCost.cpp        | 21 ++++++++++++++++++++-
>>> test/Transforms/Inline/blockaddress.ll |  5 +++--
>>> 2 files changed, 23 insertions(+), 3 deletions(-)
>>> 
>>> diff --git a/lib/Analysis/IPA/InlineCost.cpp
>>> b/lib/Analysis/IPA/InlineCost.cpp
>>> index 4536583..b079f46 100644
>>> --- a/lib/Analysis/IPA/InlineCost.cpp
>>> +++ b/lib/Analysis/IPA/InlineCost.cpp
>>> @@ -1097,6 +1097,16 @@ bool CallAnalyzer::analyzeCall(CallSite CS) {
>>>   BasicBlock *BB = BBWorklist[Idx];
>>>   if (BB->empty())
>>>     continue;
>>> +    // Disallow inlining of functions when the address of a label is
>>> +    // taken. Usually when a block address is taken there is also
>>> +    // an indirect branch which effectively suppresses inlining.
>>> +    // But when we partially inline and only take a subset of the blocks
>>> +    // into account there may not be an indirect branch. As cautionary
>>> +    // measure we suppress inlining when there is a block label address
>>> taken.
>>> +    // See also the comments to visitIndirectBrInst() and isInlineViable().
>>> +
>>> +    if (BB->hasAddressTaken())
>>> +      return false;
>>> 
>>> 
>>> Please remove the blank line between the comment and the code.
>>> 
>>> You should remove the comment sentence beginning with “As cautionary
>>> measure…”. It is not just a cautionary measure — it fixes a serious
>>> miscompile of Ruby.
>>> 
>>> I’m not sure it is necessary to cross-reference the comments in the other
>>> functions, but regardless, you should update the visitIndirectBrInst comment
>>> to remove this sentence: "And as a QOI issue, if someone is using a
>>> blockaddress without an indirectbr, and that reference somehow ends up in
>>> another function or global, we probably don't want to inline this function.”
>>> With your change here, that comment is no longer relevant.
>>> 
>>> 
>>>   // Analyze the cost of this block. If we blow through the threshold,
>>> this
>>>   // returns false, and we can bail on out.
>>> 
>>> @@ -1279,7 +1289,16 @@ bool InlineCostAnalysis::isInlineViable(Function &F)
>>> {
>>>                                  Attribute::ReturnsTwice);
>>> for (Function::iterator BI = F.begin(), BE = F.end(); BI != BE; ++BI) {
>>>   // Disallow inlining of functions which contain an indirect branch.
>>> -    if (isa<IndirectBrInst>(BI->getTerminator()))
>>> +    // Also disallow inlining of functions when the address of a label is
>>> +    // taken. Usually when a block address is taken there is also
>>> +    // an indirect branch which effectively suppresses inlining.
>>> +    // But when there is no indirect branch the user wrote a function
>>> +    // that takes the label address but does not jump to it within
>>> +    // the function. This does not make much sense and could point
>>> +    // to code that is not legal (eg. resulting in jump across function
>>> +    // boundaries). The check here at least prevents inlining of such code.
>>> +    // See also the comments to visitIndirectBrInst() and in analyzeCall().
>>> +    if (isa<IndirectBrInst>(BI->getTerminator()) || BI->hasAddressTaken())
>>>     return false;
>>> 
>>>   for (BasicBlock::iterator II = BI->begin(), IE = BI->end(); II != IE;
>>> 
>>> 
>>> This comment seems a little verbose to me. I would at least remove the
>>> cross-references to comments in other functions.
>>> 
>>> 
> 





More information about the llvm-commits mailing list