[llvm] r190309 - [ARMv8] Prevent generation of deprecated IT blocks on ARMv8 in Thumb mode.

Hal Finkel hfinkel at anl.gov
Tue Dec 10 11:45:03 PST 2013


Jim,

Thanks! If you have another minute, let me quickly summarize my current understanding:

In IfConverter::ScanInstructions, pre-r190309, we had the following special case:

    if (BBI.ClobbersPred && !isPredicated) {
      // Predicate modification instruction should end the block (except for
      // already predicated instructions and end of block branches).
      if (isCondBr) {
        // A conditional branch is not predicable, but it may be eliminated.
        continue;
      }

      // Predicate may have been modified, the subsequent (currently)
      // unpredicated instructions cannot be correctly predicated.
      BBI.IsUnpredicable = true;
      return;
    }

    ...

    if (!TII->isPredicable(I)) {
      BBI.IsUnpredicable = true;
      return;
    }

and so, for an unpredicated conditional branch, in a block which clobbers the predicate, we would not check whether or not the branch could be predicated under some assumption that it will be later removed. I don't understand this, but as Artyom pointed out to me, the relevant code is in IfConverter::IfConvertTriangle:

  if (CvtBBI->BB->pred_size() > 1) {
    ...
  } else {
    // Predicate the 'true' block after removing its branch.
    CvtBBI->NonPredSize -= TII->RemoveBranch(*CvtBBI->BB);
    PredicateBlock(*CvtBBI, CvtBBI->BB->end(), Cond, Redefs);

    // Now merge the entry of the triangle with the true block.
    BBI.NonPredSize -= TII->RemoveBranch(*BBI.BB);
    MergeBlocks(BBI, *CvtBBI, false);
  }

then in r190309 this special case was expanded to include *all* conditional branches (whether or not they are in predicate-clobbering blocks). This is what triggers the self-hosting failure on PowerPC. For example, we now end up with code like this:

        ...
.LBB0_6:                                # %if.end7.i37
        lbz 4, 0(3)
        li 30, 1
        cmplwi 0, 4, 0
        beq 0, .LBB0_15
# BB#7:                                 # %if.then9.i39
        bne 1, .LBB0_14
        b .LBB0_15
.LBB0_8:                                # %if.end
        ...

    BB6
   |  \
   |  BB7
   |   / \
   |  /   \
  BB15    BB14

being transformed into:

        ...
.LBB0_6:                                # %if.end7.i37
        lbz 4, 0(3)
        li 30, 1
        cmplwi 0, 4, 0
        bne 1, .LBB0_13
        b .LBB0_14
.LBB0_7:                                # %if.end
        ...

(adjusting for the later block numbering change)
    BB6
   | \
   |  \
  BB15 BB14

but without combining the conditions from BB6 and BB7 (but instead just eliminated one). Now if the "bne 1, .LBB0_14" in BB7 were actually predicated on the "(eq, cr0)" value, then this would all make sense (but it is not, and can't be on this architecture). TII->isPredicable on this instruction returns false (which I think should prevent this problem in the first place, but does not, because of this special case for conditional branches).

My fear is that the special case pre-r190309 was also wrong for PowerPC, and that by restoring it, we're just reinstating an even-more-subtle bug for non-ARM backends.

 -Hal

----- Original Message -----
> From: "Jim Grosbach" <grosbach at apple.com>
> To: "Hal Finkel" <hfinkel at anl.gov>
> Cc: "llvm-commits" <llvm-commits at cs.uiuc.edu>, "Bill Wendling" <isanbard at gmail.com>, "Artyom Skrobov"
> <Artyom.Skrobov at arm.com>
> Sent: Tuesday, December 10, 2013 12:47:23 PM
> Subject: Re: [llvm] r190309 - [ARMv8] Prevent generation of deprecated IT blocks	on ARMv8 in Thumb mode.
> 
> Hi Hal,
> 
> I’m not very familiar with the patch this thread is in response to,
> so I can’t comment very effectively on that aspect of things
> directly.
> 
> Your comment below about multiple branch instructions at the end of a
> basic block, however, is correct. The IfConversion pass needs to
> handle those sorts of cases, including degenerate cases where
> AnalyzeBranch() fails (by punting and not doing if-conversion of
> that).
> 
> There are helper functions in the ARM backend (and I assume others)
> which deal with getting the inverted condition code for
> if-converting if/else constructs. Perhaps there’s something wrong
>   there either in the general pass or the PPC backend? I seem to
> recall there was a nasty bug in the ARM version a while back that
> was painful to track down.
> 
> -Jim
> 
> On Dec 10, 2013, at 10:24 AM, Hal Finkel <hfinkel at anl.gov> wrote:
> 
> > Jim, as you probably saw, Bill recommended you as someone who could
> > help clear up the confusion here. I would appreciate your input; I
> > need to get this cleared up this week (given our current release
> > schedule).
> > 
> > Thanks in advance,
> > Hal
> > 
> > ----- Original Message -----
> >> From: "Hal Finkel" <hfinkel at anl.gov>
> >> To: "Artyom Skrobov" <Artyom.Skrobov at arm.com>
> >> Cc: llvm-commits at cs.uiuc.edu
> >> Sent: Tuesday, December 3, 2013 11:02:29 AM
> >> Subject: Re: [llvm] r190309 - [ARMv8] Prevent generation of
> >> deprecated IT	blocks	on ARMv8 in Thumb mode.
> >> 
> >> ----- Original Message -----
> >>> From: "Artyom Skrobov" <Artyom.Skrobov at arm.com>
> >>> To: "Hal Finkel" <hfinkel at anl.gov>
> >>> Cc: "Joey Gouly" <Joey.Gouly at arm.com>, llvm-commits at cs.uiuc.edu,
> >>> "Tim Northover" <t.p.northover at gmail.com>, "Evan
> >>> Cheng" <evan.cheng at apple.com>
> >>> Sent: Monday, December 2, 2013 12:23:27 PM
> >>> Subject: RE: [llvm] r190309 - [ARMv8] Prevent generation of
> >>> deprecated IT	blocks	on ARMv8 in Thumb mode.
> >>> 
> >>> Hal, could you please clarify why the PPC backend produces
> >>> BasicBlocks with several branch instructions sharing the same BB?
> >>> 
> >>> The doxygen comments in LLVM imply that a BB is a "single entry,
> >>> single exit" sequence, and any branch instruction in a BB must
> >>> terminate it.
> >> 
> >> That seems mostly right, but a block can have multiple terminators
> >> (if one is conditional and one is unconditional), and the
> >> unconditional branch may be elided if its destination is the next
> >> block in layout ordering. Taking a quick look at AnalyzeBranch in
> >> the various backends, I believe that this is true also on X86,
> >> Hexagon, Sparc, SystemZ, etc. This allows forming a TBB/FBB pair
> >> using two consecutive branches.
> >> 
> >> Is this somehow incompatible with the design of the if conversion
> >> pass?
> >> 
> >> Thanks again,
> >> Hal
> >> 
> >>> 
> >>> As you can see in the code examples I posted earlier, this is
> >>> indeed
> >>> the case with the ARM backend.
> >>> 
> >>> 
> >>> -----Original Message-----
> >>> From: Hal Finkel [mailto:hfinkel at anl.gov]
> >>> Sent: 28 November 2013 19:50
> >>> To: Artyom Skrobov
> >>> Cc: Joey Gouly; llvm-commits at cs.uiuc.edu; Tim Northover; Evan
> >>> Cheng
> >>> Subject: Re: [llvm] r190309 - [ARMv8] Prevent generation of
> >>> deprecated IT blocks on ARMv8 in Thumb mode.
> >>> 
> >>> 
> >>> Thanks for the explanation! I still don't understand, however,
> >>> under
> >>> what set of assumptions this is generally valid. In the test case
> >>> showing the problem with the PPC backend, we have:
> >>> 
> >>>        ...
> >>> .LBB0_6:                                # %if.end7.i37
> >>>        lbz 4, 0(3)
> >>>        li 30, 1
> >>>        cmplwi 0, 4, 0
> >>>        beq 0, .LBB0_15
> >>> # BB#7:                                 # %if.then9.i39
> >>>        bne 1, .LBB0_14
> >>>        b .LBB0_15
> >>> .LBB0_8:                                # %if.end
> >>>        ...
> >>> 
> >>>    BB6
> >>>   |  \
> >>>   |  BB7
> >>>   |   / \
> >>>   |  /   \
> >>>  BB15    BB14
> >>> 
> >>> being transformed into:
> >>> 
> >>>        ...
> >>> .LBB0_6:                                # %if.end7.i37
> >>>        lbz 4, 0(3)
> >>>        li 30, 1
> >>>        cmplwi 0, 4, 0
> >>>        bne 1, .LBB0_13
> >>>        b .LBB0_14
> >>> .LBB0_7:                                # %if.end
> >>>        ...
> >>> 
> >>> (adjusting for the later block numbering change)
> >>>    BB6
> >>>   | \
> >>>   |  \
> >>>  BB15 BB14
> >>> 
> >>> but it did so without combining the conditions from BB6 and BB7
> >>> (but
> >>> instead just eliminated one).
> >>> 
> >>> I don't understand how this could be generally valid. It looks
> >>> like
> >>> what is happening is that, in the (current) broken state,
> >>> ScanInstructions returns true on BB7 (because it can predicate
> >>> the
> >>> unconditional branch, and it skips checking the conditional
> >>> branch),
> >>> and then tries to predicate BB7. BB7 has nothing in it, however,
> >>> except for its branches, and so nothing predicates. Now if the
> >>> "bne
> >>> 1, .LBB0_14" in BB7 were actually predicated on the "(eq, cr0)"
> >>> value, then this would all make sense (but it is not, and can't
> >>> be
> >>> on this architecture). FWIW, PPCInstrInfo::PredicateInstruction
> >>> is
> >>> never called on the "bne 1, .LBB0_14" instruction (if it were,
> >>> then
> >>> this would assert instead of being a misompile).
> >>> 
> >>> Artyom, can you explain how the ARM equivalent of this leads to
> >>> correct code?
> >>> 
> >>> Evan, as I believe that you wrote the original code in question
> >>> --
> >>> 6.5 years ago ;) -- perhaps you'll be able to lend some insight
> >>> here?
> >>> 
> >>> Thanks again,
> >>> Hal
> >>> 
> >>> 
> >>> 
> >>> 
> >>> 
> >>> 
> >> 
> >> --
> >> Hal Finkel
> >> Assistant Computational Scientist
> >> Leadership Computing Facility
> >> Argonne National Laboratory
> >> _______________________________________________
> >> llvm-commits mailing list
> >> llvm-commits at cs.uiuc.edu
> >> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
> >> 
> > 
> > --
> > Hal Finkel
> > Assistant Computational Scientist
> > Leadership Computing Facility
> > Argonne National Laboratory
> 
> 

-- 
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory




More information about the llvm-commits mailing list