[llvm] r190309 - [ARMv8] Prevent generation of deprecated IT blocks on ARMv8 in Thumb mode.
Hal Finkel
hfinkel at anl.gov
Tue Dec 10 11:45:03 PST 2013
Jim,
Thanks! If you have another minute, let me quickly summarize my current understanding:
In IfConverter::ScanInstructions, pre-r190309, we had the following special case:
if (BBI.ClobbersPred && !isPredicated) {
// Predicate modification instruction should end the block (except for
// already predicated instructions and end of block branches).
if (isCondBr) {
// A conditional branch is not predicable, but it may be eliminated.
continue;
}
// Predicate may have been modified, the subsequent (currently)
// unpredicated instructions cannot be correctly predicated.
BBI.IsUnpredicable = true;
return;
}
...
if (!TII->isPredicable(I)) {
BBI.IsUnpredicable = true;
return;
}
and so, for an unpredicated conditional branch, in a block which clobbers the predicate, we would not check whether or not the branch could be predicated under some assumption that it will be later removed. I don't understand this, but as Artyom pointed out to me, the relevant code is in IfConverter::IfConvertTriangle:
if (CvtBBI->BB->pred_size() > 1) {
...
} else {
// Predicate the 'true' block after removing its branch.
CvtBBI->NonPredSize -= TII->RemoveBranch(*CvtBBI->BB);
PredicateBlock(*CvtBBI, CvtBBI->BB->end(), Cond, Redefs);
// Now merge the entry of the triangle with the true block.
BBI.NonPredSize -= TII->RemoveBranch(*BBI.BB);
MergeBlocks(BBI, *CvtBBI, false);
}
then in r190309 this special case was expanded to include *all* conditional branches (whether or not they are in predicate-clobbering blocks). This is what triggers the self-hosting failure on PowerPC. For example, we now end up with code like this:
...
.LBB0_6: # %if.end7.i37
lbz 4, 0(3)
li 30, 1
cmplwi 0, 4, 0
beq 0, .LBB0_15
# BB#7: # %if.then9.i39
bne 1, .LBB0_14
b .LBB0_15
.LBB0_8: # %if.end
...
BB6
| \
| BB7
| / \
| / \
BB15 BB14
being transformed into:
...
.LBB0_6: # %if.end7.i37
lbz 4, 0(3)
li 30, 1
cmplwi 0, 4, 0
bne 1, .LBB0_13
b .LBB0_14
.LBB0_7: # %if.end
...
(adjusting for the later block numbering change)
BB6
| \
| \
BB15 BB14
but without combining the conditions from BB6 and BB7 (but instead just eliminated one). Now if the "bne 1, .LBB0_14" in BB7 were actually predicated on the "(eq, cr0)" value, then this would all make sense (but it is not, and can't be on this architecture). TII->isPredicable on this instruction returns false (which I think should prevent this problem in the first place, but does not, because of this special case for conditional branches).
My fear is that the special case pre-r190309 was also wrong for PowerPC, and that by restoring it, we're just reinstating an even-more-subtle bug for non-ARM backends.
-Hal
----- Original Message -----
> From: "Jim Grosbach" <grosbach at apple.com>
> To: "Hal Finkel" <hfinkel at anl.gov>
> Cc: "llvm-commits" <llvm-commits at cs.uiuc.edu>, "Bill Wendling" <isanbard at gmail.com>, "Artyom Skrobov"
> <Artyom.Skrobov at arm.com>
> Sent: Tuesday, December 10, 2013 12:47:23 PM
> Subject: Re: [llvm] r190309 - [ARMv8] Prevent generation of deprecated IT blocks on ARMv8 in Thumb mode.
>
> Hi Hal,
>
> I’m not very familiar with the patch this thread is in response to,
> so I can’t comment very effectively on that aspect of things
> directly.
>
> Your comment below about multiple branch instructions at the end of a
> basic block, however, is correct. The IfConversion pass needs to
> handle those sorts of cases, including degenerate cases where
> AnalyzeBranch() fails (by punting and not doing if-conversion of
> that).
>
> There are helper functions in the ARM backend (and I assume others)
> which deal with getting the inverted condition code for
> if-converting if/else constructs. Perhaps there’s something wrong
> there either in the general pass or the PPC backend? I seem to
> recall there was a nasty bug in the ARM version a while back that
> was painful to track down.
>
> -Jim
>
> On Dec 10, 2013, at 10:24 AM, Hal Finkel <hfinkel at anl.gov> wrote:
>
> > Jim, as you probably saw, Bill recommended you as someone who could
> > help clear up the confusion here. I would appreciate your input; I
> > need to get this cleared up this week (given our current release
> > schedule).
> >
> > Thanks in advance,
> > Hal
> >
> > ----- Original Message -----
> >> From: "Hal Finkel" <hfinkel at anl.gov>
> >> To: "Artyom Skrobov" <Artyom.Skrobov at arm.com>
> >> Cc: llvm-commits at cs.uiuc.edu
> >> Sent: Tuesday, December 3, 2013 11:02:29 AM
> >> Subject: Re: [llvm] r190309 - [ARMv8] Prevent generation of
> >> deprecated IT blocks on ARMv8 in Thumb mode.
> >>
> >> ----- Original Message -----
> >>> From: "Artyom Skrobov" <Artyom.Skrobov at arm.com>
> >>> To: "Hal Finkel" <hfinkel at anl.gov>
> >>> Cc: "Joey Gouly" <Joey.Gouly at arm.com>, llvm-commits at cs.uiuc.edu,
> >>> "Tim Northover" <t.p.northover at gmail.com>, "Evan
> >>> Cheng" <evan.cheng at apple.com>
> >>> Sent: Monday, December 2, 2013 12:23:27 PM
> >>> Subject: RE: [llvm] r190309 - [ARMv8] Prevent generation of
> >>> deprecated IT blocks on ARMv8 in Thumb mode.
> >>>
> >>> Hal, could you please clarify why the PPC backend produces
> >>> BasicBlocks with several branch instructions sharing the same BB?
> >>>
> >>> The doxygen comments in LLVM imply that a BB is a "single entry,
> >>> single exit" sequence, and any branch instruction in a BB must
> >>> terminate it.
> >>
> >> That seems mostly right, but a block can have multiple terminators
> >> (if one is conditional and one is unconditional), and the
> >> unconditional branch may be elided if its destination is the next
> >> block in layout ordering. Taking a quick look at AnalyzeBranch in
> >> the various backends, I believe that this is true also on X86,
> >> Hexagon, Sparc, SystemZ, etc. This allows forming a TBB/FBB pair
> >> using two consecutive branches.
> >>
> >> Is this somehow incompatible with the design of the if conversion
> >> pass?
> >>
> >> Thanks again,
> >> Hal
> >>
> >>>
> >>> As you can see in the code examples I posted earlier, this is
> >>> indeed
> >>> the case with the ARM backend.
> >>>
> >>>
> >>> -----Original Message-----
> >>> From: Hal Finkel [mailto:hfinkel at anl.gov]
> >>> Sent: 28 November 2013 19:50
> >>> To: Artyom Skrobov
> >>> Cc: Joey Gouly; llvm-commits at cs.uiuc.edu; Tim Northover; Evan
> >>> Cheng
> >>> Subject: Re: [llvm] r190309 - [ARMv8] Prevent generation of
> >>> deprecated IT blocks on ARMv8 in Thumb mode.
> >>>
> >>>
> >>> Thanks for the explanation! I still don't understand, however,
> >>> under
> >>> what set of assumptions this is generally valid. In the test case
> >>> showing the problem with the PPC backend, we have:
> >>>
> >>> ...
> >>> .LBB0_6: # %if.end7.i37
> >>> lbz 4, 0(3)
> >>> li 30, 1
> >>> cmplwi 0, 4, 0
> >>> beq 0, .LBB0_15
> >>> # BB#7: # %if.then9.i39
> >>> bne 1, .LBB0_14
> >>> b .LBB0_15
> >>> .LBB0_8: # %if.end
> >>> ...
> >>>
> >>> BB6
> >>> | \
> >>> | BB7
> >>> | / \
> >>> | / \
> >>> BB15 BB14
> >>>
> >>> being transformed into:
> >>>
> >>> ...
> >>> .LBB0_6: # %if.end7.i37
> >>> lbz 4, 0(3)
> >>> li 30, 1
> >>> cmplwi 0, 4, 0
> >>> bne 1, .LBB0_13
> >>> b .LBB0_14
> >>> .LBB0_7: # %if.end
> >>> ...
> >>>
> >>> (adjusting for the later block numbering change)
> >>> BB6
> >>> | \
> >>> | \
> >>> BB15 BB14
> >>>
> >>> but it did so without combining the conditions from BB6 and BB7
> >>> (but
> >>> instead just eliminated one).
> >>>
> >>> I don't understand how this could be generally valid. It looks
> >>> like
> >>> what is happening is that, in the (current) broken state,
> >>> ScanInstructions returns true on BB7 (because it can predicate
> >>> the
> >>> unconditional branch, and it skips checking the conditional
> >>> branch),
> >>> and then tries to predicate BB7. BB7 has nothing in it, however,
> >>> except for its branches, and so nothing predicates. Now if the
> >>> "bne
> >>> 1, .LBB0_14" in BB7 were actually predicated on the "(eq, cr0)"
> >>> value, then this would all make sense (but it is not, and can't
> >>> be
> >>> on this architecture). FWIW, PPCInstrInfo::PredicateInstruction
> >>> is
> >>> never called on the "bne 1, .LBB0_14" instruction (if it were,
> >>> then
> >>> this would assert instead of being a misompile).
> >>>
> >>> Artyom, can you explain how the ARM equivalent of this leads to
> >>> correct code?
> >>>
> >>> Evan, as I believe that you wrote the original code in question
> >>> --
> >>> 6.5 years ago ;) -- perhaps you'll be able to lend some insight
> >>> here?
> >>>
> >>> Thanks again,
> >>> Hal
> >>>
> >>>
> >>>
> >>>
> >>>
> >>>
> >>
> >> --
> >> Hal Finkel
> >> Assistant Computational Scientist
> >> Leadership Computing Facility
> >> Argonne National Laboratory
> >> _______________________________________________
> >> llvm-commits mailing list
> >> llvm-commits at cs.uiuc.edu
> >> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
> >>
> >
> > --
> > Hal Finkel
> > Assistant Computational Scientist
> > Leadership Computing Facility
> > Argonne National Laboratory
>
>
--
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory
More information about the llvm-commits
mailing list