[llvm] r190309 - [ARMv8] Prevent generation of deprecated IT blocks on ARMv8 in Thumb mode.

Evan Cheng evan.cheng at apple.com
Wed Dec 11 12:03:24 PST 2013


On Dec 10, 2013, at 4:38 PM, Hal Finkel <hfinkel at anl.gov> wrote:

> ----- Original Message -----
>> From: "Evan Cheng" <evan.cheng at apple.com>
>> To: "Artyom Skrobov" <Artyom.Skrobov at arm.com>
>> Cc: "Jim Grosbach" <grosbach at apple.com>, "Hal Finkel" <hfinkel at anl.gov>, "llvm-commits" <llvm-commits at cs.uiuc.edu>,
>> "Bill Wendling" <isanbard at gmail.com>
>> Sent: Tuesday, December 10, 2013 6:18:14 PM
>> Subject: Re: [llvm] r190309 - [ARMv8] Prevent generation of deprecated IT blocks	on ARMv8 in Thumb mode.
>> 
>> 
>> 
>> 
>> On Dec 10, 2013, at 4:13 PM, Artyom Skrobov < Artyom.Skrobov at arm.com
>>> wrote:
>> 
>> 
>> Jim, Evan et al:
>> 
>> I want to remind that three weeks ago, I suggested a patch that
>> restores IfConversion.cpp to its pre-r190309 state, and reimplements
>> the change necessary for the IT blocks in ARMv8 in
>> Thumb2InstrInfo.cpp instead.
>> My patch is archived at
>> http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20131118/196278.html
>> , and Hal's regression test to go it with it, at
>> http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20131125/196852.html
>> However, committing this was delayed due to Hal's concern about
>> skipping over the unpredicated conditional branches, in the
>> pre-r190309 implementation , which may not always be valid.
>> 
>> 
>> Can you work towards resolving this? It seems clear (unless I
>> misunderstand) that r190309 is causing problems.
> 
> r190309 does trigger the bugs in self-hosting on PowerPC, and Artyom's fix does fix that problem without ARM performance regressions (I assume). If nothing else, we should use it. I'm concerned that the existing special case is also problematic (just even less likely to trigger a bug), and I was hoping to understand it better.
> 
> Given that, as far as I can tell, you wrote the code in question, can you explain the purpose of, in IfConverter::ScanInstructions:
> 
>    if (BBI.ClobbersPred && !isPredicated) {
>      // Predicate modification instruction should end the block (except for
>      // already predicated instructions and end of block branches).
>      if (isCondBr) {
>        // A conditional branch is not predicable, but it may be eliminated.
>        continue;
>      }
> 
> It looks like this is a refactored version (r37543) of the logic from r37518.


It has been a long time.  Let me see...

This has to do with nested if-conversion (wired-and). When there are nested ifs, we can run into situation where a block is already predicated but its predicate can be subsumed. In this case, it's possible to if-convert the BB. However, the BB has to be fully predicated. If the BB has a compare and follow by unpredicated instructions, then it's no legal to if-convert it. The only exception is the end of the BB conditional branches since they will be eliminated.

Evan

> 
> Thanks again,
> Hal
> 
>> 
>> 
>> Evan
>> 
>> 
>> 
>> 
>> This is why the solution isn't as straightforward as just reverting
>> r190309.
>> 
>> 
>> On Tue Dec 10 22:52:16 GMT 2013, "Evan Cheng" < evan.cheng at apple.com
>>> wrote:
>> 
>> 
>> 
>> 
>> 
>> On Dec 10, 2013, at 11:59 AM, Jim Grosbach < grosbach at apple.com >
>> wrote:
>> 
>> 
>> 
>> I find it extremely suspect that a patch to deprecate IT blocks for
>> ARM required changes to the generic IfConverter pass at all.
>> 
>> I agree. This is very suspicious.
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> The symptoms here seem very similar to the earlier problem I referred
>> to, which was r187071.
>> 
>> 
>> 
>> 
>> On Dec 10, 2013, at 11:45 AM, Hal Finkel < hfinkel at anl.gov > wrote:
>> 
>> 
>> 
>> Jim,
>> 
>> Thanks! If you have another minute, let me quickly summarize my
>> current understanding:
>> 
>> In IfConverter::ScanInstructions, pre-r190309, we had the following
>> special case:
>> 
>> if (BBI.ClobbersPred && !isPredicated) {
>> // Predicate modification instruction should end the block (except
>> for
>> // already predicated instructions and end of block branches).
>> if (isCondBr) {
>> // A conditional branch is not predicable, but it may be eliminated.
>> continue;
>> }
>> 
>> // Predicate may have been modified, the subsequent (currently)
>> // unpredicated instructions cannot be correctly predicated.
>> BBI.IsUnpredicable = true;
>> return;
>> }
>> 
>> ...
>> 
>> if (!TII->isPredicable(I)) {
>> BBI.IsUnpredicable = true;
>> return;
>> }
>> 
>> and so, for an unpredicated conditional branch, in a block which
>> clobbers the predicate, we would not check whether or not the branch
>> could be predicated under some assumption that it will be later
>> removed. I don't understand this, but as Artyom pointed out to me,
>> the relevant code is in IfConverter::IfConvertTriangle:
>> 
>> if (CvtBBI->BB->pred_size() > 1) {
>> ...
>> } else {
>> // Predicate the 'true' block after removing its branch.
>> CvtBBI->NonPredSize -= TII->RemoveBranch(*CvtBBI->BB);
>> PredicateBlock(*CvtBBI, CvtBBI->BB->end(), Cond, Redefs);
>> 
>> // Now merge the entry of the triangle with the true block.
>> BBI.NonPredSize -= TII->RemoveBranch(*BBI.BB);
>> MergeBlocks(BBI, *CvtBBI, false);
>> }
>> 
>> then in r190309 this special case was expanded to include *all*
>> conditional branches (whether or not they are in
>> predicate-clobbering blocks). This is what triggers the self-hosting
>> failure on PowerPC. For example, we now end up with code like this:
>> 
>> 
>> 
>> 
>> That change sounds really questionable, even for ARM.
>> 
>> The fact that r190309 required changes to IfConverter.cpp is very
>> questionable. The change which had to do with conditional branches
>> seems suspicious. Is reverting the IfConverter.cpp part of r190309
>> the right solution for now?
>> 
>> 
>> Evan
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> ...
>> .LBB0_6: # %if.end7.i37
>> lbz 4, 0(3)
>> li 30, 1
>> cmplwi 0, 4, 0
>> beq 0, .LBB0_15
>> # BB#7: # %if.then9.i39
>> bne 1, .LBB0_14
>> b .LBB0_15
>> .LBB0_8: # %if.end
>> ...
>> 
>> BB6
>> | \
>> | BB7
>> | / \
>> | / \
>> BB15 BB14
>> 
>> being transformed into:
>> 
>> ...
>> .LBB0_6: # %if.end7.i37
>> lbz 4, 0(3)
>> li 30, 1
>> cmplwi 0, 4, 0
>> bne 1, .LBB0_13
>> b .LBB0_14
>> .LBB0_7: # %if.end
>> ...
>> 
>> (adjusting for the later block numbering change)
>> BB6
>> | \
>> | \
>> BB15 BB14
>> 
>> but without combining the conditions from BB6 and BB7 (but instead
>> just eliminated one). Now if the "bne 1, .LBB0_14" in BB7 were
>> actually predicated on the "(eq, cr0)" value, then this would all
>> make sense (but it is not, and can't be on this architecture).
>> TII->isPredicable on this instruction returns false (which I think
>> should prevent this problem in the first place, but does not,
>> because of this special case for conditional branches).
>> 
>> 
>> 
>> Ouch. That’s really nasty. :(
>> 
>> 
>> 
>> 
>> My fear is that the special case pre-r190309 was also wrong for
>> PowerPC, and that by restoring it, we're just reinstating an
>> even-more-subtle bug for non-ARM backends.
>> 
>> 
>> 
>> That’s possible. I don’t know enough about the PPC details to know.
>> The theory is that the TII->SubsumesPredicate() check can help
>> disambiguate strange cases like this, but that’s generally for
>> non-branch instructions.
>> 
>> 
>> -Jim
>> 
>> 
>> 
>> 
>> -Hal
>> 
>> ----- Original Message -----
>> 
>> 
>> From: "Jim Grosbach" < grosbach at apple.com >
>> To: "Hal Finkel" < hfinkel at anl.gov >
>> Cc: "llvm-commits" < llvm-commits at cs.uiuc.edu >, "Bill Wendling" <
>> isanbard at gmail.com >, "Artyom Skrobov"
>> < Artyom.Skrobov at arm.com >
>> Sent: Tuesday, December 10, 2013 12:47:23 PM
>> Subject: Re: [llvm] r190309 - [ARMv8] Prevent generation of
>> deprecated IT blocks on ARMv8 in Thumb mode.
>> 
>> Hi Hal,
>> 
>> I’m not very familiar with the patch this thread is in response to,
>> so I can’t comment very effectively on that aspect of things
>> directly.
>> 
>> Your comment below about multiple branch instructions at the end of a
>> basic block, however, is correct. The IfConversion pass needs to
>> handle those sorts of cases, including degenerate cases where
>> AnalyzeBranch() fails (by punting and not doing if-conversion of
>> that).
>> 
>> There are helper functions in the ARM backend (and I assume others)
>> which deal with getting the inverted condition code for
>> if-converting if/else constructs. Perhaps there’s something wrong
>> there either in the general pass or the PPC backend? I seem to
>> recall there was a nasty bug in the ARM version a while back that
>> was painful to track down.
>> 
>> -Jim
>> 
>> On Dec 10, 2013, at 10:24 AM, Hal Finkel < hfinkel at anl.gov > wrote:
>> 
>> 
>> 
>> Jim, as you probably saw, Bill recommended you as someone who could
>> help clear up the confusion here. I would appreciate your input; I
>> need to get this cleared up this week (given our current release
>> schedule).
>> 
>> Thanks in advance,
>> Hal
>> 
>> ----- Original Message -----
>> 
>> 
>> From: "Hal Finkel" < hfinkel at anl.gov >
>> To: "Artyom Skrobov" < Artyom.Skrobov at arm.com >
>> Cc: llvm-commits at cs.uiuc.edu
>> Sent: Tuesday, December 3, 2013 11:02:29 AM
>> Subject: Re: [llvm] r190309 - [ARMv8] Prevent generation of
>> deprecated IT blocks on ARMv8 in Thumb mode.
>> 
>> ----- Original Message -----
>> 
>> 
>> From: "Artyom Skrobov" < Artyom.Skrobov at arm.com >
>> To: "Hal Finkel" < hfinkel at anl.gov >
>> Cc: "Joey Gouly" < Joey.Gouly at arm.com >, llvm-commits at cs.uiuc.edu ,
>> "Tim Northover" < t.p.northover at gmail.com >, "Evan
>> Cheng" < evan.cheng at apple.com >
>> Sent: Monday, December 2, 2013 12:23:27 PM
>> Subject: RE: [llvm] r190309 - [ARMv8] Prevent generation of
>> deprecated IT blocks on ARMv8 in Thumb mode.
>> 
>> Hal, could you please clarify why the PPC backend produces
>> BasicBlocks with several branch instructions sharing the same BB?
>> 
>> The doxygen comments in LLVM imply that a BB is a "single entry,
>> single exit" sequence, and any branch instruction in a BB must
>> terminate it.
>> 
>> That seems mostly right, but a block can have multiple terminators
>> (if one is conditional and one is unconditional), and the
>> unconditional branch may be elided if its destination is the next
>> block in layout ordering. Taking a quick look at AnalyzeBranch in
>> the various backends, I believe that this is true also on X86,
>> Hexagon, Sparc, SystemZ, etc. This allows forming a TBB/FBB pair
>> using two consecutive branches.
>> 
>> Is this somehow incompatible with the design of the if conversion
>> pass?
>> 
>> Thanks again,
>> Hal
>> 
>> 
>> 
>> 
>> As you can see in the code examples I posted earlier, this is
>> indeed
>> the case with the ARM backend.
>> 
>> 
>> -----Original Message-----
>> From: Hal Finkel [ mailto:hfinkel at anl.gov ]
>> Sent: 28 November 2013 19:50
>> To: Artyom Skrobov
>> Cc: Joey Gouly; llvm-commits at cs.uiuc.edu ; Tim Northover; Evan
>> Cheng
>> Subject: Re: [llvm] r190309 - [ARMv8] Prevent generation of
>> deprecated IT blocks on ARMv8 in Thumb mode.
>> 
>> 
>> Thanks for the explanation! I still don't understand, however,
>> under
>> what set of assumptions this is generally valid. In the test case
>> showing the problem with the PPC backend, we have:
>> 
>> ...
>> .LBB0_6: # %if.end7.i37
>> lbz 4, 0(3)
>> li 30, 1
>> cmplwi 0, 4, 0
>> beq 0, .LBB0_15
>> # BB#7: # %if.then9.i39
>> bne 1, .LBB0_14
>> b .LBB0_15
>> .LBB0_8: # %if.end
>> ...
>> 
>> BB6
>> | \
>> | BB7
>> | / \
>> | / \
>> BB15 BB14
>> 
>> being transformed into:
>> 
>> ...
>> .LBB0_6: # %if.end7.i37
>> lbz 4, 0(3)
>> li 30, 1
>> cmplwi 0, 4, 0
>> bne 1, .LBB0_13
>> b .LBB0_14
>> .LBB0_7: # %if.end
>> ...
>> 
>> (adjusting for the later block numbering change)
>> BB6
>> | \
>> | \
>> BB15 BB14
>> 
>> but it did so without combining the conditions from BB6 and BB7
>> (but
>> instead just eliminated one).
>> 
>> I don't understand how this could be generally valid. It looks
>> like
>> what is happening is that, in the (current) broken state,
>> ScanInstructions returns true on BB7 (because it can predicate
>> the
>> unconditional branch, and it skips checking the conditional
>> branch),
>> and then tries to predicate BB7. BB7 has nothing in it, however,
>> except for its branches, and so nothing predicates. Now if the
>> "bne
>> 1, .LBB0_14" in BB7 were actually predicated on the "(eq, cr0)"
>> value, then this would all make sense (but it is not, and can't
>> be
>> on this architecture). FWIW, PPCInstrInfo::PredicateInstruction
>> is
>> never called on the "bne 1, .LBB0_14" instruction (if it were,
>> then
>> this would assert instead of being a misompile).
>> 
>> Artyom, can you explain how the ARM equivalent of this leads to
>> correct code?
>> 
>> Evan, as I believe that you wrote the original code in question
>> --
>> 6.5 years ago ;) -- perhaps you'll be able to lend some insight
>> here?
>> 
>> Thanks again,
>> Hal
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> --
>> Hal Finkel
>> Assistant Computational Scientist
>> Leadership Computing Facility
>> Argonne National Laboratory
>> _______________________________________________
>> llvm-commits mailing list
>> llvm-commits at cs.uiuc.edu
>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>> 
>> 
>> --
>> Hal Finkel
>> Assistant Computational Scientist
>> Leadership Computing Facility
>> Argonne National Laboratory
>> 
>> 
>> 
>> --
>> Hal Finkel
>> Assistant Computational Scientist
>> Leadership Computing Facility
>> Argonne National Laboratory
>> 
>> 
>> 
> 
> -- 
> Hal Finkel
> Assistant Computational Scientist
> Leadership Computing Facility
> Argonne National Laboratory





More information about the llvm-commits mailing list