[PATCH] D102817: [Windows SEH]: HARDWARE EXCEPTION HANDLING (MSVC -EHa) - Part 2

Mon Apr 25 17:29:29 PDT 2022

tentzen added inline comments.

================
Comment at: llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp:1316
+        //  Div with variable opnds won't be the first instruction in
+        //  an EH region as it must be led by at least a Load
+        {
----------------
efriedma wrote:
> tentzen wrote:
> > efriedma wrote:
> > > If the issue here that the "end" label for a region can accidentally include the following instruction after the region?  Would it make sense to explicitly insert a nop earlier, in that case?
> > I don't see the value of emitting it earlier. I think the closer to emission is safer. Or there is always a risk of being changed or reordered by other phases. Besides, there are a couple of similar cases of emitting Nop in that file (like functionPrfix). I think inserting the nop where the EH_Label is emitted is safest and most straight-forward.
> Could you at least clarify the comment to clarify what aspect of the SEH region is problematic?  We currently use EH_LABEL for both the beginning and end of a region, and it isn't clear which one this is meant to apply to.
It's for both.
EH_LABEL marks the end of the previous region and the beginning of the following region. The Nop is added to prevent a potential-faulty-instruction from being included in the previous region and to ensure this potential faulty instruction still in the present region.
Yes, I will add more comments.

================
Comment at: llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp:2916
+      if (CleanupMBB) // a block referenced by EH table
+        CleanupMBB->setHasAddressTaken(); // so dtor-funclet not removed by opts
       break;
----------------
efriedma wrote:
> tentzen wrote:
> > efriedma wrote:
> > > I assume this is trying to trick optimizations into avoiding optimizations on the cleanup block if the __try involves multiple basic blocks.
> > > 
> > > Have you considered trying to fix the control flow graph?  I mean, I understand the difficulty of expressing the control flow at the IR level, given we don't have IR versions of "load" and "store" with an unwind edge, but those concerns don't really apply in the same way in a MachineFunction.  You can mark the cleanup block as a successor of any block that can throw an exception, without explicitly marking up the control flow on each individual instruction.  (This is how exceptions normally work.)
> > The Cleanup block is turned into a dtor funclet at the end. And the beginning address of that funclet will be recorded in EH table in .xdata section. So it's really address-taken by definition. If you verify this by checking Linker relocation records or the EH table in .asm file.  IMO this flag should be set regardless of the SEH feature.
> That's not what "address-taken" is supposed to mean.
> 
> It's supposed to be equivalent to BasicBlock::hasAddressTaken, which means that the block is the target of a blockaddress expression (used for indirectbr).  Primarily, this restricts certain optimizations because in general, we can't split a critical edge from an indirectbr.
> 
> hasAddressTaken() is not a general "will the address of the block eventually be emitted somewhere in the assembly".  That can happen for a variety of unrelated reasons, including exception handling, debug info, jump tables, and branch lowering.
I'm not sure about your interpretation.  the comment of this flag from the header is:

  /// Returns true if there are any uses of this basic block other than
  /// direct branches, switches, etc. to it.
  bool hasAddressTaken() const {
    return getBasicBlockBits().BlockAddressRefCount != 0;
  }
it does include "switch" which I think refers to jump-table.
Besides, the dtor-funclet in EH table is invoked via indirectbr/indirect-call.  Does not that also meet your interpretation?

================
Comment at: llvm/lib/CodeGen/WinEHPrepare.cpp:267-269
+//   Side exits can ONLY jump into parent scopes (lower state number).
+//   Thus, when a block succeeds various states from its predecessors,
+//     the lowest State triumphs others.
----------------
JosephTremoulet wrote:
> tentzen wrote:
> > JosephTremoulet wrote:
> > > tentzen wrote:
> > > > JosephTremoulet wrote:
> > > > > I think you're saying here that it's legal for a side exit to target a successor with any ancestor state.  But also you're assigning each such successor the exact parent state of its least predecessor ("least" by comparing state numbers).  How do you know that it shouldn't be the grandparent (or any other ancestor) of the least predecessor?
> > > > the new state only assigned to successor when it's smaller than successor's current state.  See the first line in the while loop. 
> > > > if ( ... && EHInfo.BlockToStateMap[BB] <= State) ..
> > > Yes I see that you're taking the state of the least connected predecessor.  What I'm not following is how you know that the least connected predecessor is a sibling, rather than say a neice -- what if the sibling ends in unreached, for example?  Something like:
> > > 
> > > ```
> > > B1: (state = 0
> > > { // new scope
> > > B2: (state = 1)
> > >   { // new scope
> > >   B2: (state = 2)
> > >      { // new scope
> > >      B3: (state = 3)
> > >         if (_) {
> > >         B4:
> > >             goto B7;
> > >         }
> > >         B5: (state = 3)
> > >         __assume(0);
> > >      } // exit scope
> > >      B6: (state TBD)
> > >   } // exit scope
> > >  B7: (state TBD)
> > > } // exit scope
> > > B8: (state TBD)
> > > ```
> > > 
> > > And maybe optimization turned __assume(0) into unreachable and got rid of B6 entirely.  So then B4 is the only predecessor of B7.  Doesn't that mean we'll assign state 2 to B7?  But B7 should have state 1.  How do we know this doesn't happen?
> > > 
> > > 
> > >   
> > Very good point. In this case, B7 actually is a side-entry that is led by a block of seh_scope_begin() intrinsic (see CpodeGen/CGStmt.cpp change in Part-1 patch).  As such, the state will be reset via EHInfo.InvokeStateMap[] when real code blocks are proceeded. 
> I apologize for the ridiculous delay in responding, until just now I was under the mistaken impression that you hadn't replied.
> 
> I'm still confused by the logic here.  Your comment states "Side exits can ONLY jump into parent scopes (lower state number)".  My example had a jump to a grandparent, and you pointed out that its destination would be marked as a side entry.  Are those the only two cases, would it be equally correct to say "Side exits can ONLY jump to either the immediate parent scope or a side-entry of another ancestor" ?  Or is there a way to go straight to grandparent that doesn't get a side entry?
> 
> I think an equivalent question is:  at the continue on line 300, could you assert `EHInfo.BlockToStateMap[BB] == State || (block is side entry)` ?
Sorry for the confusion. the comment "jump into parent scopes" here does not imply it's the immediate parent. it can be grandparents, etc as in your example. Will fix the comment.
Those two are actually refer to the same scenario.  Jumping into a protected parant/grandparent's scope induces a warning, and under that situation the target label is marked as side-entry (JumpDiagnostics.cpp change in Part-1 patch). 

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D102817/new/

https://reviews.llvm.org/D102817