[PATCH] D102817: [Windows SEH]: HARDWARE EXCEPTION HANDLING (MSVC -EHa) - Part 2

Ten Tzen via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Fri Apr 29 23:06:50 PDT 2022


tentzen added inline comments.


================
Comment at: llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp:2916
+      if (CleanupMBB) // a block referenced by EH table
+        CleanupMBB->setHasAddressTaken(); // so dtor-funclet not removed by opts
       break;
----------------
efriedma wrote:
> tentzen wrote:
> > efriedma wrote:
> > > tentzen wrote:
> > > > efriedma wrote:
> > > > > tentzen wrote:
> > > > > > efriedma wrote:
> > > > > > > tentzen wrote:
> > > > > > > > efriedma wrote:
> > > > > > > > > I assume this is trying to trick optimizations into avoiding optimizations on the cleanup block if the __try involves multiple basic blocks.
> > > > > > > > > 
> > > > > > > > > Have you considered trying to fix the control flow graph?  I mean, I understand the difficulty of expressing the control flow at the IR level, given we don't have IR versions of "load" and "store" with an unwind edge, but those concerns don't really apply in the same way in a MachineFunction.  You can mark the cleanup block as a successor of any block that can throw an exception, without explicitly marking up the control flow on each individual instruction.  (This is how exceptions normally work.)
> > > > > > > > The Cleanup block is turned into a dtor funclet at the end. And the beginning address of that funclet will be recorded in EH table in .xdata section. So it's really address-taken by definition. If you verify this by checking Linker relocation records or the EH table in .asm file.  IMO this flag should be set regardless of the SEH feature.
> > > > > > > That's not what "address-taken" is supposed to mean.
> > > > > > > 
> > > > > > > It's supposed to be equivalent to BasicBlock::hasAddressTaken, which means that the block is the target of a blockaddress expression (used for indirectbr).  Primarily, this restricts certain optimizations because in general, we can't split a critical edge from an indirectbr.
> > > > > > > 
> > > > > > > hasAddressTaken() is not a general "will the address of the block eventually be emitted somewhere in the assembly".  That can happen for a variety of unrelated reasons, including exception handling, debug info, jump tables, and branch lowering.
> > > > > > I'm not sure about your interpretation.  the comment of this flag from the header is:
> > > > > >   
> > > > > >   /// Returns true if there are any uses of this basic block other than
> > > > > >   /// direct branches, switches, etc. to it.
> > > > > >   bool hasAddressTaken() const {
> > > > > >     return getBasicBlockBits().BlockAddressRefCount != 0;
> > > > > >   }
> > > > > > it does include "switch" which I think refers to jump-table.
> > > > > > Besides, the dtor-funclet in EH table is invoked via indirectbr/indirect-call.  Does not that also meet your interpretation?
> > > > > > 
> > > > > "other than direct branches, switches, etc." means MachineBasicBlock terminators, exception pads, and jump tables.  Anything which is *not* one of those, we mark up, because target-independent code can't analyze it.  But in almost all cases, the above list covers basically everything except indirectbr until you get very late in the compile process.
> > > > > 
> > > > > If we do end up performing some transform that materializes the address of a block outside of the above cases, it could make sense to mark a block address-taken.  (Some target-specific code does this.  Not sure if that code is really doing it correctly, but none of those calls are relevant to this patch.)  But we should not be preemptively marking basic block address-taken: we should wait until such a transform is actually performed.
> > > > > 
> > > > > For a dtor-funclet, we don't actually "take the address" of the funclet until the AsmPrinter, when we compute the exception tables.  We could add an address-taken marking at that point... but nothing would check it that late in the pipeline.
> > > > This is exactly the case that a target-independent code (like BranchFolding.cpp) must be aware of it. 
> > > > The dtor-funclet cleanup block is genuinely address-taken the moment it's created.
> > > > Yes, it's not emitted until AsmPrinter. but it does not mean the block is not address-taken until AsmPrinter. 
> > > > 
> > > There's not really any point to arguing what the person who originally wrote the comment meant. Sorry, I shouldn't have tried to take the discussion in that direction.
> > > 
> > > But I still think this is confusing: there's multiple different use-cases smashed together into one boolean.
> > > 
> > > How about this: we split up isBlockAddressTarget(), isAsyncSEHTarget(), and maybe isMachineLabelUsed() for the target-specific stuff that messes with MachineBasicBlock labels in weird ways.  And then as a transitional measure hasAddressTaken() checks all of them, maybe?  Really, I expect most of the relevant places that would check hasAddressTaken(), or something like that.
> > > 
> > > Then we can go through later and refine various places to check exactly what they need.  I suspect most places don't actually need all three.
> > I feel doing that is over-engineering and I don't see much value out of it.
> > 
> > note that in addition to BasicBlock.h, it's also commented in MachineBasicBlock.h :
> > 
> >   /// Test whether this block is **potentially the target of an indirect branch.**
> >   bool hasAddressTaken() const { return AddressTaken; }
> > 
> > Also, some present usages are far from what you described. 
> > See the usages in RetpolineThunkInserter where CaptureSpec and CallTarget blocks are not even address-taken.
> > 
> > The usage of hasAddressTaken() in this revision is much more appropriate and legitimate.
> Looked a bit more.  It turns out there's an actual bug here: mixing up the different markings could confuse the AsmPrinter if we mix together the different bits.  Posted D124697.
> 
> With that, the "MBBLabelUsed" is probably close enough to what you need, if we need something.  I guess we don't really need to distinguish between the SEH uses and other target-specific uses.
> 
> I'm still not sure where, exactly, the MBB is being stored that necessitates this.  If I look at an MIR dump, the only place the EHPad shows up is as a successor of the block that can throw an exception. And as far as I can tell, none of the side-tables in WinEHFuncInfo contain any references to MBBs.
The EH table is named $stateUnwindMap$**, the 'state' and 'action' map table.  It is stored in $cppxdata$** that is the primary EH table passed to _CxxFrameHandler3/4 by __ehhandler$**.  Check the new test case windows-seh-EHa-cpp-Dtors01.ll.
LLVM code that emits it is in WinException.cpp:

  // UnwindMapEntry {
  //   int32_t ToState;
  //   void  (*Action)();
  // };
  if (UnwindMapXData) {
    OS.emitLabel(UnwindMapXData);
    for (const CxxUnwindMapEntry &UME : FuncInfo.CxxUnwindMap) {
      MCSymbol *CleanupSym =
          getMCSymbolForMBB(Asm, UME.Cleanup.dyn_cast<MachineBasicBlock *>());
      AddComment("ToState");
      OS.emitInt32(UME.ToState);

      AddComment("Action");
      OS.emitValue(create32bitRef(CleanupSym), 4);
    }

note getMCSymbolForMBB() that refers to the ehcleanup block we were talking about here.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D102817/new/

https://reviews.llvm.org/D102817



More information about the llvm-commits mailing list