[PATCH] D80344: [Windows SEH]: HARDWARE EXCEPTION HANDLING (MSVC -EHa) - Part 1

Thu Sep 24 01:29:56 PDT 2020

rjmccall added a comment.

Okay.  I think we're on the same page about representation now.  If you can come up with a good replacement for "eha" in the intrinsic names, I think this is pretty much ready to go.

================
Comment at: llvm/docs/LangRef.rst:11584
+These intrinsics make Block-level State computation possible in downstream
+LLVM code gen pass, even in the presence of ctor/dtor inlining.
+
----------------
This is a very emission-centric description of the intrinsic.  Maybe something like:

  LLVM's ordinary exception-handling representation associates EH cleanups and
  handlers only with ``invoke``s, which normally correspond only to call sites.  To
  support arbitrary faulting instructions, it must be possible to recover the current
  EH scope for any instruction.  Turning every operation in LLVM that could fault
  into an ``invoke`` of a new, potentially-throwing intrinsic would require adding a
  large number of intrinsics, impede optimization of those operations, and make
  compilation slower by introducing many extra basic blocks.  These intrinsics can
  be used instead to mark the region protected by a cleanup, such as for a local
  C++ object with a non-trivial destructor.  ``llvm.eha.scope.begin`` is used to mark
  the start of the region; it is aways called with ``invoke``, with the unwind block
  being the desired unwind destination for any potentially-throwing instructions
  within the region.  `llvm.eha.scope.end` is used to mark when the scope ends
  and the EH cleanup is no longer required (e.g. because the destructor is being
  called).

So, about that — how does this pairing actually work?  I think I understand how it's *meant* to work, by just scanning the structure of the CFG.  But Clang doesn't promise to emit blocks with that kind of structure, and in practice block structure gets pretty complicated when you have multiple edges out of a region and they have to go to destinations at different cleanup depths.  I don't really understand how your analysis looks through that sort of thing.

Example:

```
#include <string>
void test(int x, int &state) {
  state = 0;
  std::string a;
  state = 1;
  if (x > 0) {
    state = 2;
    std::string b;
    state = 3;
    if (x > 10) {
      state = 4;
      return;
    }
    state = 5;
    std::string c;
    state = 6;
  }
  state = 7;
}
```

IIRC, the way Clang will emit this is something like:

```
void test(int x, int &state) {
  int jumpdest;
  state = 0;
  std::string a;
  state = 1;
  if (x > 0) {
    state = 2;
    std::string b;
    state = 3;
    if (x > 10) {
      state = 4;
      jumpdest = 0;
      goto destroy_b;
    }
    state = 5;
    std::string c;
    state = 6;
    c.~std::string();
    jumpdest = 1;
  destroy_b:
    b.~std::string();
    switch (jumpdest) {
    case 0: goto destroy_a;
    case 1: goto fallthrough;
    }
  fallthrough:
    ;
  }
  state = 7;
destroy_a:
  a.~std::string();
}
```

The point being that I'm not sure the stack-like relationship between these cleanup scopes that's present in the source actually reliably survives into IR.  That makes me concerned about the implicit pairing happening with these intrinsics.
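To make that concrete, here is a compilable approximation of the lowering above. It is only a sketch of the control-flow shape, not what Clang actually produces: `Tracker` and `lowered` are hypothetical names, `std::optional` stands in for the explicit destructor calls, and the `state` stores are dropped. Each `begin`/`end` log entry marks where I would expect a scope-begin or scope-end marker to sit.

```
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for std::string: logs where its cleanup scope
// begins (construction) and ends (destruction).
struct Tracker {
  std::vector<std::string> &log;
  std::string name;
  Tracker(std::vector<std::string> &l, std::string n)
      : log(l), name(std::move(n)) {
    log.push_back("begin " + name);
  }
  ~Tracker() { log.push_back("end " + name); }
};

// Same shape as the hand-lowered test() above: one shared destroy_b block
// and one shared destroy_a block, with jumpdest selecting where to go after
// a shared cleanup has run.
std::vector<std::string> lowered(int x) {
  std::vector<std::string> log;
  int jumpdest = 1;
  std::optional<Tracker> a, b, c; // storage only; lifetimes managed by hand
  a.emplace(log, "a");
  if (x > 0) {
    b.emplace(log, "b");
    if (x > 10) {
      jumpdest = 0;    // early return: skip c entirely
      goto destroy_b;
    }
    c.emplace(log, "c");
    c.reset();         // c.~string()
    jumpdest = 1;
  destroy_b:
    b.reset();         // b.~string(), reached from two different edges
    switch (jumpdest) {
    case 0: goto destroy_a;
    case 1: goto fallthrough;
    }
  fallthrough:
    ;
  }
destroy_a:
  a.reset();           // a.~string(), shared by every path
  return log;
}
```

Along each dynamic path the events still nest: `lowered(20)` logs `begin a, begin b, end b, end a`, and `lowered(5)` logs `begin a, begin b, begin c, end c, end b, end a`. But statically, the single `end b` site is entered both from inside and from outside c's region, and which block follows it is decided by a runtime value rather than by block structure.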

Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D80344