[llvm-dev] [RFC] [Windows SEH] Local_Unwind (Jumping out of a _finally) and -EHa (Hardware Exception Handling)

Thu Apr 2 18:01:24 PDT 2020

From SEH's perspective, unwinding means invoking the outer _finally handlers. Take the simple example below:

    volatile int* Fault = 0;
    __try {
      __try {
        *Fault += 1;   // null dereference raises a hardware exception
      }
      __finally {
        printf(" inner finally:  Counter = %d\n\r", ++Counter);
        goto t10;      // jumping out of a __finally requires a local unwind
      }
    }
    __finally {
      printf(" outer finally  Counter = %d\n\r", ++Counter);
    }
    printf(" after outer try_finally: Counter = %d\n\r", Counter);
    t10:;
  ...
Before control reaches "t10:", the runtime invokes the outer _finally funclet.  The detailed steps are:

  1.  The "goto t10" calls the _local_unwind() runtime.
  2.  _local_unwind() invokes the outer _finally funclet.
  3.  _local_unwind() then passes control back to "t10:".
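
The ordering of these steps can be replayed in a small portable C++ simulation. This is only a sketch of the sequence, not the actual runtime: FinallyFrame, simulated_local_unwind, and run_local_unwind_example are hypothetical names, and the real _local_unwind() is not implemented as a vector of callbacks.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Hypothetical stand-in for the runtime's view of a pending __finally.
struct FinallyFrame {
  int level;                      // nesting depth of the guarded __try
  std::function<void()> handler;  // body of the __finally funclet
};

// Simulated local unwind: run every pending handler that is nested deeper
// than the target, innermost first, then hand back the target label.
std::string simulated_local_unwind(std::vector<FinallyFrame>& frames,
                                   int target_level,
                                   const std::string& target_label) {
  while (!frames.empty() && frames.back().level > target_level) {
    std::function<void()> handler = frames.back().handler;
    frames.pop_back();  // pop first so the handler cannot be re-entered
    handler();
  }
  return target_label;
}

// Replays the example and returns the events in the order they happen.
std::vector<std::string> run_local_unwind_example() {
  std::vector<std::string> trace;
  int counter = 0;
  std::vector<FinallyFrame> frames;
  frames.push_back({1, [&] {
    trace.push_back("outer finally " + std::to_string(++counter));
  }});

  // The inner __finally is already running when it reaches the goto,
  // so it is not on the pending list.
  trace.push_back("inner finally " + std::to_string(++counter));

  // Step 1: the goto calls the unwinder. Step 2: the outer handler runs.
  std::string label = simulated_local_unwind(frames, 0, "t10");
  assert(frames.empty());

  // Step 3: control resumes at the target label.
  trace.push_back("resume at " + label);
  return trace;
}
```

Calling run_local_unwind_example() yields the inner handler, then the outer handler, then the resume at the label, which is the order described in the steps.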

So with the existing IR model, the re-entry from the runtime to "t10:" is not visible to the optimizer.
Our proposed solution is to add a pseudo _try-_except, as shown below, so that the re-entry control flow is represented in IR:

  __try {   // a pseudo try level to dispatch the Local_unwind flow
    __try {
      __try {
        *Fault += 1;
      }
      __finally {
        printf(" inner finally:  Counter = %d\n\r", ++Counter);
        goto t10;
      }
    }
    __finally {
      printf(" outer finally  Counter = %d\n\r", ++Counter);
    }
  } __except (_IsLocalUnwind()) {
    goto t10;
  }
  printf(" after outer try_finally: Counter = %d\n\r", Counter);
  t10:;
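
As a rough portable analogy for the shape of this control flow, the sketch below uses standard C++ exceptions. It assumes a hypothetical LocalUnwindToken type standing in for the local-unwind request, models the outer _finally as catch(...) followed by a rethrow, and models the pseudo _except as a catch of that token. It omits the faulting store and is not how the proposal is implemented; it only shows that the path "inner _finally, outer _finally, t10" becomes an ordinary exceptional edge once the pseudo level exists.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical marker type standing in for a local-unwind request.
struct LocalUnwindToken {};

std::vector<std::string> run_pseudo_dispatch_example() {
  std::vector<std::string> trace;
  try {      // pseudo level that dispatches the local-unwind flow
    try {    // region guarded by the outer "finally"
      // Body of the inner "finally", reached by normal execution here.
      trace.push_back("inner finally");
      throw LocalUnwindToken{};  // stands in for "goto t10" in the __finally
    } catch (...) {              // stands in for the outer __finally
      trace.push_back("outer finally");
      throw;                     // keep unwinding toward the dispatcher
    }
  } catch (const LocalUnwindToken&) {  // stands in for the pseudo __except
    goto t10;
  }
  trace.push_back("after outer try_finally");  // skipped on this path
t10:
  trace.push_back("t10");
  assert(trace.back() == "t10");
  return trace;
}
```

The returned trace contains the inner handler, the outer handler, and then the label, and never the "after outer try_finally" entry.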

For C++ code, going out of a catch handler is simple.  In a similar example, the outer catch handler is NOT invoked.  At the end of the inner catch handler, control passes directly to t10:.
    try {
      try {
        throw(++Counter);
      }
      catch (...) {
        printf(" inner catch: goto : Counter = %d\n\r", ++Counter);
        goto t10;   // leaves the handler; the outer catch does not run
      }
    }
    catch(int i) {
      printf(" outer catch: Counter = %d\n\r", ++Counter);
    }
    printf(" after outer try_catch: Counter = %d\n\r", Counter);
    t10:;
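
A self-contained version of this C++ case, wrapped in a function so the outcome can be checked, might look like the following. The CatchResult struct and the function name are hypothetical additions for illustration; the control flow is the same as in the example.

```cpp
#include <cassert>
#include <cstdio>

// Hypothetical record of what happened, added only for checking.
struct CatchResult {
  int counter;
  bool outer_catch_ran;
  bool fallthrough_ran;
};

CatchResult run_goto_out_of_catch() {
  CatchResult r{0, false, false};
  try {
    try {
      throw ++r.counter;  // counter becomes 1 and is thrown as an int
    } catch (...) {
      std::printf(" inner catch: goto : Counter = %d\n", ++r.counter);
      assert(r.counter == 2);  // incremented once more inside the handler
      goto t10;                // jumping out of a handler is permitted
    }
  } catch (int) {
    r.outer_catch_ran = true;  // not reached: the exception was handled
    std::printf(" outer catch: Counter = %d\n", ++r.counter);
  }
  r.fallthrough_ran = true;    // skipped by the goto
  std::printf(" after outer try_catch: Counter = %d\n", r.counter);
t10:
  return r;
}
```

Here the exception is fully handled by the inner catch, so the goto is an ordinary jump: the outer handler and the statement after the outer try are both skipped.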

  *   If you call a nounwind function, the invoke will be transformed to a plain call.  And we're likely to infer nounwind in many cases (for example, functions that don't call any other functions).  There isn't any way to stop this currently; I guess we could add one.

Under -EHa, where HW exceptions must be handled, the nounwind attribute is ignored (or reset) for callees directly inside a _try.

  *   I'm sort of unhappy with the fact that this is theoretically unsound, but maybe the extra effort isn't worthwhile, as long as it doesn't impact any transforms we realistically perform.  How much extra effort it would be sort of depends on what conclusion we reach for the "undefined behavior" part of this, which is really the part I'm more concerned about.

Which part (-EHa or Local_unwind) seems theoretically unsound to you?  Could you be more specific about what UB problem could arise in this design?

Thanks,

--Ten

From: Eli Friedman <efriedma at quicinc.com>
Sent: Thursday, April 2, 2020 1:49 PM
To: Ten Tzen <tentzen at microsoft.com>; llvm-dev <llvm-dev at lists.llvm.org>
Cc: Aaron Smith <aaron.smith at microsoft.com>
Subject: [EXTERNAL] RE: [llvm-dev] [RFC] [Windows SEH] Local_Unwind (Jumping out of a _finally) and -EHa (Hardware Exception Handling)

  *   When a goto in a _finally occurs, we must "unwind" to the target code, not just "jump" to target label

I'm not sure what you're trying to say here.  In the Microsoft ABI, goto out of a catch block also calls into the unwinder.  We have to run any destructors, and return from the funclet (catchret/cleanupret).

  *   The call inside a _try is an invoke with EH edge.  So it's perfectly modeled.

If you call a nounwind function, the invoke will be transformed to a plain call.  And we're likely to infer nounwind in many cases (for example, functions that don't call any other functions).  There isn't any way to stop this currently; I guess we could add one.

I'm sort of unhappy with the fact that this is theoretically unsound, but maybe the extra effort isn't worthwhile, as long as it doesn't impact any transforms we realistically perform.  How much extra effort it would be sort of depends on what conclusion we reach for the "undefined behavior" part of this, which is really the part I'm more concerned about.

-Eli

From: Ten Tzen <tentzen at microsoft.com>
Sent: Wednesday, April 1, 2020 7:55 PM
To: Eli Friedman <efriedma at quicinc.com>; llvm-dev <llvm-dev at lists.llvm.org>
Cc: aaron.smith at microsoft.com
Subject: [EXT] RE: [llvm-dev] [RFC] [Windows SEH] Local_Unwind (Jumping out of a _finally) and -EHa (Hardware Exception Handling)

  *   Take your example, replace "_try" with C++ "try", replace the "_finally" with "catch(...)" with a "throw;" at the end of the catch block, replace the "_except()" with "catch(...)", and see what clang currently generates.  That seems roughly equivalent to what you're trying to do. Extending this scheme to encompass try/finally seems like it shouldn't require new datastructures in clang's AST, or new entrypoints in the C runtime.

  *   But I could be missing something; I'm not deeply familiar with the differences between C++ and SEH unwind handlers.

Right, you are missing something.  The semantics of a "goto" out of an SEH _finally are totally different from those of a "goto" out of an EH catch handler.  That is why I illustrated the semantics of "jumping out of a _finally" in the first example in the document.
When a goto in a _finally occurs, we must "unwind" to the target code, not just "jump" to the target label.  That is why it is called "local_unwind()": depending on the EH state of the target, the local_unwind() runtime invokes each _finally along the way to the final target.  Again, taking case #2 as an example, the outer _finally must be invoked before control goes to $t10.

  *   To be clear, we're talking about making all memory accesses, including accesses to local variables, in the try block "volatile"? So the compiler can't do any optimization on them?  That gets you some fraction of the way there; there are no issues with SSA registers if there aren't any live SSA across the edge.  And the compiler can't move volatile operations around each other.  That leaves open the question about what to do about calls; we don't have any generic way to mark a call "volatile".  I guess we could add something.  At that point, basically every memory operation and variable would be completely opaque to the compiler, which would sort of force everything to work, I guess. But at the cost of terrible performance if there's any non-trivial code in the block.  (And it's still not theoretically sound, because the compiler can introduce local variables.)

The call inside a _try is an invoke with an EH edge, so it's perfectly modeled. A HW exception that occurs in the callee will be properly caught and handled.
Making the accesses in the _try block volatile is done in the Clang FE, so temporary variables introduced by the LLVM BE will not be volatile.

Finally, I would not say it comes at the cost of terrible performance, because:

(1)    Again, in real-world code, only a very small amount of code is directly inside a _try, and it is mostly not performance critical.

(2)    If the HW exception flow is perfectly modeled with iload/istore or with an explicit pointer-test flow model, optimizations will likely be severely hindered.  The resulting code will probably not be much better than volatile code.
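
For readers unfamiliar with why volatile helps across an unmodeled edge, the closest portable analogue is setjmp/longjmp, which the thread also mentions. The sketch below is not SEH and not part of the proposal; g_env, simulated_fault, and the function name are hypothetical. It relies only on the standard rule that a local modified between setjmp and longjmp has a well-defined value afterwards only if it is volatile.

```cpp
#include <cassert>
#include <csetjmp>

// Hypothetical global jump buffer for this sketch.
static std::jmp_buf g_env;

// Stand-in for a faulting operation: control leaves through an edge that
// is not expressed in the caller's control flow.
[[noreturn]] static void simulated_fault() { std::longjmp(g_env, 1); }

int run_volatile_across_unmodeled_edge() {
  volatile int progress = 0;  // volatile forces each store out to memory
  if (setjmp(g_env) == 0) {
    progress = 1;
    simulated_fault();        // does not return normally
    progress = 2;             // never executed
  }
  // Reached via longjmp. Because progress is volatile, its value here is
  // the last store made before the jump; without volatile it would be
  // indeterminate.
  assert(progress != 2);
  return progress;
}
```

The function returns the value stored just before the simulated fault, which is the property the volatile marking inside a _try is meant to preserve.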

Thanks,

--Ten

From: Eli Friedman <efriedma at quicinc.com>
Sent: Wednesday, April 1, 2020 5:41 PM
To: Ten Tzen <tentzen at microsoft.com>; llvm-dev <llvm-dev at lists.llvm.org>
Cc: Aaron Smith <aaron.smith at microsoft.com>
Subject: [EXTERNAL] RE: [llvm-dev] [RFC] [Windows SEH] Local_Unwind (Jumping out of a _finally) and -EHa (Hardware Exception Handling)

Reply inline

From: Ten Tzen <tentzen at microsoft.com>
Sent: Wednesday, April 1, 2020 3:54 PM
To: Eli Friedman <efriedma at quicinc.com>; llvm-dev <llvm-dev at lists.llvm.org>
Cc: aaron.smith at microsoft.com
Subject: [EXT] RE: [llvm-dev] [RFC] [Windows SEH] Local_Unwind (Jumping out of a _finally) and -EHa (Hardware Exception Handling)

  *   For goto in finally, why are you inventing a completely new mechanism for handling this sort of construct?  What makes this different from our existing handling of goto out of catch blocks?  Maybe there's something obvious here I'm missing, but it looks like essentially the same problem, and I don't see any reason why we can't use the existing solution.

No, no new mechanism is invented.  The design employs the existing mechanism to model the third control path, caused by _local_unwind (in addition to the normal execution and exception handling flows).  In an earlier discussion with Joseph, adding a second EH edge to InvokeInst was briefly discussed but quickly dropped, as it is clearly a long shot.

Yes, right, it's not really a big extension of the fundamental model.  It still seems like you're doing more than what's necessary.

The extended model is intended to represent the third control flow, which does not seem representable today.
Take case #2 of the first example in the wiki page as an example:
the control flow that starts in the normal execution of the inner _finally, passes through the outer _finally, and lands at $t10 cannot be represented in LLVM IR.
Or could you elaborate on how to achieve it?  (Bear with me, as I'm new to the Clang & LLVM world.)

Take your example, replace "_try" with C++ "try", replace the "_finally" with "catch(...)" with a "throw;" at the end of the catch block, replace the "_except()" with "catch(...)", and see what clang currently generates.  That seems roughly equivalent to what you're trying to do. Extending this scheme to encompass try/finally seems like it shouldn't require new datastructures in clang's AST, or new entrypoints in the C runtime.

But I could be missing something; I'm not deeply familiar with the differences between C++ and SEH unwind handlers.

  *   ...In general, UB means the program can do anything.

Sorry, what is UB?

Undefined behavior.

Right, we are not modeling HW exceptions in the control flow, as it's not necessary.
For C++ code, we don't care about the values in registers, local variables, SSA values and so on.  All we need is that live local objects are destroyed properly when a HW exception is unwound and handled.
For C code, only the code under a _try construct is affected.  Agreed that making memory accesses there volatile is sub-optimal, but it should not cause correctness issues.

To be clear, we're talking about making all memory accesses, including accesses to local variables, in the try block "volatile"? So the compiler can't do any optimization on them?  That gets you some fraction of the way there; there are no issues with SSA registers if there aren't any live SSA across the edge.  And the compiler can't move volatile operations around each other.  That leaves open the question about what to do about calls; we don't have any generic way to mark a call "volatile".  I guess we could add something.  At that point, basically every memory operation and variable would be completely opaque to the compiler, which would sort of force everything to work, I guess. But at the cost of terrible performance if there's any non-trivial code in the block.  (And it's still not theoretically sound, because the compiler can introduce local variables.)

In MSVC, there is a less restrictive "write-through" concept for memory accesses inside a _try.  But I think its benefit is minor and not worth the effort, as the amount of code directly under a _try is very small and is usually not performance critical.

  *   ...I don't want to add another way for unmodeled control flow to break code.

I would really love to hear (and find a way to improve) if there is any place in this design & implementation which is not sound or robust.

Thanks,

--Ten

From: Eli Friedman <efriedma at quicinc.com>
Sent: Wednesday, April 1, 2020 1:20 PM
To: Ten Tzen <tentzen at microsoft.com>; llvm-dev <llvm-dev at lists.llvm.org>
Cc: Aaron Smith <aaron.smith at microsoft.com>
Subject: [EXTERNAL] RE: [llvm-dev] [RFC] [Windows SEH] Local_Unwind (Jumping out of a _finally) and -EHa (Hardware Exception Handling)

Resending; I accidentally dropped llvm-dev.

-Eli

From: Eli Friedman
Sent: Wednesday, April 1, 2020 1:01 PM
To: Ten Tzen <tentzen at microsoft.com>
Cc: aaron.smith at microsoft.com
Subject: RE: [EXT] [llvm-dev] [RFC] [Windows SEH] Local_Unwind (Jumping out of a _finally) and -EHa (Hardware Exception Handling)

This looks like it outlines the implementation pretty well.

For goto in finally, why are you inventing a completely new mechanism for handling this sort of construct?  What makes this different from our existing handling of goto out of catch blocks?  Maybe there's something obvious here I'm missing, but it looks like essentially the same problem, and I don't see any reason why we can't use the existing solution.

For hardware exceptions, the proposal seems to have big fundamental problems.  I see two basic problems:

How do you actually generate an exception?  In general, UB means the program can do anything.  So unless you define some rule that says otherwise, the only defined way to trigger an exception is using Windows API calls.  If you want something else, we need to define new rules.  At the C level, we need to redefine some specific constructs to trigger an exception instead of UB.  And at the IR level, we need to annotate specific IR instructions in a way that passes can reasonably check, and add new LangRef rules describing those semantics.  I mean, you can try to sort of hand-wave this and say it should "just work" if code happens to trigger a hardware exception.  But if there aren't actually any rules, I'm afraid we'll end up with an infinitely long tail of "optimization X breaks some customer's code, so add a hack to disable it in EHa mode".

If we're not modeling the control flow implied by an exception, how do we ensure that local variables and SSA registers have the right values when the exception is caught?  Sure, invoke is clunky, but it at least makes control flow well-defined.  Adding "volatile" to every IR load and store instruction, including accesses to local variables, seems terrible for both optimization and correctness.  Our handling of setjmp is already a complete mess; I don't want to add another way for unmodeled control flow to break code.  (See also http://nondot.org/sabre/LLVMNotes/ExceptionHandlingChanges.txt, for a proposal to make invoke less messy.)

-Eli

From: llvm-dev <llvm-dev-bounces at lists.llvm.org> On Behalf Of Ten Tzen via llvm-dev
Sent: Tuesday, March 31, 2020 9:13 PM
To: llvm-dev at lists.llvm.org
Cc: aaron.smith at microsoft.com
Subject: [EXT] [llvm-dev] [RFC] [Windows SEH] Local_Unwind (Jumping out of a _finally) and -EHa (Hardware Exception Handling)

Hi, all,

The intent of this thread is to complete the support for Windows SEH.
Currently there are two major missing features: jumping out of a _finally and hardware exception handling.

The document below is my proposed design and implementation to fully support SEH on LLVM.
I have completely implemented this design on a branch in repo:  https://github.com/tentzen/llvm-project.
It now passes MSVC's in-house SEH suite.

Sorry for this long write-up.  For better readability, please read it on https://github.com/tentzen/llvm-project/wiki

Special thanks to Joseph Tremoulet for his earlier comments and suggestions.

Note: I just subscribed to llvm-dev and am probably not on the list yet, so please reply with my email address (tentzen at microsoft.com) explicitly in the To list.
Thanks,

--Ten
