<div dir="ltr">Hi Momchil,<div><br></div><div>So, to elaborate on the thread, I think you're looking at separating out:</div><div><br></div><div>no tables,</div><div>exception handling,</div><div>instruction-level unwind accuracy</div><div><br></div><div>for unwind tables? Some examples of cases you expect to work, and explicitly not to work, in each of these would be fairly motivating, going through the use cases for each.</div><div><br></div><div>Thanks!</div><div><br></div><div>-eric</div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Wed, Nov 17, 2021 at 6:19 AM Momchil Velikov via llvm-dev &lt;<a href="mailto:llvm-dev@lists.llvm.org">llvm-dev@lists.llvm.org</a>&gt; wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">On one hand, we have the `uwtable` attribute in LLVM IR, which tells<br>

whether to emit CFI directives. On the other hand, we have the `clang<br>

-cc1` command-line option `-funwind-tables=1|2` and the codegen<br>

option `VALUE_CODEGENOPT(UnwindTables, 2, 0) ///< Unwind tables (1) or<br>

asynchronous unwind tables (2)`.<br>

Thus, since `uwtable` carries no value, we lose along the way the<br>

information about whether we want just plain (synchronous) unwind tables<br>

or asynchronous unwind tables.<br>
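
<br>

To make the loss concrete, here is a toy model in Python of how a three-valued option collapses into an on/off attribute. It is only a sketch, not Clang's actual code, and `lower_to_uwtable` and the constant names are made up for illustration:<br>

```python
# Toy model of the lowering described above; not Clang's actual code.
# All names here are hypothetical.
NO_TABLES, SYNC_TABLES, ASYNC_TABLES = 0, 1, 2  # UnwindTables option values


def lower_to_uwtable(unwind_tables):
    """Return True if the function gets the value-less `uwtable` attribute."""
    return unwind_tables != NO_TABLES


# Both non-zero settings map to the same attribute, so later passes
# cannot recover which kind of table was requested.
for level in (NO_TABLES, SYNC_TABLES, ASYNC_TABLES):
    print(level, lower_to_uwtable(level))
```

Levels 1 and 2 give the same result here, which is exactly the information that gets dropped.<br>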

<br>

Asynchronous unwind tables take more space in the runtime image; I'd<br>

estimate something like 80-90% more, since they add, for the epilogues,<br>

roughly the same number of CFI directives as for the prologues, only a bit<br>

simpler (e.g. `.cfi_offset reg, off` vs. `.cfi_restore reg`). Or even<br>

more, if you consider tail duplication of epilogue blocks.<br>
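
<br>

As a rough illustration of where the extra directives come from, here is a small Python sketch that lists the CFI directives for one simple frame. The helper names are hypothetical, and the registers and offsets (an AArch64-style frame record) are assumptions chosen for the example. It counts directives only and says nothing about encoded size:<br>

```python
# Toy model: which CFI directives a simple frame needs. Register names
# and offsets are illustrative; real output depends on the target.

def prologue_cfi(saved_regs, frame_size):
    """Directives after the prologue: new CFA offset plus one save per register."""
    cfi = [f".cfi_def_cfa_offset {frame_size}"]
    cfi += [f".cfi_offset {reg}, {off}" for reg, off in saved_regs]
    return cfi


def epilogue_cfi(saved_regs):
    """Extra directives async tables need once the epilogue undoes the frame."""
    cfi = [".cfi_def_cfa_offset 0"]
    cfi += [f".cfi_restore {reg}" for reg, _ in saved_regs]
    return cfi


saved = [("w29", -16), ("w30", -8)]  # assumed AArch64 frame record
sync_cfi = prologue_cfi(saved, 16)
async_cfi = sync_cfi + epilogue_cfi(saved)
print(len(sync_cfi), len(async_cfi))
```

For this frame the directive count goes from 3 to 6. Each duplicated epilogue block would add its own copy of the epilogue directives on top of that.<br>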

Asynchronous unwind tables could also restrict code generation to<br>

having only a finite number of stack pointer adjustments (an example<br>

of *not* having a finite number of `SP` adjustments is on AArch64, where,<br>

when untagging the stack (MTE), in some cases the compiler can modify `SP`<br>

in a loop).<br>

Having the CFI precise down to each instruction generally also means one<br>

cannot bundle the CFI instructions together once the prologue is done;<br>

they need to be interspersed with ordinary instructions, which means<br>

extra `DW_CFA_advance_loc` commands, further increasing the size of the<br>

unwind tables.<br>

<br>

That is to say, async unwind tables impose a non-negligible overhead,<br>

yet for the most common use cases (like C++ exceptions), they are not<br>

even needed.<br>

<br>

We could, for example, extend the `uwtable` attribute with an optional<br>

value, e.g.<br>

  -  `uwtable` (defaults to 2)<br>

  -  `uwtable(1)`, sync unwind tables<br>

  -  `uwtable(2)`, async unwind tables<br>

  -  `uwtable(3)`, async unwind tables, but tracking only a subset of<br>

registers (e.g. CFA and return address)<br>
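
<br>

The value scheme above could be modelled roughly as follows. This is only a Python sketch of the proposal, not an existing LLVM API; `UWTableKind`, its member names and `parse_uwtable` are hypothetical:<br>

```python
from enum import IntEnum


class UWTableKind(IntEnum):
    """Values from the proposal above; member names are hypothetical."""
    SYNC = 1           # uwtable(1)
    ASYNC = 2          # uwtable(2)
    ASYNC_SUBSET = 3   # uwtable(3): only e.g. CFA and return address


def parse_uwtable(attr):
    """Parse `uwtable` or `uwtable(N)`; the bare form defaults to 2."""
    if attr == "uwtable":
        return UWTableKind.ASYNC
    prefix, suffix = "uwtable(", ")"
    if attr.startswith(prefix) and attr.endswith(suffix):
        return UWTableKind(int(attr[len(prefix):-len(suffix)]))
    raise ValueError(f"not a uwtable attribute: {attr!r}")


print(parse_uwtable("uwtable").name, parse_uwtable("uwtable(1)").name)
```

Keeping the bare `uwtable` equal to 2 means existing IR keeps its current meaning, and any value outside 1 to 3 is rejected.<br>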

<br>

Or add a new attribute `async_uwtable`.<br>

<br>

Other suggestions? Comments?<br>

<br>

~chill<br>

<br>

--<br>

Compiler scrub, Arm<br>

</blockquote></div>