[PATCH] D153098: [AArch64] Emit fewer CFI instructions for synchronous unwind tables

Igor Kudrin via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Jun 21 16:39:47 PDT 2023


ikudrin added inline comments.


================
Comment at: llvm/test/CodeGen/AArch64/addsub-constant-folding.ll:363
 ; CHECK-NEXT:    sub sp, sp, #32
-; CHECK-NEXT:    .cfi_def_cfa_offset 32
 ; CHECK-NEXT:    str x30, [sp, #16] // 8-byte Folded Spill
+; CHECK-NEXT:    .cfi_def_cfa_offset 32
----------------
smeenai wrote:
> MaskRay wrote:
> > efriedma wrote:
> > > MaskRay wrote:
> > > > smeenai wrote:
> > > > > MaskRay wrote:
> > > > > > smeenai wrote:
> > > > > > > ikudrin wrote:
> > > > > > > > MaskRay wrote:
> > > > > > > > > ikudrin wrote:
> > > > > > > > > > MaskRay wrote:
> > > > > > > > > > > `.cfi_def_cfa_offset 32` move like this seems unnecessary. It degrades precision without reducing the number of CFI instructions for sync.
> > > > > > > > > > The goal is to reduce the size of the unwind table, not just the number of CFI instructions. Having only one chunk removes `DWA_CFA_advance_loc` instructions that would otherwise be required.
> > > > > > > > > Yes, the goal is fine. However, when the number of instructions does not change, it would be better not to degrade the precision by moving `.cfi_def_cfa_offset`...
> > > > > > > > In this test, there is a `.cfi_offset w30, -16` instruction below. If we kept `.cfi_def_cfa_offset 32` there it was, the size of the unwind table would be bigger. Yes, only by two bytes, but still.
> > > > > > > Those bytes do add up ... I measured this change to be a 72 KiB size reduction for the Facebook for Android app, which is quite nice.
> > > > > > I wonder whether we just need a mode to omit most of the registers? Many registers are probably not useful for some use cases, e.g. profiling.
> > > > > That would make that mode unusable for exception handling, right? My interest here is having smaller unwind tables that can still be used for exceptions.
> > > > Am I mistaken that the number of CHECK lines in `addsub-constant-folding.ll` does not change with this patch? However, the patch degrades the precision of `.cfi_def_cfa_offset 32`. This seems a regression to me.
> > > > If the number of CFI instructions decreases, we can consider it different.
> > > > 
> > > > > I wonder whether we just need a mode to omit most of the registers? Many registers are probably not useful for some use cases, e.g. profiling.
> > > > 
> > > > I agree that for exception handling we need to preserve callee-saved registers.
> > > > 
> > > > I am mainly thinking of nounwind functions that get an unwind table due to `-funwind-tables`/`-fasynchronous-unwind-tables`. If we just need to get stack trace for these functions, it seems that we can have a mode to omit most callee-saved registers.
> > > >Am I mistaken that the number of CHECK lines in addsub-constant-folding.ll does not change with this patch?
> > > 
> > > It's the same number of lines of assembly, but it's fewer bytes in DWARF encoding.
> > Presumably DW_CFA_advance_loc. Yes, it's fewer bytes.
> Yup, exactly.
> 
> It makes sense to emit only the minimal amount of information for functions which don't need to be unwound through, but that seems like a separate change. With [unwindabort](https://discourse.llvm.org/t/rfc-add-call-unwindabort-to-llvm-ir/62543) and `-fno-exceptions=noexcept`, we'd have way more `nounwind` functions as well, so that'd be even more beneficial. I haven't looked into it very much, but what's your opinion on using SFrame for that?
> My interest here is having smaller unwind tables that can still be used for exceptions.

Mine is the same. The application shipped to end users does not need to support profiling, so it is beneficial to emit more compact unwind tables. They should be just enough for exceptions and crash dumps.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D153098/new/

https://reviews.llvm.org/D153098



More information about the llvm-commits mailing list