[libcxx-commits] [PATCH] D86805: [lld-macho] create TEXT, unwind_info from LD, compact_unwind

Wed Sep 2 00:05:07 PDT 2020

gkm added inline comments.

================
Comment at: lld/MachO/UnwindInfo.h:51-66
+  mutable std::map<uint32_t, size_t> commonEncodingIndexes;
+  mutable std::vector<std::pair<compact_unwind_encoding_t, size_t>>
+      commonEncodings;
+  mutable std::vector<uint32_t> personalities;
+  mutable std::vector<unwind_info_section_header_lsda_entry> lsdaEntries;
+  mutable std::vector<compact_unwind_entry> cuVector;
+  mutable std::vector<compact_unwind_entry *> cuPtrVector;
----------------
int3 wrote:
> int3 wrote:
> > gkm wrote:
> > > int3 wrote:
> > > > why the need to mark everything as mutable?
> > > The work of morphing `__LD,__compact_unwind` into `__TEXT,__unwind_info` happens in `uint64_t getSize() const` and `void writeTo(uint8_t *buf) const`, which can only be logically `const`, but not physically.
> > I would much prefer if we could have a method that explicitly computes the unwind info... but first let me see if I understand the constraints here.
> > 
> > From what I understand, the size of the unwind info depends on the values of `CompactUnwindEntry64::functionAddress`, or more precisely, their relative offsets. So we need to assign addresses to our functions before computing the unwind info size. After addresses are assigned, the `cuSection->writeTo()` call below will update the `functionAddress` entries to point to the final addresses, after which we can compute the needed size. Does all that sound right?
> > 
> > From what I can tell, ld64 calculates unwind info size based on "tentative addresses", which I think have the right relative offsets, though their final values have yet to be fixed. I wonder if we could do something similar with our current design...
> Also -- do you know if it's possible for unwind info to point to functions that reside in sections other than `__text`?
Yes, that's substantially correct, and I will elaborate:

`UnwindInfoSection::getSize()` must process the compact unwind entries and partition them into 4 KiB second-level pages in order to determine the final size of the `__TEXT,__unwind_info`. There are three factors that influence layout & therefore size:
  # A maximum of 1021 entries can fit into a 4 KiB page
  # The difference between address of first and last functions in a second-level page must be less than 2^24. The page will be cut short and remain under-filled when a candidate entry's 24-bit relative offset field would overflow.
  # We do a linear scan and fold adjacent entries that have the same encoding, thus laying-out fewer entries in the output than are present in the input.

As currently implemented, I use `mutable` data members to cache the necessary and substantial work done in `UnwindInfoSection::getSize() const` so that `UnwindInfoSection::writeTo() const` need not redo it.

I believe it is not strictly necessary to relocate the compact unwind entries in `getSize()`. I can probably work with the raw input sections for page partitioning and entry folding, though the code will be trickier, and I won't escape the need to relocate later in `writeTo()` for sake of LSDA and personality addresses.

I chose to use data members as cache, to relocate early, and to use `mutable` to work-around constness of `getSize()` and `writeTo()` because that results in code that does things in the proper sequence, and in the most straightforward manner for sake of clarity and efficiency.

================
Comment at: lld/MachO/UnwindInfo.h:51-66
+  mutable std::map<uint32_t, size_t> commonEncodingIndexes;
+  mutable std::vector<std::pair<compact_unwind_encoding_t, size_t>>
+      commonEncodings;
+  mutable std::vector<uint32_t> personalities;
+  mutable std::vector<unwind_info_section_header_lsda_entry> lsdaEntries;
+  mutable std::vector<compact_unwind_entry> cuVector;
+  mutable std::vector<compact_unwind_entry *> cuPtrVector;
----------------
gkm wrote:
> int3 wrote:
> > int3 wrote:
> > > gkm wrote:
> > > > int3 wrote:
> > > > > why the need to mark everything as mutable?
> > > > The work of morphing `__LD,__compact_unwind` into `__TEXT,__unwind_info` happens in `uint64_t getSize() const` and `void writeTo(uint8_t *buf) const`, which can only be logically `const`, but not physically.
> > > I would much prefer if we could have a method that explicitly computes the unwind info... but first let me see if I understand the constraints here.
> > > 
> > > From what I understand, the size of the unwind info depends on the values of `CompactUnwindEntry64::functionAddress`, or more precisely, their relative offsets. So we need to assign addresses to our functions before computing the unwind info size. After addresses are assigned, the `cuSection->writeTo()` call below will update the `functionAddress` entries to point to the final addresses, after which we can compute the needed size. Does all that sound right?
> > > 
> > > From what I can tell, ld64 calculates unwind info size based on "tentative addresses", which I think have the right relative offsets, though their final values have yet to be fixed. I wonder if we could do something similar with our current design...
> > Also -- do you know if it's possible for unwind info to point to functions that reside in sections other than `__text`?
> Yes, that's substantially correct, and I will elaborate:
> 
> `UnwindInfoSection::getSize()` must process the compact unwind entries and partition them into 4 KiB second-level pages in order to determine the final size of the `__TEXT,__unwind_info`. There are three factors that influence layout & therefore size:
>   # A maximum of 1021 entries can fit into a 4 KiB page
>   # The difference between address of first and last functions in a second-level page must be less than 2^24. The page will be cut short and remain under-filled when a candidate entry's 24-bit relative offset field would overflow.
>   # We do a linear scan and fold adjacent entries that have the same encoding, thus laying-out fewer entries in the output than are present in the input.
> 
> As currently implemented, I use `mutable` data members to cache the necessary and substantial work done in `UnwindInfoSection::getSize() const` so that `UnwindInfoSection::writeTo() const` need not redo it.
> 
> I believe it is not strictly necessary to relocate the compact unwind entries in `getSize()`. I can probably work with the raw input sections for page partitioning and entry folding, though the code will be trickier, and I won't escape the need to relocate later in `writeTo()` for sake of LSDA and personality addresses.
> 
> I chose to use data members as cache, to relocate early, and to use `mutable` to work-around constness of `getSize()` and `writeTo()` because that results in code that does things in the proper sequence, and in the most straightforward manner for sake of clarity and efficiency.
With assembler language, anything is possible. I assume one can place compiled functions in alternate sections via attributes or pragmas. Those sections might be discontiguous with `__text`, and therefore require lay-out in to distinct second-level pages. I don't know if ld64 supports non-`__text` compact unwind. I will need to conjure some test cases to see how it behaves. If ld64 doesn't handle non-`__text`, then we won't either. If it does, then I will add that later.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D86805/new/

https://reviews.llvm.org/D86805

[libcxx-commits] [PATCH] D86805: [lld-macho] create __TEXT, __unwind_info from __LD, __compact_unwind

[libcxx-commits] [PATCH] D86805: [lld-macho] create TEXT, unwind_info from LD, compact_unwind