[PATCH] D86805: [lld-macho] create __TEXT,__unwind_info from __LD,__compact_unwind
Jez Ng via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Tue Sep 8 10:56:43 PDT 2020
int3 added a comment.
Having things in `finalize()` indeed looks a lot cleaner :D
================
Comment at: lld/MachO/SyntheticSections.h:22
#include "llvm/Support/raw_ostream.h"
+#include <map>
----------------
this include doesn't seem necessary
================
Comment at: lld/MachO/UnwindInfo.cpp:91-92
+ if (targetIsec->getVA() == 0) {
+ error(Twine("__compact_unwind relocation is unresolved against " +
+ targetIsec->segname + "," + targetIsec->name));
+ }
----------------
is the `Twine()` call here necessary? I would expect the concatenated StringRefs to already be a Twine
================
Comment at: lld/MachO/UnwindInfo.cpp:107
+ cuPtrVector.reserve(cuCount);
+ std::map<compact_unwind_encoding_t, size_t> encodingFrequencies;
+ for (size_t i = 0; i < cuCount;) {
----------------
ditto, avoid `<map>`
================
Comment at: lld/MachO/UnwindInfo.cpp:120
+ // make a table of common encodings, and sort by descending frequency
+ for (const auto &frequency : encodingFrequencies) {
+ commonEncodings.emplace_back(frequency);
----------------
codebase convention is to avoid curly braces for one-liner blocks
================
Comment at: lld/MachO/UnwindInfo.cpp:127
+ if (a.second == b.second)
+ // When freqnecies match, secondarily sort on encoding
+ // to maintain parity with validate-unwind-info.py
----------------
frequencies
================
Comment at: lld/MachO/UnwindInfo.cpp:147
+ // Record the page splits as a vector of iterators on cuPtrVector
+ // such that succesive elements form a semi-open interval. E.g.,
+ // page X's bounds are thus: [ pageBounds[X] .. pageBounds[X+1] )
----------------
successive
================
Comment at: lld/MachO/UnwindInfo.cpp:150
+ //
+ // Note that pageBouinds.size() is one greater than the number of
+ // pages, where pageBounds.cend() holds the sentinel cuPtrVector.cend()
----------------
pageBounds
================
Comment at: lld/MachO/UnwindInfo.cpp:155
+ for (size_t i = 0;;) {
+ // Limit the search to entries that can fit witin a 4 KiB page.
+ std::vector<const CompactUnwindEntry64 *>::const_iterator it0 =
----------------
within
================
Comment at: lld/MachO/UnwindInfo.cpp:156-158
+ std::vector<const CompactUnwindEntry64 *>::const_iterator it0 =
+ pageBounds[0] + i;
+ std::vector<const CompactUnwindEntry64 *>::const_iterator itN =
----------------
nit 1: the type here is pretty verbose, a type alias would be nice (`auto` would be acceptable here too I think)
nit 2: `it0` and `itN` aren't the most descriptive names... something like `intervalStart` and `intervalMax` would be nice
================
Comment at: lld/MachO/UnwindInfo.cpp:167-168
+ std::lower_bound(
+ it0, itN, nullptr,
+ [functionAddressMax](const CompactUnwindEntry64 *a, ...) {
+ return a->functionAddress < functionAddressMax;
----------------
why not pass `functionAddressMax` in as the 3rd parameter to `lower_bound`?
edit: oh, because the return type isn't the container element type. would be good to have a comment :)
================
Comment at: lld/MachO/UnwindInfo.cpp:193-196
+ // relocate __compact_unwind entries into cuBuf
+ if (getSize() == 0)
+ return;
+
----------------
can be rm'ed
================
Comment at: lld/MachO/UnwindInfo.cpp:211
+ // Common encodings
+ auto *i32p = reinterpret_cast<uint32_t *>(&uip[1]);
+ for (const auto &encoding : commonEncodings) {
----------------
would be good to have a comment to the effect that there are a bunch of integer arrays immediately after the section header (I had to pause and think about what `&uip[1]` meant)
================
Comment at: lld/MachO/UnwindInfo.cpp:257
+ for (size_t i = 0; i < pageBounds.size() - 1; i++) {
+ p2p->kind = 3;
+ p2p->entryPageOffset =
----------------
what does a `kind` of 3 indicate?
================
Comment at: lld/MachO/UnwindInfo.h:49
+private:
+ std::map<uint32_t, size_t> commonEncodingIndexes;
+ std::vector<std::pair<compact_unwind_encoding_t, size_t>> commonEncodings;
----------------
try to avoid `<map>` if possible: https://llvm.org/docs/ProgrammersManual.html#map
also the key here should be `compact_unwind_encoding_t` to be consistent :)
================
Comment at: lld/MachO/UnwindInfo.h:50
+ std::map<uint32_t, size_t> commonEncodingIndexes;
+ std::vector<std::pair<compact_unwind_encoding_t, size_t>> commonEncodings;
+ std::vector<uint32_t> personalities;
----------------
nit: I'd prefer structs over pairs for readable field names
================
Comment at: lld/MachO/Writer.cpp:413-414
+ .Case(segment_names::linkEdit, 100)
+ // __LD,__compact_unwind isn't an output segment, so shove it past the end
+ .Case(segment_names::ld, 101)
.Default(0);
----------------
gkm wrote:
> int3 wrote:
> > This seems a bit hacky. Do we really have to put the `__compact_unwind` sections in an OutputSegment? If we're not going to output them, could we just put all these InputSections in a vector contained in `UnwindInfoSection`?
> >
> > Also, we may want to support emitting object files at some point via `-r`, in which case we'll actually want to emit the `__LD` segment at a valid position.
> I agree that it could use some improvement. I will revisit.
>
> Behaviorally, `__LD,__compact_unwind` is a `MergedOutputSegment`, except with different timing and destination: we relocate & write it to a temp buffer from `Writer::assignAddresses()`) rather than writing to the final link output from `Writer::writeSections()`. We need to relocate it fully in order to compute the size of the `__TEXT,__unwind_info` section when laying-out sections & binding to addresses.
>
> You are quite right regarding `-r`, which makes `__LD,__compact_unwind` a completely normal `MergedOutputSegment`.
I think it might be worthwhile to duplicate some of the code in `InputSection::writeTo` to specialize it for the compact unwind case. The current `writeTo` code assumes that both the source and target sections referenced by a relocation have their final output addresses determined. But this is not true for the compact unwind section -- it doesn't have a valid output address. It just happens to work because there are no pcrel relocations involved. We should really error out if we see a pcrel relocation while relocating `__compact_unwind`.
Creating a separate compact unwind relocating method would free us from having to create `Output{Segments, Sections}` that we discard later. I'm hoping we can retain the invariant where we add OutputSections to OutputSegments only after we have determined that they are needed.
================
Comment at: lld/test/MachO/compact-unwind.test:3-5
+# FIXME(gkm): This test is fast on a Release tree, and slow (~10s) on
+# a Debug tree mostly because of llvm-mc. Is there a way to prefer the
+# fast installed llvm-mc rather than the slow one in our Debug tree?
----------------
My understanding of the need for generate-cfi-funcs.py is that some test cases need to be very large in order to exercise the edge cases in the implementation. In addition to providing that functionality, this script also serves as a fuzzer by generating random test cases. Would it be possible for the unit test to instead enumerate a small number of cases that cover all important code paths, in order that we run fewer test cases?
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D86805/new/
https://reviews.llvm.org/D86805
More information about the llvm-commits
mailing list