[PATCH] D77893: [lld] Merge Mach-O input sections

Kellie Medlin via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Apr 30 23:45:49 PDT 2020


Ktwu marked 11 inline comments as done.
Ktwu added inline comments.


================
Comment at: lld/MachO/MergedOutputSection.h:19
+
+/**
+ * Linking multiple files will inevitably mean resolving sections in different
----------------
smeenai wrote:
> The comment looks great!
> 
> Super nit: LLD always formats these types of comments using single-line comments (`//`) instead of `/* */`, so we should follow suit here.
D:


================
Comment at: lld/MachO/OutputSection.cpp:22
+void OutputSection::mergeInput(InputSection *input) {
+  error("Section " + name + " in segment " + parent->name +
+        " cannot be merged");
----------------
smeenai wrote:
> We should only hit this if we have a programming error on our end, right? If so, this should be `llvm_unreachable` instead of `error`.
Yup, since we're assuming that synthetic sections cannot be merged, this shouldn't happen.


================
Comment at: lld/MachO/OutputSegment.h:73
+    if (it == orderMap.end()) {
+      return &orderMap.find(defaultPosition)->second;
+    }
----------------
smeenai wrote:
> Ktwu wrote:
> > smeenai wrote:
> > > Given that this might be the common case, would it make sense to cache the `find` somehow?
> > Good idea!
> This one isn't addressed, but given that we shouldn't have too many segments (besides the ones already in the map, I can only think of `__DATA`), perhaps this is okay as-is?
Ah, no, I legit forgot to address this, so I don't mind getting to it...


================
Comment at: lld/MachO/SyntheticSections.cpp:70
 
-GotSection::GotSection() {
-  segname = "__DATA_CONST";
-  name = "__got";
+GotSection::GotSection() : SyntheticSection("__DATA_CONST", "__got") {
   align = 8;
----------------
int3 wrote:
> smeenai wrote:
> > @int3 how come these strings are just here directly vs. all the other ones being named constants?
> > 
> > Also, not this diff, but is `__DATA_CONST` correct? ld64 puts this in `__DATA` instead as far as I can see. It's not quite constant cos the dynamic linker's gonna fill it in, though idk if it has the equivalent to ELF's RELRO.
> I don't think we're currently referencing this from anywhere else (in particular I didn't give them an order in the sorting comparator), so it wasn't technically necessary, though we could definitely make them constants for the sake of uniformity
I'll make 'em a constant for now; can we deal with __DATA vs __DATA_CONST in another diff if need be?


================
Comment at: lld/test/MachO/section-merge.s:15
+# CHECK:          Name: _goodbye_world
+# CHECK-NEXT:     Extern
+# CHECK-NEXT:     Type: Section (0xE)
----------------
int3 wrote:
> smeenai wrote:
> > We only need to check the properties we care about. Detailed symbol table checking should happen in the symbol table tests. Over here, I think we just care about the symbol name, section, and value, so we can drop the checks for the other fields (Extern, Type, RefType, Flags)
> > 
> > We should also be checking for the text section symbols.
> +1 for terser checks. We can also use `llvm-objdump --syms` here -- its output is much more compact and suitable for when we're not checking all the properties
llvm-objdump --syms is pretty nice, so I'll use that here instead.


================
Comment at: lld/test/MachO/section-merge.s:45
+# DATA:           SectionData (
+# DATA-NEXT:        0000: 48C7C001 000000C3 48C7C000 000000C3
+# DATA-NEXT:      )
----------------
smeenai wrote:
> Ktwu wrote:
> > smeenai wrote:
> > > Might be easier to make sense of this as assembly (`llvm-objdump -d`)
> > Is this a big deal? I compared this output to ld; I don't care about the content so much as making sure it matches what ld outputs.
> I think it makes the test a lot more intelligible and easy to verify. Right now, for me, this is just a blob of bytes. If it were written out as instructions, I could verify that it's the instructions in `_some_function` followed by the instructions in `_main` (as it should be).
> 
> In general, matching ld64's output is important, but we also want our tests to work well standalone. The cstring check below is great because it's easy to tell at a glance that all the strings from the various input files are being combined together (as they should be); using the disassembly will let us do the same for the text section.
Fair enough!


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D77893/new/

https://reviews.llvm.org/D77893





More information about the llvm-commits mailing list