[PATCH] D103006: [lld][MachO] Add support for LC_DATA_IN_CODE

Jez Ng via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon May 24 15:27:51 PDT 2021


int3 added inline comments.


================
Comment at: lld/MachO/SyntheticSections.cpp:611
+
+    for (size_t i = 0, n = sections.size(); i < n; ++i) {
+      const SubsectionMap &subsecMap = objFile->subsections[i];
----------------
this loop could do with a comment explaining how we are adjusting the offsets


================
Comment at: lld/MachO/SyntheticSections.cpp:619-620
+        const uint64_t endAddr = beginAddr + isec->getFileSize();
+        while (it != end && it->offset < beginAddr)
+          ++it;
+        for (; it != end && it->offset + it->length <= endAddr; ++it)
----------------
Can this be a binary search?


================
Comment at: lld/MachO/SyntheticSections.cpp:624
+              {static_cast<uint32_t>(it->offset - beginAddr +
+                                     isec->getFileOffset()),
+               it->length, it->kind});
----------------
shouldn't it be `getOffset()`? we shouldn't be mixing addresses and file offsets in the same calculation... the two can diverge (since zerofill sections have zero file size but occupy a nonzero address range). Sometimes they line up but they should be treated as effectively different types.

also we did discuss that DIC offsets probably only make sense if you assume that there's exactly one code section, but the loop here attempts to handle the case of multiple code sections. I don't think this offset calculation makes sense if there are multiple sections, since the offset is relative to the start of the OutputSection, but there's nothing in each DIC entry to say which section it belongs to. What does ld64 do here -- does it calculate offsets from the first code section?


================
Comment at: lld/MachO/SyntheticSections.cpp:625
+                                     isec->getFileOffset()),
+               it->length, it->kind});
+      }
----------------
is it possible for a DIC entry to span more than one subsection? I.e. will we ever need to update the length as well?

(if ld64 doesn't emit a reasonable answer for this case, we don't have to either)


================
Comment at: lld/MachO/SyntheticSections.h:380
 
+class DataInCodeSection : public LinkEditSection {
+public:
----------------
would be good to have a comment describing what the DIC section is about


================
Comment at: lld/test/MachO/data-in-code.s:6
+# RUN: llvm-mc -filetype=obj -triple=x86_64-apple-darwin %t/basic.s -o %t/basic.o
+# RUN: %lld %t/basic.o -o %t/basic
+# RUN: llvm-objdump --private-headers %t/basic > %t/objdump
----------------
would be good to link in a second file here that also has its own text section. That way we can check that the offset adjustments are being done correctly when `__text` gets shifted to a non-zero offset in the output


================
Comment at: lld/test/MachO/data-in-code.s:7
+# RUN: %lld %t/basic.o -o %t/basic
+# RUN: llvm-objdump --private-headers %t/basic > %t/objdump
+# RUN: llvm-objdump --macho --data-in-code %t/basic >> %t/objdump
----------------
if you just need this to capture the address of `__text`, `llvm-objdump --macho --section-headers` produces terser output


================
Comment at: lld/test/MachO/data-in-code.s:22
+
+# BASIC-LABEL:  Data in code table (1 entries)
+# BASIC-NEXT:   offset length kind
----------------
would be nice to have more than one entry


================
Comment at: lld/test/MachO/data-in-code.s:40
+#--- basic.s
+.section __TEXT,__text,regular,pure_instructions
+.globl _main
----------------
I believe this is equivalent


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D103006/new/

https://reviews.llvm.org/D103006



More information about the llvm-commits mailing list