[PATCH] D76839: [lld-macho] Extend SyntheticSections to cover all segment load commands

Shoaib Meenai via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Apr 8 21:35:02 PDT 2020


smeenai added a comment.

Nice, I *really* like this design of representing all these special segments as synthetic sections.

@ruiu, how does this look to you now?



================
Comment at: lld/MachO/InputSection.h:38
+  uint64_t getFileOffset() const;
+  // Whether to emit a section_64 header for this section.
+  virtual bool isHidden() const { return false; }
----------------
isHidden implies you *don't* emit the header, right? I'd clarify that in the comment.


================
Comment at: lld/MachO/InputSection.h:40
+  virtual bool isHidden() const { return false; }
+  // Whether to omit this section entirely (header and body).
+  virtual bool isNeeded() const { return true; }
----------------
Same here, since you *wouldn't* omit it if isNeeded.


================
Comment at: lld/MachO/OutputSegment.cpp:22
+  if (name == segment_names::text)
+    return VM_PROT_READ | VM_PROT_EXECUTE;
+  if (name == segment_names::pageZero)
----------------
Same comment here about logic based on section names, though I'm not sure if doing it differently would end up cleaner.


================
Comment at: lld/MachO/OutputSegment.cpp:40
+  if (vec.empty() && !isec->isHidden()) {
+    ++numNonHiddenSections;
+  }
----------------
Hmm, what's up with the `vec.empty()` condition? Doesn't that mean `numNonHiddenSections` will only ever be 0 or 1, depending on if the first section `addSection` was called on for a particular segment was hidden or not?


================
Comment at: lld/MachO/OutputSegment.h:35
+  bool isNeeded() const {
+    return sections.size() != 0 || name == segment_names::linkEdit;
+  }
----------------
ruiu wrote:
> nit: `!sections.empty()` is more conventional.
Looks like this one still needs to be addressed?


================
Comment at: lld/MachO/OutputSegment.h:35
+  bool isNeeded() const {
+    return sections.size() != 0 || name == segment_names::linkEdit;
+  }
----------------
ld64 commonly checks segment or section names and takes particular actions based on that name. I'm not a fan of that design, personally, since it ties a lot of logic to those names. Having named constants for the names is a big improvement already, but I'm wondering if there's any way to design this that would sidestep the issue altogether.

For example, we could have a property of an OutputSegment like `alwaysNeeded`, which would be set for the linkEdit segment at its point of creation. Alternatively, we could have a LinkEditSegment override of OutputSegment which overrides `isNeeded`. The point would be to try to represent properties of the segment as attributes or subclasses, instead of computing them based on the name.

I'm not sure how that'd end up looking in practice though, or if it's worth it in all cases. @int3, @ruiu, what do you think?


================
Comment at: lld/MachO/SyntheticSections.cpp:157
+uint32_t StringPoolSection::addString(StringRef str) {
+  uint32_t strx = poolSize;
+  pool.push_back(str);
----------------
`index` might be clearer than `strx`.


================
Comment at: lld/MachO/SyntheticSections.h:35
+// The header of the Mach-O file, which must have a file offset of zero.
+class MachHeaderSection : public InputSection {
+public:
----------------
Shouldn't all of these be inheriting from SyntheticSection?


================
Comment at: lld/MachO/SyntheticSections.h:116
+// Stores the strings referenced by the symbol table.
+class StringPoolSection : public InputSection {
+public:
----------------
How come you're calling this the StringPoolSection instead of the StringTableSection? I thought string table was the standard terminology.


================
Comment at: lld/MachO/Writer.cpp:259-261
+    return linkEditOffset + 1;
+  else
+    return linkEditOffset;
----------------
ruiu wrote:
> int3 wrote:
> > ruiu wrote:
> > > So it looks like the function returns only 4 possible values -- 0, 1, linkEditOffset, or linkEditOffset+1. Can't you use 0, 1, 2, 3 instead?
> > The idea is to make it easily extensible -- there may be other special sections that we want to order before the __LINKEDIT sections.
> Got it, but I don't think you have to prepare this tiny function for future extension. You can just make it return 0,1,2,3 and then in a later change when you actually need to use linkEditOffset, you can start returning linkEditOffset and linkEditOffset+1.
Looks like this comment still needs a response.


================
Comment at: lld/MachO/Writer.cpp:127
+        c->filesize += isec->getFileSize();
+      if (sections[0]->isHidden()) {
+        continue;
----------------
Nit: drop the braces.


================
Comment at: lld/MachO/Writer.h:17
 
+class LoadCommand {
+public:
----------------
int3 wrote:
> Had to move this to the header so that MachHeaderSection's implementation in SyntheticSections.cpp could access it. That said, I'm a bit undecided on whether it really makes sense for things like `MachHeaderSection` to be there -- while it technically is a synthetic section, if it only gets used by one component of the linker (i.e. in Writer.cpp), maybe it should be defined there?
I think either design is justifiable ... we can just leave them where they are for now and see how things shake out when we have more of the linker implemented.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D76839/new/

https://reviews.llvm.org/D76839





More information about the llvm-commits mailing list