[PATCH] D128108: [lld-macho] Add support for objc_msgSend stubs

Jez Ng via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Aug 10 03:04:59 PDT 2022


int3 accepted this revision.
int3 added a comment.
This revision is now accepted and ready to land.

lgtm, thanks!



================
Comment at: lld/MachO/Arch/ARM64.cpp:114-116
+    0xd4200020, // brk   #0
+    0xd4200020, // brk   #0
+    0xd4200020, // brk   #0
----------------
keith wrote:
> int3 wrote:
> > What's the purpose of these `brk #0` instructions? Also, any idea why they are rendered as `brk #0x1` in the `arm64-objcstubs.s` test?
> The comment here was outdated, fixed to show `0x1`. I'm not sure why there are extras here, maybe some alignment concern? I wasn't really sure if the right path is to copy ld64, or just ignore unless we hit an issue. I haven't tested without them.
ah gotcha. yeah let's just emulate ld64


================
Comment at: lld/MachO/SyntheticSections.cpp:726
+
+void ObjCStubsSection::addEntry(Symbol *sym) {
+  // Ensure our lookup string has the length of the actual string + the null
----------------
how about asserting that the symbol name `startswith()` the symbol prefix here, just for clarity?


================
Comment at: lld/MachO/SyntheticSections.cpp:1637
+  assert(offset != stringOffsetMap.end() &&
+         "Looked up strings should always existed in section");
+  return offset->second;
----------------



================
Comment at: lld/MachO/SyntheticSections.cpp:731
+  std::string methname = sym->getName().drop_front(symbolPrefixLength).str();
+  methname.push_back('\0');
+  auto map = in.objcMethnameSection->getDuplicateStringOffsetMap();
----------------
keith wrote:
> int3 wrote:
> > keith wrote:
> > > int3 wrote:
> > > > isn't the string already null-terminated? why do we have to add another `\0`?
> > > I didn't dig into this, but this is how I got here:
> > > 
> > > In `CStringSection`:
> > > 
> > > ```
> > >       StringRef string = isec->getStringRef(i);
> > >       llvm::outs() << "'" << string << "'" << " len: " << string.size() << "\n";
> > > ```
> > > 
> > > This prints: `'foo' size 4`.
> > > 
> > > In `ObjCStubsSection::addEntry`
> > > 
> > > ```
> > >   llvm::outs() << "'" << sym->getName() << "' size " << sym->getName().size() << "\n";
> > >   std::string methname = sym->getName().drop_front(symbolPrefixLength).str();
> > > ```
> > > 
> > > This prints `'foo' size 3`.
> > could you dig into this? I would really prefer we avoid copying the string if possible
> I pushed a new option here which I believe avoids the copies. The core issue is that here https://github.com/llvm/llvm-project/blob/1b349bdaaa540e97d90318c3706527f6ca084987/lld/MachO/InputSection.h#L206-L212 we manually construct a StringRef with the actual data, and with the length of the actual data + 1, since `pieces[i + 1].inSecOff` is after the null terminator. But any other StringRef initialization, like this one https://github.com/llvm/llvm-project/blob/1b349bdaaa540e97d90318c3706527f6ca084987/lld/MachO/InputFiles.cpp#L943 goes through an initializer that computes the length, which naturally excludes the null terminator https://github.com/llvm/llvm-project/blob/1b349bdaaa540e97d90318c3706527f6ca084987/llvm/include/llvm/ADT/StringRef.h#L83-L85
> 
> I'm now mirroring this StringRef initialization, but maybe we should correct that string length off by 1 instead?
ooh. Yeah we should fix that off-by-one. I'm fine if you want to do it in a separate commit though


================
Comment at: lld/test/MachO/arm64-objc-stubs.s:40-43
+selref1:
+  .asciz  "foo"
+selref2:
+  .asciz  "bar"
----------------
make these symbols private `lselref*`s too?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D128108/new/

https://reviews.llvm.org/D128108



More information about the llvm-commits mailing list