[PATCH] D152921: [lld] Synthesize metadata for MTE globals

Mitch Phillips via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Jun 29 06:02:40 PDT 2023


hctim marked 16 inline comments as done.
hctim added inline comments.


================
Comment at: lld/ELF/Arch/AArch64.cpp:668
   Symbol &sym = *adrpRel.sym;
+  // Tagged symbols have upper address bits that are added by the dynamic
+  // loader, and thus need the full 64-bit GOT entry. Do not relax such symbols.
----------------
peter.smith wrote:
> IIUC this will change an ADRP, ADD into an ADR, which is just a shorter ranged ADRP/ADD. This relaxation doesn't change the destination address like tryRelaxAdrpLdr might.
> 
> . 
yep, good point, removed


================
Comment at: lld/ELF/Arch/AArch64.cpp:780
     switch (rel.expr) {
     case R_AARCH64_GOT_PAGE_PC:
       if (i + 1 < size &&
----------------
peter.smith wrote:
> To help prevent future relaxations from not considering MTE, It may be worth moving the sym.isTagged() to here with a general condition similar to needsGot in relocations.cpp.
> 
> For example don't relax if the symbol is tagged and is relative to a GOT reference.
> 
> Alternatively could be worth a comment in the Relaxer class to highlight the restriction.
Done, moved it to a general conditional branch.


================
Comment at: lld/ELF/Driver.cpp:354
-  
+
   if (config->tocOptimize && config->emachine != EM_PPC64)
     error("--toc-optimize is only supported on PowerPC64 targets");
----------------
peter.smith wrote:
> Not sure what this difference is from. If it has come from clang-format it may be worth NFC committing it separately to avoid it getting associated with this patch.
ack, removed from diff


================
Comment at: lld/ELF/Driver.cpp:757
 
-  if (!config->androidMemtagHeap && !config->androidMemtagStack) {
-    error("when using --android-memtag-mode, at least one of "
----------------
peter.smith wrote:
> To check my understanding; this is being deleted because protecting globals is independent of heap and stack tagging.
Yeah, that's correct. We want to allow `--android-memtag-mode=sync` without requiring heap/stack, as we want to enable building globals-only binaries.


================
Comment at: lld/ELF/Driver.cpp:2928
+
+    // Now, go through all the symbols. If the number of declarations +
+    // definitions to a symbol exceeds the amount of times they're marked as
----------------
peter.smith wrote:
> Is it ever possible to get a case where a STB_LOCAL binding symbol definition is incorrectly tagged? For example all references to it should be within the same TU, which should share the same command line options. It is possible some function attribute might prevent this though.
> 
> If it isn't possible then you might be able to just iterate the global symbols.
Ooh, nice tip! I've made `addTaggedSymbolReferences` mark them immediately as tagged.


================
Comment at: lld/ELF/Driver.cpp:2939
+      for (Symbol *symbol : file->getSymbols()) {
+        if (symbol == nullptr) continue;
+        auto it = taggedSymbolReferenceCount.find(symbol);
----------------
peter.smith wrote:
> I'm pretty sure this isn't necessary, I've not seen any other code that checks. I guess it might be possible that a broken object file might contain a 0 index for a symbol, but if it does LLD will already have crashed by now.
Done. I swear I added this for a reason, but I'll rebuild Android and check.


================
Comment at: lld/ELF/SyntheticSections.cpp:3898
+// it to `*(buf + offset)` if `buf` is non-null.
+static size_t computeULEB128(uint64_t v, uint8_t *buf, size_t offset) {
+  if (buf)
----------------
peter.smith wrote:
> Although not ideal either, would writeULEB128 be a better name as compute implies that it is just computing the value and not writing it?
> 
> I guess computeOrWrite is the most descriptive but possibly a bit long.
let's go with `computeOrWriteULEB128`


================
Comment at: lld/ELF/SyntheticSections.cpp:3911
+  // happens before the VA is assigned.
+  std::sort(symbols->begin(), symbols->end(),
+            [](const Symbol *s1, const Symbol *s2) {
----------------
peter.smith wrote:
> Although it may not be worth dealing with, I think you could pull this out into a separate function that you could call after the first address assignment in finalizeAddressDependentContent, as by that time the relative order of the symbols would be fixed.
done (glued it up inside finalizeAddressDependentContent)


================
Comment at: lld/ELF/SyntheticSections.cpp:3939
+
+    const uint64_t sizeToEncode = size / kMemtagGranuleSize;
+    const uint64_t stepToEncode = ((vAddr - lastGlobalEnd) / kMemtagGranuleSize) << kMemtagStepSizeBits;
----------------
peter.smith wrote:
> One possibility would be to have an internal buffer that could cache the results so that writeTo would just need to write it out. I guess the problem is that we won't know how big to make it, so I guess it would need to be a self expanding buffer.
> 
> Not sure if it is worth doing, as it might end up more complicated than it is already. 
now that we're not sorting on the size call (and only doing it in the finalization of the address dependent content), less worth doing anything this complicated.


================
Comment at: lld/ELF/SyntheticSections.cpp:3953
+
+// Some callers can't allow in-place modification of *symbols. In these cases,
+// make a copy, to allow those who are okay with in-place to avoid it.
----------------
peter.smith wrote:
> By caller's do you mean call-sites?
> 
> 
> 
this is gone with the new glue around finalizeAddressDependentContent


================
Comment at: lld/ELF/SyntheticSections.cpp:3967
+
+size_t MemtagDescriptors::getSize() const {
+  return createMemtagDescriptors(&symbols);
----------------
peter.smith wrote:
> At first glance I thought that the contents of the MemtagDescriptors would be stable and could be cached, but I think the uleb encoding could change depending on the addresses so it will need recalculating.
> 
> If I'm right it could be worth adding a comment.
done, in the header


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D152921/new/

https://reviews.llvm.org/D152921



More information about the llvm-commits mailing list