[PATCH] D152921: [lld] Synthesize metadata for MTE globals

Peter Smith via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Jun 21 11:15:26 PDT 2023


peter.smith added a comment.

I've gone through all of the LLD changes now.



================
Comment at: lld/ELF/Driver.cpp:2564
+  const RelsOrRelas<ELFT> rels = sec->relsOrRelas<ELFT>();
+  assert(!rels.areRelocsRel() &&
+         "non-RELA relocations are not allowed with memtag globals");
----------------
I think Fuchsia uses REL dynamic relocations with `-z rel` , may be worth making this a user visible error message as this could happen with user input.  


================
Comment at: lld/ELF/Driver.cpp:2915
+    DenseMap<Symbol *, unsigned> taggedSymbolReferenceCount;
+    for (InputFile* file : files) {
+      // TODO(hctim): Add bitcode (LTO) support.
----------------
At this stage can you use ctx.objectFiles this seems to be more idiomatic than processing the files directly.


================
Comment at: lld/ELF/Driver.cpp:2921
+      for (InputSectionBase *section : file->getSections()) {
+        if (section == nullptr) continue;
+        if (section->type != SHT_AARCH64_MEMTAG_GLOBALS_STATIC) continue;
----------------
I think clang-format will prefer
```
  if (section == nullptr)
    continue;
```
It is coming out all as one line on my screen, apologies if that is just Phabricator.
May want to combine the two if statements into one. Also LLD tends to use !section
```
  if (!section || section->type != SHT_AARCH64_MEMTAG_GLOBALS_STATIC)
    continue;
```

Given that this is after Garbage Collection I think you'll need to check if s== &InputSection::discarded.


================
Comment at: lld/ELF/Driver.cpp:2928
+
+    // Now, go through all the symbols. If the number of declarations +
+    // definitions to a symbol exceeds the amount of times they're marked as
----------------
Is it ever possible to get a case where a STB_LOCAL binding symbol definition is incorrectly tagged? For example all references to it should be within the same TU, which should share the same command line options. It is possible some function attribute might prevent this though.

If it isn't possible then you might be able to just iterate the global symbols.


================
Comment at: lld/ELF/Driver.cpp:2939
+      for (Symbol *symbol : file->getSymbols()) {
+        if (symbol == nullptr) continue;
+        auto it = taggedSymbolReferenceCount.find(symbol);
----------------
I'm pretty sure this isn't necessary, I've not seen any other code that checks. I guess it might be possible that a broken object file might contain a 0 index for a symbol, but if it does LLD will already have crashed by now.


================
Comment at: lld/ELF/SyntheticSections.cpp:3898
+// it to `*(buf + offset)` if `buf` is non-null.
+static size_t computeULEB128(uint64_t v, uint8_t *buf, size_t offset) {
+  if (buf)
----------------
Although not ideal either, would writeULEB128 be a better name as compute implies that it is just computing the value and not writing it?

I guess computeOrWrite is the most descriptive but possibly a bit long.


================
Comment at: lld/ELF/SyntheticSections.cpp:3911
+  // happens before the VA is assigned.
+  std::sort(symbols->begin(), symbols->end(),
+            [](const Symbol *s1, const Symbol *s2) {
----------------
Although it may not be worth dealing with, I think you could pull this out into a separate function that you could call after the first address assignment in finalizeAddressDependentContent, as by that time the relative order of the symbols would be fixed.


================
Comment at: lld/ELF/SyntheticSections.cpp:3939
+
+    const uint64_t sizeToEncode = size / kMemtagGranuleSize;
+    const uint64_t stepToEncode = ((vAddr - lastGlobalEnd) / kMemtagGranuleSize) << kMemtagStepSizeBits;
----------------
One possibility would be to have an internal buffer that could cache the results so that writeTo would just need to write it out. I guess the problem is that we won't know how big to make it, so I guess it would need to be a self expanding buffer.

Not sure if it is worth doing, as it might end up more complicated than it is already. 


================
Comment at: lld/ELF/SyntheticSections.cpp:3953
+
+// Some callers can't allow in-place modification of *symbols. In these cases,
+// make a copy, to allow those who are okay with in-place to avoid it.
----------------
By caller's do you mean call-sites?





================
Comment at: lld/ELF/SyntheticSections.cpp:3967
+
+size_t MemtagDescriptors::getSize() const {
+  return createMemtagDescriptors(&symbols);
----------------
At first glance I thought that the contents of the MemtagDescriptors would be stable and could be cached, but I think the uleb encoding could change depending on the addresses so it will need recalculating.

If I'm right it could be worth adding a comment.


================
Comment at: lld/ELF/SyntheticSections.h:1234
+  void addSymbol(const Symbol *sym) {
+    if (sym && sym->isDefined())
+      symbols.push_back(sym);
----------------
Looks like the call site has this check before calling addSymbol. Is it worth making that an assert instead?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D152921/new/

https://reviews.llvm.org/D152921



More information about the llvm-commits mailing list