[PATCH] D131589: [llvm-objdump] Handle multiple syms at same addr in disassembly.

James Henderson via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Aug 11 01:30:51 PDT 2022


jhenderson added a comment.

When adding new tool options, please make sure to update the CommandGuide at llvm/docs/CommandGuide for the tool.



================
Comment at: llvm/tools/llvm-objdump/llvm-objdump.cpp:1494-1497
+      {
+        while (SI != SE && Symbols[SI].Addr == Start)
+          SI++;
+        SymbolsHere = ArrayRef<SymbolInfoTy>(&Symbols[FirstSI], SI - FirstSI);
----------------
rafauler wrote:
> Curious about the style here where we define a scope to isolate a piece of computation. I don't think I've seen this in other LLVM files or in the coding style, so I'm curious what other people think about it.
> 
> I do think it makes the code more clear to understand here, because it's only 3 lines. But on the other hand, it increases nesting on line 1509, which can actually make code harder to read (arguably).
I'm not sure this style is used in LLVM, or at least not in the areas I'm familiar with, so I'd drop it.


================
Comment at: llvm/tools/llvm-objdump/llvm-objdump.cpp:1496
+        while (SI != SE && Symbols[SI].Addr == Start)
+          SI++;
+        SymbolsHere = ArrayRef<SymbolInfoTy>(&Symbols[FirstSI], SI - FirstSI);
----------------
Nit: I think we prefer preincrement to post.


================
Comment at: llvm/tools/llvm-objdump/llvm-objdump.cpp:1501
+      // Get the demangled names of all those symbols.
+      std::vector<std::string> SymNamesHere;
+      for (const SymbolInfoTy &Symbol : SymbolsHere) {
----------------
Can this be a vector of `StringRef` to save the copying?


================
Comment at: llvm/tools/llvm-objdump/llvm-objdump.cpp:1511
+        // If the user has given the --disassemble-symbols option, then we must
+        // display every symbol in that set (that we can find at all), and no
+        // others.
----------------
"(that we can find at all)" - I'm not sure what this is saying. Do you even need it?


================
Comment at: llvm/tools/llvm-objdump/llvm-objdump.cpp:1533
+
+        // Now that we know we're disassembling this section at all, override
+        // the choice of which symbols to display by printing _all_ of them a
----------------



================
Comment at: llvm/tools/llvm-objdump/llvm-objdump.cpp:1534
+        // Now that we know we're disassembling this section at all, override
+        // the choice of which symbols to display by printing _all_ of them a
+        // this address if the user asked for all symbols.
----------------



================
Comment at: llvm/tools/llvm-objdump/llvm-objdump.cpp:1537-1540
+        // (That way, '--show-all-symbols --disassemble-symbol=foo' will print
+        // only the //chunk of code// headed by 'foo', but also show any other
+        // symbols defined at that address, such as aliases for 'foo', or the
+        // Arm mapping symbol preceding its code.)
----------------
Not sure you need the parentheses, or the "//" emphasis marks. Also, "Arm" should be "ARM".


================
Comment at: llvm/tools/llvm-objdump/llvm-objdump.cpp:1542
+        if (ShowAllSymbols) {
+          for (unsigned i = 0; i < SymbolsHere.size(); ++i)
+            SymsToPrint[i] = true;
----------------
`size_t` would be the more natural type for loops over `SymbolsHere`, since that's the value returned by `size()` and used by the index operator.

Ditto below.


================
Comment at: llvm/tools/llvm-objdump/llvm-objdump.cpp:1556-1557
       uint64_t End = std::min<uint64_t>(SectionAddr + SectSize, StopAddress);
-      if (SI + 1 < SE)
-        End = std::min(End, Symbols[SI + 1].Addr);
+      if (SI < SE)
+        End = std::min(End, Symbols[SI].Addr);
       if (Start >= End || End <= StartAddress)
----------------
It's not immediately obvious to me why this line has changed. Could you explain please, as it likely means I've missed something.


================
Comment at: llvm/tools/llvm-objdump/llvm-objdump.cpp:1578
+        const SymbolInfoTy &Symbol = SymbolsHere[i];
+        const std::string &SymbolName = SymNamesHere[i];
+
----------------
`StringRef`?


================
Comment at: llvm/tools/llvm-objdump/llvm-objdump.cpp:1655
       if (Obj.isELF() && !DisassembleAll && Section.isText()) {
-        uint8_t SymTy = Symbols[SI].Type;
-        if (SymTy == ELF::STT_OBJECT || SymTy == ELF::STT_COMMON) {
-          dumpELFData(SectionAddr, Index, End, Bytes);
-          Index = End;
+        for (const SymbolInfoTy &Symbol : SymbolsHere) {
+          uint8_t SymTy = Symbol.Type;
----------------
Is there a subtle behaviour change here if you have multiple symbols at the same address but different types (i.e. one is STT_OBJECT and one isn't, e.g. STT_FUNC)?


================
Comment at: llvm/tools/llvm-objdump/llvm-objdump.cpp:1665-1671
+      bool CheckARMELFData = false;
+      if (hasMappingSymbols(Obj) && !DisassembleAll) {
+        CheckARMELFData = true;
+        for (const SymbolInfoTy &Symbol : SymbolsHere)
+          if (Symbol.Type == ELF::STT_OBJECT)
+            CheckARMELFData = false;
+      }
----------------
Same comment as above.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D131589/new/

https://reviews.llvm.org/D131589



More information about the llvm-commits mailing list