[PATCH] D116787: [llvm-readobj][MachO] Add option to sort the symbol table before dumping (MachO only, for now).

James Henderson via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Fri Feb 25 02:40:21 PST 2022


jhenderson added a comment.

I feel like there's an awful lot more code than is required for this patch. I stopped commenting on inidividual things partway through. I think the following outline should be sufficient. It probably isn't 100% correct, but you should be able to get the general idea from it.

  // This code could all live in generic area, since this is generic behaviour.
  bool compareSymName(SymbolRef LHS, SymbolRef RHS) {
    // Implementation left as an exercise for the reader. In essence:
    // return LHS.Name < RHS.Name
  }
  
  bool compareSymType(SymbolRef LHS, SymbolRef RHS) {
    // Implementation left as an exercise for the reader. In essence:
    // return LHS.Type < RHS.Type
  }
  
  class SymbolComparer {
  public:
    using ComparePred = function_ref<bool(SymbolRef, SymbolRef)>;
    void add(ComparePred Pred) { Predicates.push_back(Pred); }
  
    bool operator()(SymbolRef LHS, SymbolRef RHS) {
      for(ComparePred Pred : Predicates) {
        if (Pred(LHS, RHS))
          return true;
        if (Pred(RHS, LHS))
          return false;
      }
      // All considered parameters are equal. This means that a SymbolComparer
      // taking an empty vector in the constructor will treat all symbols as equal.
      return false;
    }
  
  private:
    SmallVector<ComparePred, 2> Predicates;
  };
  
  // Code in MachODumper.cpp, possibly even other files too.
  void MachODumper::printSymbols(const SymbolComparer &SymCmp) {
    ListScope Group(W, "Symbols");
    auto SymbolRange = Obj->symbol();
    std::vector<SymbolRef> SortedSymbols(SymbolRange.begin(), SymbolRange.end());
    stable_sort(SortedSymbols, SymCmp);
    for (SymbolRef Symbol : SortedSymbols)
      printSymbol(Symbol);
  }
  
  // In main
  SymbolComparer SymCmp;
  if (Arg *A = Args.getLastArg(OPT_sort_symbols_EQ)) {
    const StringMap <SymbolComparer::ComparePred> KeyToPredMap =
      {{"name", compareSymName}, {"type", compareSymType}};
    for (StringRef KeyStr : llvm::split(A->getValue(), ",")) {
      auto Found = KeyToPredMap.find(KeyStr);
      if(Found == KeyToPredMap.end())
        error("--sort-symbols value should be 'name' or 'type', but was '" +
              Twine(KeyStr) + "'");
      else
        SymCmp.add(Found->getValue());
    }
  }



================
Comment at: llvm/docs/CommandGuide/llvm-readobj.rst:112
+ Specify the keys to sort symbols before displaying symtab.
+ Valid values for sort_key are ``name`` and ``type``
+
----------------



================
Comment at: llvm/test/tools/llvm-readobj/MachO/stabs-sorted.yaml:9
+# RUN: llvm-readobj --syms --sort-symbols=type,name %t | FileCheck %s --strict-whitespace --match-full-lines --check-prefix TYPE-NAME
+# RUN: llvm-readobj --syms --sort-symbols=name,type %t | FileCheck %s --strict-whitespace --match-full-lines --check-prefix NAME-TYPE
+
----------------
Might be worth another case where just one of these values is specified, showing what happens in this case (e.g. just type means identical type symbols are left in their original order, or something like that).

As this test case is about whether or not the symbols are sorted, and not the formatting of the output, I'd remove the `--strict-whitespace` and --match-full-lines` options.


================
Comment at: llvm/test/tools/llvm-readobj/MachO/stabs-sorted.yaml:18-21
+# TYPE-NAME:File: {{.+}}
+# TYPE-NAME-NEXT:Format: Mach-O {{.+}}
+# TYPE-NAME-NEXT:Arch: {{.+}}
+# TYPE-NAME-NEXT:AddressSize: {{.+}}bit
----------------
I don't think you need this block here.


================
Comment at: llvm/test/tools/llvm-readobj/MachO/stabs-sorted.yaml:23-31
+# TYPE-NAME-NEXT:  Symbol {
+# TYPE-NAME-NEXT:    Name: _a (19)
+# TYPE-NAME-NEXT:    Type: Section (0xE)
+# TYPE-NAME-NEXT:    Section: __data (0x3)
+# TYPE-NAME-NEXT:    RefType: UndefinedNonLazy (0x0)
+# TYPE-NAME-NEXT:    Flags [ (0x0)
+# TYPE-NAME-NEXT:    ]
----------------
Related to my above comment re. formatting, we don't need to show that the entire symbol printing is correct - that's the responsibility of a test that is testing the `--symbols` option rather than the `--sort-symbols` option. Instead, I'd just check the name and type fields. Something like this:

```
TYPE-NAME:      Name: _a
TYPE-NAME-NEXT: Type: Section
TYPE-NAME:      Name: _d
TYPE-NAME-NEXT: Type: Section
```


================
Comment at: llvm/tools/llvm-readobj/MachODumper.cpp:88
 const EnumEntry<uint32_t> MachOMagics[] = {
-  { "Magic",      MachO::MH_MAGIC    },
-  { "Cigam",      MachO::MH_CIGAM    },
-  { "Magic64",    MachO::MH_MAGIC_64 },
-  { "Cigam64",    MachO::MH_CIGAM_64 },
-  { "FatMagic",   MachO::FAT_MAGIC   },
-  { "FatCigam",   MachO::FAT_CIGAM   },
+    {"Magic", MachO::MH_MAGIC},      {"Cigam", MachO::MH_CIGAM},
+    {"Magic64", MachO::MH_MAGIC_64}, {"Cigam64", MachO::MH_CIGAM_64},
----------------
Please only reformat the parts of the file that you've changed.


================
Comment at: llvm/tools/llvm-readobj/MachODumper.cpp:607-610
+using ExtractedSymbolInfo = std::tuple<uint8_t, std::string, std::string>;
+static constexpr int TYPE_IDX = 0;
+static constexpr int NAME_IDX = 1;
+static constexpr int STRINGREP_IDX = 2;
----------------
Just use a `struct` rather than a tuple. It'll be easier to reason with and you won't need these constants (which should be `size_t` anyway, since they're indexes).


================
Comment at: llvm/tools/llvm-readobj/MachODumper.cpp:680
+  if (SortKeys && !SortKeys->empty()) {
+    // The references returned by calling Obj->symbols() are temporary,
+    // and we can't hold onto them after the loop. In order to sort the
----------------
Are you sure about this? They don't look all that temporary (as long as the object file is still in memory)...


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D116787/new/

https://reviews.llvm.org/D116787



More information about the llvm-commits mailing list