[PATCH] D120357: [llvm-nm]add helper function to print out the object file name, archive name, architecture name

Wed Mar 2 07:22:28 PST 2022

DiggerLin added inline comments.

================
Comment at: llvm/test/Object/nm-universal-binary.test:38

-CHECK-AR: macho-universal-archive.x86_64.i386(hello.o) (for architecture x86_64):
+CHECK-AR: {{[[:space:]].*}}macho-universal-archive.x86_64.i386(hello.o) (for architecture x86_64):
 CHECK-AR: 0000000000000068 s EH_frame0
----------------
jhenderson wrote:
> Adding `{{[[:space:]].*}}` does literally nothing useful, as the whitespace at the start of a line is ignored unless using both `--strict-whitespace` and `--match-full-lines` in the FileCheck command. I believe, if you want to check the first line is empty, that you can use:
> 
> ```
> CHECK-AR:      {{^$}}
> CHECK-AR-NEXT: macho-universal-archive.x86_64.i386(hello.o) (for architecture x86_64):
> ```
> 
> For later blank lines, you can do `CHECK-EMPTY:` immediately following a regular `CHECK`. `CHECK-EMPTY` works like `CHECK-NEXT` but ensures the line is completely blank instead of checking against some pattern. Since it works like `CHECK-NEXT` it cannot be used as the first `CHECK` pattern however. Conversely, `{{^$}}` will not always work if used after the first `CHECK` pattern. This is because FileCheck consumes the string it has matched up to before running the next `CHECK`. For example, imagine the output was:
> ```
> foo
> bar
> ```
> And we have the following CHECKs:
> ```
> CHECK: foo
> CHECK-NEXT: {{^$}}
> CHECK-NEXT: bar
> ```
> to test that there is a blank line between foo and bar. In this case, after matching `foo`, the output will look like a blank line, followed by a line containing bar. The attempt to check for an empty line matches the first line (at the end of foo), which is not the line after the previous match, so the CHECK-NEXT fails.
> 
> (Aside, not something for you to fix, but this test is full of holes, and could fail to catch symbols in the output that shouldn't be present, or other additional lines).
thanks a lot  for your detail explanation.

================
Comment at: llvm/tools/llvm-nm/llvm-nm.cpp:1889
+  CurrentFilename = Obj.getFileName();
+  if (IsSymbolsEmpty && SymbolList.empty() && !Quiet) {
     writeFileName(errs(), ArchiveName, ArchitectureName);
----------------
jhenderson wrote:
> DiggerLin wrote:
> > jhenderson wrote:
> > > Why have you just introduced this change? How is it better than the previous code?
> > yes, in the original code of function getSymbolNamesFromObject( )
> > 
> > ```
> >   auto Symbols = Obj.symbols();
> >   .....
> >  
> >   if (DynamicSyms) {
> >     const auto *E = dyn_cast<ELFObjectFileBase>(&Obj);
> >     if (!E) {
> >       error("File format has no dynamic symbol table", Obj.getFileName());
> >       return;
> >     }
> >   Symbols = E->getDynamicSymbolIterators();
> > ```
> >  The value of Symbols be changed by the line 
> > Symbols = E->getDynamicSymbolIterators();
> >  Symbols  maybe not the equal to  Obj.symbols()
> > 
> > so we can not use the 
> > if ( Obj.symbols().empty() && SymbolList.empty() && !Quiet) 
> Thanks, I see. I think I'd prefer it if you pulled the logic on whether or not an object has symbols into a separate function (maybe `hasSymbols`) which takes `Obj` and returns the value based on the appropriate set of symbols for whether `DynamicSyms` is specified or not. You probably would then benefit from a separate helper function used by this new function and `getSymbolNamesFromObject` that returns the dynamic symbols list (or an empty list if there was an error). Something like the following:
> 
> ```
> static bool hasSymbols(SymbolicFile &Obj) {
>   if (DynamicSyms)
>     return !getDynamicSyms(Obj).empty();
>   return !Obj.symbols().empty();
> }
> 
> static <insert return type here> getDynamicSyms(SymbolFile &Obj) {
>   const auto *E = dyn_cast<ELFObjectFileBase>(&Obj);
>   if (!E) {
>     error("File format has no dynamic symbol table", Obj.getFileName());
>     return {};
>   }
>   return E->getDynamicSymbolIterators();
> }
> ```
as your suggestion,  the function getDynamicSyms() will be call twice .
1. One is in the function
getSymbolNamesFromObject

2 . the other is in hasSymbols() which will be called in the function
dumpSymbolNamesFromObject() 

if there is error.

> if (!E) {
>     error("File format has no dynamic symbol table", Obj.getFileName());
>     return {};
>   }

the error info will be print out twice.

if I keep the code in the 
as 

```
static bool getSymbolNamesFromObject(SymbolicFile &Obj, bool &IsSymbolsEmpty) {
  auto Symbols = Obj.symbols();
  std::vector<VersionEntry> SymbolVersions;
  if (DynamicSyms) {
    const auto *E = dyn_cast<ELFObjectFileBase>(&Obj);
    if (!E) {
      error("File format has no dynamic symbol table", Obj.getFileName());
      return false;
    }
    Symbols = E->getDynamicSymbolIterators();
```

 there is no advantage of adding a two helper function.

I do not think adding a new parameter "bool &IsSymbolsEmpty" is a big problem.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D120357/new/

https://reviews.llvm.org/D120357