[PATCH] D112735: export unique symbol list with llvm-nm new option "--export-symbols"

Digger Lin via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Feb 8 12:42:28 PST 2022


DiggerLin added a comment.

In D112735#3303843 <https://reviews.llvm.org/D112735#3303843>, @jhenderson wrote:

> In D112735#3297519 <https://reviews.llvm.org/D112735#3297519>, @DiggerLin wrote:
>
>>   I created a refactor patch for above and rebase current patch based on it.
>
> Thanks.
>
>>>>> I think you should also do a refactor whereby SymbolList is no longer a global variable - having it as a global variable is confusing, at best, and at worst risks cases where it is still populated when it shouldn't be.
>>>>
>>>>   for the export list, I think we need a global variable , otherwise we need a variable in main() , and then pass into each function.  The llvm-nm is c style code, I do not think we can do all the refactor in this patch.   if you have have good idea, please let me know, I will create  a separate refactor patch after current patch land.
>>>
>>> My suggested design I already provided in my previous comment actively needs to avoid the global variable - the proposed `getSymbolsFromObject` function would return this list for use by the respective calling functions. Global variables make code re-use harder, and lead to temporal dependencies, both of which this patch is inflicted with, in part due to the global `SymbolList`. Doing the refactor up-front would significantly improve this patch.
>>
>>   Since we have not reach an agreement on the getSymbolsFromObject() , I am prefer to  do "removing the global variable SymbolList" in a later separate refactor patch. Our team is been blocking on the "export symbol list", I understand your concern about the quality of the code,  but I do not think "removing a global variable SymbolList" is in the scope of the patch. 
>
>
>
>>>> 2. printExportSymbolList loops over all inputs and calls getSymbolsFromObject , I think it need to share the code the dumpSymbolNamesFromFile. and we need to added a additional  if(ExportSymbols) in the dumpSymbolNamesFromFile to go to different function .
>>>
>>> I'm not sure I follow. In my proposed design, `dumpSymbolNamesFromFile` should never be touched by the export symbols code route, only `getSymbolsFromObject`.
>>
>>   when printExportSymbolList loops over all inputs ,  we still need to get object files from archive or from  MachOUniversalBinary, TapiUniversal,  we need the functionality  in  "dumpSymbolNamesFromFile" ,  so we still need the dumpSymbolNamesFromFile() for printExportSymbolList().
>
> If we need the object fetching code from `dumpSymbolNamesFromFile` then just pull that code into a separate function too... Here's what I'm thinking:
>
>   // Class for storing the binary and passing around associated properties, in case it's an object.
>   struct NMObject {
>     std::unique_ptr<Binary> Bin; // Possibly any other members that might be necessary to store archive members.
>     StringRef ArchiveName;
>     StringRef ArchitectureName;
>   };
>   
>   static std::vector<NMSymbol> getSymbolsFromDLInfoMachO(MachOObjectFile &MachO) {
>     std::vector<NMSymbol> Symbols;
>     // As original implementation of dumpSymbolsFromDLInfoMachO, but replace all references to SymbolList with Symbols.
>     return Symbols;
>   }
>   
>   static std::vector<NMSymbol> getSymbolsFromObject(NMObject &Obj) {
>     std::vector<NMSymbol> Symbols;
>     // Parts of original dumpSymbolNamesFromObject that get the symbols that are to be printed.
>     // Filtering will be done later, rather than here, i.e. don't add the exportSymbolsForXCOFF call here.
>     return Symbols;
>   }
>   
>   static void dumpSymbolNamesFromObject(NMObject &Obj) {
>     std::vector<NMSymbol> Symbols = getSymbolsFromObject(Obj);
>     Symbols = sortSymbolList(std::move(Symbols)); // Or sort in place
>     printSymbolList(Symbols, Obj, printName);
>   }
>   
>   static std::vector<NMObject> getObjectsFromFile(StringRef InputFile) {
>     std::vector<NMObject> Objects;
>     /*
>       Code from dumpSymbolNamesFromFile which retrieves the objects (and archive properties, if appropriate) inside a binary (may be a single object, or many).
>       Emplace an NMObject in the vector for each such constructed object, instead of calling dumpSymbolNamesFromObject.
>     */
>     return Objects;
>   }
>   
>   static void dumpSymbolNamesFromFile(StringRef InputFile) {
>     std::vector<NMObject> Objects = getObjectsFromFile(InputFile);
>     llvm::for_each(Objects, dumpSymbolNamesFromObject);
>   }
>   
>   std::vector<NMSymbol> getExportSymbols(ArrayRef<NMSymbol> Symbols, NMObject &Obj) {
>     std::vector<NMSymbol> ExportSymbols;
>     if (auto *XCOFF = dyn_cast<XCOFFObjectFile>(Obj.Bin)) {
>       // Do what's necessary to find the symbols appropriate for XCOFF export symbols.
>     }
>     // TODO: Add implementations for other objects here. Possibly warn otherwise.
>     return ExportSymbols;
>   }
>   
>   static void dumpExportSymbolList(ArrayRef<StringRef> InputFilenames) {
>     std::vector<NMSymbol> Symbols;
>     for(StringRef InputFile : InputFilenames) {
>       std::vector<NMObject> FileObjects = getObjectsFromFile(InputFile);
>       for(NMObject &Object : FileObjects) {
>         std::vector<NMSymbol> ObjSymbols = getSymbolsFromObject(InputFile);
>         ObjSymbols = getExportSymbols(ObjSymbols, Object);
>         // Add ObjSymbols to the end of Symbols.
>       }
>     }
>     Symbols = sortSymbolList(Symbols); // Or sort in place
>     SymbolList.erase(std::unique(Symbols.begin(), Symbols.end()), Symbols.end());
>     printExportSymbols(Symbols);
>   }
>   
>   int main() {
>     /* ... */
>     if(ExportSymbols)
>       dumpExportSymbolList(InputFilenames);
>     else
>       llvm::for_each(InputFilenames, dumpSymbolNamesFromFile);
>   }
>
> I've not tested this structure out, but I'm pretty confident that it, likely with some small tweaks is both a) not a massive departure from the current structure (meaning we don't need to rewrite large amounts of code), and b) will resolve the structural issues with this patch.

thanks for your sample code, I agree with you, the patch is quite large now. if I implement as your suggestion in the patch , I believe that we will need another refactor patch which split the dumpSymbolNamesFromFile function into several small function before the patch. Our patch https://reviews.llvm.org/D119147 is depend on the patch. My suggestion is that I will do refactor as your suggestion after the current patch land and at least let the change code in XCOFFObjectFile.cpp , test case, llvm-nm.rst etc landed, and we can focus on refactoring the llvm-nm.cpp.

if you strong insisted that we should do as your suggestion in the patch first, I can do it and postpones our schedule.   @hubert.reinterpretcast .


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D112735/new/

https://reviews.llvm.org/D112735



More information about the llvm-commits mailing list