[PATCH] D99055: [llvm-objcopy] Refactor CopyConfig structure.

Sun May 9 19:24:49 PDT 2021

dblaikie added inline comments.

================
Comment at: llvm/tools/llvm-objcopy/ConfigManager.h:26
+// format-specific options.
+struct ConfigManager : public MultiFormatConfig {
+  virtual ~ConfigManager() {}
----------------
avl wrote:
> dblaikie wrote:
> > avl wrote:
> > > dblaikie wrote:
> > > > avl wrote:
> > > > > dblaikie wrote:
> > > > > > Why is there an inheritance relationship here - if there's only one class that implements MultiFormatConfig - why isn't ConfigManager a non-virtual API?
> > > > > MultiFormatConfig interface is used when passed to routine working with archives:
> > > > > 
> > > > > ```
> > > > > Expected<std::vector<NewArchiveMember>>
> > > > > createNewArchiveMembers(const MultiFormatConfig &Config,
> > > > >                         const object::Archive &Ar);
> > > > > ```
> > > > > 
> > > > > It would be bad to pass whole ConfigManager here since ConfigManager has elements not used by createNewArchiveMembers.  Additionally ConfigManager is a llvm-objcopy specific thing. For some another tool(llvm-another-tool) there might be another implementation of config manager, but it needs to be presented as a universal interface for createNewArchiveMembers. MultiFormatConfig is a such interface.
> > > > It seems like an abstraction without a use case yet, so it should wait until there is a use case?
> > > > 
> > > > I understand some of this is refactoring to make this more amenable to be a general purpose library - but if all the uses of the archive support currently want a big multi-format config, it seems OK for that to be a single non-virtual thing at the moment. Then when someone has a need to build a format-specific tool with archive support (hopefully we don't do too much of that in LLVM - it's nice to be generic over all our supported object formats where possible) they could add this abstraction?
> > > > 
> > > > (what's the idea, that they could implement ConfigManager with a version that returns errors for everything other than the object format they support (ie: an ELF-only use of this would return a real ELF config, but getCOFF/MachO/WasmConfig would return Error about how those formats aren't supported)
> > > > It seems like an abstraction without a use case yet, so it should wait until there is a use case?
> > > > 
> > > > I understand some of this is refactoring to make this more amenable to be a general purpose library - but if all the uses of the archive support currently want a big multi-format config, it seems OK for that to be a single non-virtual thing at the moment. Then when someone has a need to build a format-specific tool with archive support (hopefully we don't do too much of that in LLVM - it's nice to be generic over all our supported object formats where possible) they could add this abstraction?
> > > > 
> > > > (what's the idea, that they could implement ConfigManager with a version that returns errors for everything other than the object format they support (ie: an ELF-only use of this would return a real ELF config, but getCOFF/MachO/WasmConfig would return Error about how those formats aren't supported)
> > > 
> > > My understanding is that there is a problem connected with the fact that there are options that could not be parsed properly without knowing the final format(the same options would be parsed differently for various formats). i.e. we could not parse&fill all options before we make a call to createNewArchiveMembers(). If such options are specified in command line and if archive contains files with different formats then we could not handle such option - it would fail for one or another format(https://reviews.llvm.org/D99055#2701025). Thus we assume that only one format might be requested.
> > > 
> > > The current solution is to have a callback from createNewArchiveMembers()(from the point where the format is known).
> > > i.e. when it is known that the final format is ELF, we call getELFConfig() which parses lazy options and checks for unsupported options(using code from ConfigManager). Such a call back allows separating createNewArchiveMembers from the code which parses the source options. i.e. when llvm-objcopy code would be divided into the common library part and specific llvm-objcopy part then createNewArchiveMembers() and MultiFormatConfig will go into the library part. ConfigManager will stay in llvm-objcopy specific part. Without such separation, it would be necessary to put ConfigManager into the library part. Putting specific part into the common part looks wrong.
> > I'm not sure how generic the common part is if it has to call for/have specifically enumerated options for each specific part, though? I'm not sure how much value that adds.
> > 
> > But also I realize this code review seems like it's gone on/been through so many different fairly expensive variations, that I don't want to come in half-informed and complicate this review further. I'd like to try to simplify/streamline the review a bit if I can help with tie-breakers or other things that might enable progress.
> > 
> > @jhenderson (@rupprecht?) - is there some way we could help settle this work a bit more than it seems to have been able to be so far? (happy to video chat, etc, if that might be helpful)
> >I'm not sure how generic the common part is if it has to call for/have specifically enumerated options for each specific part, though? I'm not sure how much value that adds.
> 
> The necessity to call for the specific part is driven by the fact that there exist options that could not be parsed without knowing the final format. That is the reason why parsing such options is delayed until the final format is known. 
> 
> Probably, instead of delayed parsing, we might detect the final format by pre-check. i.e. before starting to parse options we inspect the files and detect the format. Then we would be able to parse options properly and create a fully parsed CopyConfig from the scratch. No extra call for specific options would be required.
> 
> In that case CopyConfig might be single structure containing all options(without lazy initialized parts).
> The necessity to call for the specific part is driven by the fact that there exist options that could not be parsed without knowing the final format. That is the reason why parsing such options is delayed until the final format is known.

*nod* Though the current code doesn't have this abstraction and seems to get along OK, yeah? It does that by having some options be implicitly invalid until some operation is called that parses and populations those options (based on which file format is in use)?

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D99055/new/

https://reviews.llvm.org/D99055