[PATCH] D86539: [Debuginfo][llvm-dwarfutil] llvm-dwarfutil dsymutil-like tool for ELF.

Fri Apr 8 03:22:19 PDT 2022

avl added inline comments.

================
Comment at: llvm/test/tools/llvm-dwarfutil/ELF/copy-itself.test:12-27
+# RUN: yaml2obj %s -o %t1
+# RUN: llvm-dwarfutil %t1 --separate-debug-file %t1
+# RUN: llvm-objdump --headers %t1 | FileCheck -check-prefix=CHECK-NON-DEBUG %s
+# RUN: llvm-dwarfdump -a %t1.debug | FileCheck -check-prefix=CHECK-DEBUG %s
+
+# RUN: yaml2obj %s -o %t1
+# RUN: llvm-dwarfutil --no-garbage-collection %t1 %t1
----------------
jhenderson wrote:
> avl wrote:
> > jhenderson wrote:
> > > avl wrote:
> > > > jhenderson wrote:
> > > > > As stated in my previous comment, you don't need these test cases: they duplicate coverage provided elsewhere. There's no need to test that the copy-self behaviour works with every option, because the behaviour is unrelated to those options.
> > > > It actually does not duplicate coverage. Using combination of these options(--garbage-collection, --no-garbage-collection, --separate-debug-file) leads to crossing different code paths:
> > > > 
> > > > llvm-dwwarfutil.cpp:219
> > > > llvm-dwwarfutil.cpp:244
> > > > llvm-dwwarfutil.cpp:331
> > > > llvm-dwwarfutil.cpp:359
> > > > llvm-dwwarfutil.cpp:395
> > > > 
> > > > Thus it looks good if all of them would be covered.
> > > I'm not sure what you mean by "crossing different code paths" here exactly, but `writeToOutput`'s behaviour is already tested, and a single test case is sufficient to show that writing to the input file via this method is sufficient, assuming that code calling that function doesn't do anything odd like open the input file again etc. Given that it doesn't currently, a test isn't needed until someone adds that functionality. Your non self-copy tests are sufficient to show that output is written correctly when those options are used.
> > If the implementation would be changed later - and instead of writeToOutput there would be used another code in some places - then we will not have a check that proves everything is working as expected(if we will remove these tests).  
> And at that point, it would be up to the developer to consider the new behaviour and how it needs testing though. Regardless, that code will still be tested by basic testing for each of the different options - the combination of them doesn't change the fact that it gets called ultimately. Combining the options doesn't change the code path of the area under test to something that can't be hit by it. If we followed your logic, we'd write tests for every possible combination of tool options for every tool, which is clearly impractical.
not every possible but every relevant. We have several paths which could lead to writing to the file - it would be good if all of these paths would be covered by test. 

But if we want to reduce that level of testing - I will left only one case.

================
Comment at: llvm/test/tools/llvm-dwarfutil/ELF/error-corrupted-input-file.test:6
+
+# CHECK: archive.a: error: corrupted input file
----------------
jhenderson wrote:
> avl wrote:
> > jhenderson wrote:
> > > The format would usually be `error: 'archive.a': corrupted input file`. See `createFileError` (which you should be using).
> > > 
> > > That being said, how is this input file "corrupt"? The file itself is a perfectly valid thin archive file. Shouldn't this just be something like "unsupported file format" or similar?
> > The same error might occur if the source file is corrupted. Possible good message would be :
> > 
> > error: 'archive.a': unsupported or corrupted input file
> > 
> > ?
> This error can only be hit if the object is a file type supported by `Binary` that is not an object type. Unless the input file is somehow corrupted to make type identification pass but for the wrong kind of file, then the code wouldn't get to this error - it would fail during `Binary` construction, which is an earlier error. If it does get corrupted in such a way as to look like another object file kind, I don't think it's unreasonable to assume it is that object file type and state simply "unsupported" (otherwise we have to state "corrupt or unsupported" for every instance where we currently just say "unsupported"). Is there another case you had in mind for "corrupt"?
Ok, would use "unsupported file format".

================
Comment at: llvm/tools/llvm-dwarfutil/llvm-dwarfutil.cpp:174
+    if (ObjectFile *Obj = static_cast<ObjectFile *>(BinOrErr->getBinary())) {
+      verbose("Verifying DWARF...", Opts.Verbose);
+      std::unique_ptr<DWARFContext> DICtx = DWARFContext::create(*Obj);
----------------
jhenderson wrote:
> avl wrote:
> > jhenderson wrote:
> > > avl wrote:
> > > > jhenderson wrote:
> > > > > Test case?
> > > > It is quite tricky to create a test case here. Changing this to asserts does not look right. I think it is safe to leave it as is.
> > > Looks like you made a test case after all :-)
> > no. there is no test case for it still.
> `verbose.test` line 24 checks "Verifying DWARF..." though?
my comment was about 181 line :

return createError("unsupported file: '" + FileName + "'");

this line is not covered by test and it is tricky to do it.

================
Comment at: llvm/tools/llvm-dwarfutil/llvm-dwarfutil.cpp:332
+  if (Error Err = writeToOutput(
+          Config.Common.OutputFilename, [&](raw_ostream &OutFile) -> Error {
+            raw_crc_ostream CRCBuffer(OutFile);
----------------
jhenderson wrote:
> avl wrote:
> > jhenderson wrote:
> > > avl wrote:
> > > > jhenderson wrote:
> > > > > Is the trailing return type needed?
> > > > I think it is always useful to type it explicitly. It clearly shows return type. Otherwise you need to compile the lambda by itself.  If it is against of guideline - then would remove it.
> > > > Otherwise you need to compile the lambda by itself
> > > 
> > > I don't understand this comment.
> > > 
> > > > If it is against of guideline - then would remove it.
> > > 
> > > It's because it's superfluous noise - I'm not sure there's a specific coding standard on it though. Since this lambda is being passed directly into a function, you don't need to know what its return type is due to how you use it later. Within the function, it's clear that the return type is `Error`, so it doesn't really help with readability of the function either, in my opinion. Consequently, the extra code means it's harder for me to read everything.
> > > > Otherwise you need to compile the lambda by itself
> > > 
> > > I don't understand this comment.
> > 
> > I mean that the type of returning value is not clear while someone inspecting the code. Someone need to check types of expressions in the "return" clauses, type of argument where lambda is used, etc...
> > 
> > 
> > > 
> > > > If it is against of guideline - then would remove it.
> > > 
> > > It's because it's superfluous noise - I'm not sure there's a specific coding standard on it though. Since this lambda is being passed directly into a function, you don't need to know what its return type is due to how you use it later. Within the function, it's clear that the return type is `Error`, so it doesn't really help with readability of the function either, in my opinion. Consequently, the extra code means it's harder for me to read everything.
> > 
> > My understanding is opposite. It is not really clear whether return type is Error, or instead it is Expected<something> or something else.. If returning type is hidden it requires less letters but more actions to understand.
> > 
> > so I think the benefit of using explicit return value is more than created noise.
> > 
> > 
> I've created a Discourse thread (https://discourse.llvm.org/t/coding-standard-for-lambda-trailing-return-types/61569) to discuss the general principle, because I don't agree with you, and I think it's clear we need a general standard on this topic.
> 
> > I mean that the type of returning value is not clear while someone inspecting the code. Someone need to check types of expressions in the "return" clauses, type of argument where lambda is used, etc...
> 
> In this particular context, when does somebody need to know the return type? How do `return Err;` and `return Error::success();` not express the return type clearly?
> 
> > My understanding is opposite. It is not really clear whether return type is Error, or instead it is Expected<something> or something else.
> 
> The use of `Error::success()` rules out `Expected<T>` as a return type. Instead it would return a concrete value. Again though, my question is, why do you need to know the return type in this context?
> 
> I argue the following for this specific case:
> 1. The lambda is being passed directly into a function so we don't have to worry about capturing it in a variable and ensuring the variable's type matches that of the lambda return value.
> 2. The function is small. If someone needs to modify it, they'll need to understand what the function does anyway, and it's obvious from what is being returned: the use of `Err` as a variable name, immediately after assigning it to an `Error` variable, combined with `Error::success()`.
> 3. The compiler will tell you if you get something wrong when modifying the code (either the function signature changing or somebody modifying the lambda). If developing in an IDE, this error may even appear as you are typing, without the need to explicitly write the compiler.
> I've created a Discourse thread (https://discourse.llvm.org/t/coding-standard-for-lambda-trailing-return-types/61569) to discuss the general principle, because I don't agree with you, and I think it's clear we need a general standard on this topic.
> 
Thanks. I will join to this thread.

> > I mean that the type of returning value is not clear while someone inspecting the code. Someone need to check types of expressions in the "return" clauses, type of argument where lambda is used, etc...
> 
> In this particular context, when does somebody need to know the return type? How do `return Err;` and `return Error::success();` not express the return type clearly?
> 

In this particular context it is more or less obvious. But in general it is very subjective - whether it is obvious or not. C++ is very complex language. If you see return true - it does not mean return value is bool. if you see return foo() - it is not seen which type it is. I think it is easier to just have a return type than think whether it is obvious or not.

> > My understanding is opposite. It is not really clear whether return type is Error, or instead it is Expected<something> or something else.
> 
> The use of `Error::success()` rules out `Expected<T>` as a return type. Instead it would return a concrete value. Again though, my question is, why do you need to know the return type in this context?

In this particular context it is more or less obvious(Though it still requires to check several places(return Err, Error Err, return Error::success()) instead of one(-> Error)). The thing which does not worth a time, from my opinion -> thinking whether it is enough obvious or not.

> 
> I argue the following for this specific case:
> 1. The lambda is being passed directly into a function so we don't have to worry about capturing it in a variable and ensuring the variable's type matches that of the lambda return value.
> 2. The function is small. If someone needs to modify it, they'll need to understand what the function does anyway, and it's obvious from what is being returned: the use of `Err` as a variable name, immediately after assigning it to an `Error` variable, combined with `Error::success()`.
> 3. The compiler will tell you if you get something wrong when modifying the code (either the function signature changing or somebody modifying the lambda). If developing in an IDE, this error may even appear as you are typing, without the need to explicitly write the compiler.

My main point is that it would be easier to not waste a time on thinking whether each concrete case is obvious enough or not. 

================
Comment at: llvm/tools/llvm-dwarfutil/llvm-dwarfutil.cpp:416-417
+    } else
+      return createStringError(std::errc::invalid_argument,
+                               "possible broken debug info");
+
----------------
jhenderson wrote:
> avl wrote:
> > jhenderson wrote:
> > > jhenderson wrote:
> > > > Test case?
> > > Not addressed?
> > yep, it is not addressed yet. Returning type of linkDebugInfo is kind of artificial at the moment. I will change it in followup review. Currently it is hard to create a testcase when it returns false.
> Okay, I take it this isn't the code path that is hit if debug data in the input is malformed?
right. malformed input would not lead to return false here.(Though it looks logical if it is)

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D86539/new/

https://reviews.llvm.org/D86539