[PATCH] D86539: [Debuginfo][llvm-dwarfutil] llvm-dwarfutil dsymutil-like tool for ELF.

Fri Apr 8 00:37:24 PDT 2022

jhenderson added inline comments.

================
Comment at: llvm/test/tools/llvm-dwarfutil/ELF/copy-itself.test:12-27
+# RUN: yaml2obj %s -o %t1
+# RUN: llvm-dwarfutil %t1 --separate-debug-file %t1
+# RUN: llvm-objdump --headers %t1 | FileCheck -check-prefix=CHECK-NON-DEBUG %s
+# RUN: llvm-dwarfdump -a %t1.debug | FileCheck -check-prefix=CHECK-DEBUG %s
+
+# RUN: yaml2obj %s -o %t1
+# RUN: llvm-dwarfutil --no-garbage-collection %t1 %t1
----------------
avl wrote:
> jhenderson wrote:
> > avl wrote:
> > > jhenderson wrote:
> > > > As stated in my previous comment, you don't need these test cases: they duplicate coverage provided elsewhere. There's no need to test that the copy-self behaviour works with every option, because the behaviour is unrelated to those options.
> > > It actually does not duplicate coverage. Using combination of these options(--garbage-collection, --no-garbage-collection, --separate-debug-file) leads to crossing different code paths:
> > > 
> > > llvm-dwwarfutil.cpp:219
> > > llvm-dwwarfutil.cpp:244
> > > llvm-dwwarfutil.cpp:331
> > > llvm-dwwarfutil.cpp:359
> > > llvm-dwwarfutil.cpp:395
> > > 
> > > Thus it looks good if all of them would be covered.
> > I'm not sure what you mean by "crossing different code paths" here exactly, but `writeToOutput`'s behaviour is already tested, and a single test case is sufficient to show that writing to the input file via this method is sufficient, assuming that code calling that function doesn't do anything odd like open the input file again etc. Given that it doesn't currently, a test isn't needed until someone adds that functionality. Your non self-copy tests are sufficient to show that output is written correctly when those options are used.
> If the implementation would be changed later - and instead of writeToOutput there would be used another code in some places - then we will not have a check that proves everything is working as expected(if we will remove these tests).  
And at that point, it would be up to the developer to consider the new behaviour and how it needs testing though. Regardless, that code will still be tested by basic testing for each of the different options - the combination of them doesn't change the fact that it gets called ultimately. Combining the options doesn't change the code path of the area under test to something that can't be hit by it. If we followed your logic, we'd write tests for every possible combination of tool options for every tool, which is clearly impractical.

================
Comment at: llvm/test/tools/llvm-dwarfutil/ELF/error-corrupted-input-file.test:6
+
+# CHECK: archive.a: error: corrupted input file
----------------
avl wrote:
> jhenderson wrote:
> > The format would usually be `error: 'archive.a': corrupted input file`. See `createFileError` (which you should be using).
> > 
> > That being said, how is this input file "corrupt"? The file itself is a perfectly valid thin archive file. Shouldn't this just be something like "unsupported file format" or similar?
> The same error might occur if the source file is corrupted. Possible good message would be :
> 
> error: 'archive.a': unsupported or corrupted input file
> 
> ?
This error can only be hit if the object is a file type supported by `Binary` that is not an object type. Unless the input file is somehow corrupted to make type identification pass but for the wrong kind of file, then the code wouldn't get to this error - it would fail during `Binary` construction, which is an earlier error. If it does get corrupted in such a way as to look like another object file kind, I don't think it's unreasonable to assume it is that object file type and state simply "unsupported" (otherwise we have to state "corrupt or unsupported" for every instance where we currently just say "unsupported"). Is there another case you had in mind for "corrupt"?

================
Comment at: llvm/test/tools/llvm-dwarfutil/ELF/verbose.test:10
+## Check warning displayed in multi-thread mode.
+# RUN: llvm-dwarfutil %t.o %t1 --verbose 2>&1 | FileCheck %s --check-prefix=CHECK-THREAD-WARNING
+
----------------
avl wrote:
> jhenderson wrote:
> > I'm not convinced you should have a warning here, because otherwise the --verbose option will always produce a warning unless an explicit --num-threads option is specified. I'm not sure though. Perhaps documenting it in the CommandGuide and help text would be sufficient?
> If that warning looks annoying - yes we can avoid it and document the behavior. To check my understanding - the warning should not be shown in all cases: in default "llvm-dwarfutil %t.o %t1 --verbose"  and in the explicit : "llvm-dwarfutil %t.o %t1 --verbose --num-threads 10", right ?
I think default behaviour (no --num-threads specified) should not produce a warning. If --num-threads is specified explicitly, I think a warning makes sense.

================
Comment at: llvm/test/tools/llvm-dwarfutil/error-invalid-format.test:8
+
+# CHECK: error: The file was not recognized as a valid object file
----------------
avl wrote:
> jhenderson wrote:
> > jhenderson wrote:
> > > It would be nice if this had the file's name in the message.
> > Not addressed?
> it looks addressed. error-invalid-format.test - is the file name. It probably should be in quotas(using createFileError) like you said in other comment.
Sorry, missed that somehow, but see my comment elsewhere about filenames appearing after "error" (use `createFileError`).

================
Comment at: llvm/test/tools/llvm-dwarfutil/error-no-input-file.test:6
+
+# CHECK: non-existed: error: No such file or directory
----------------
avl wrote:
> jhenderson wrote:
> > jhenderson wrote:
> > > It would be nice if this had the file's name in the message.
> > Not addressed.
> > 
> > Also, this needs to use a lit substitution for the message format. See the %errc* substitutions.
> not-existed - is the file name.
Same comment as above - use `createFileError` (as well as the `%errc*` substitution).

================
Comment at: llvm/tools/llvm-dwarfutil/llvm-dwarfutil.cpp:174
+    if (ObjectFile *Obj = static_cast<ObjectFile *>(BinOrErr->getBinary())) {
+      verbose("Verifying DWARF...", Opts.Verbose);
+      std::unique_ptr<DWARFContext> DICtx = DWARFContext::create(*Obj);
----------------
avl wrote:
> jhenderson wrote:
> > avl wrote:
> > > jhenderson wrote:
> > > > Test case?
> > > It is quite tricky to create a test case here. Changing this to asserts does not look right. I think it is safe to leave it as is.
> > Looks like you made a test case after all :-)
> no. there is no test case for it still.
`verbose.test` line 24 checks "Verifying DWARF..." though?

================
Comment at: llvm/tools/llvm-dwarfutil/llvm-dwarfutil.cpp:332
+  if (Error Err = writeToOutput(
+          Config.Common.OutputFilename, [&](raw_ostream &OutFile) -> Error {
+            raw_crc_ostream CRCBuffer(OutFile);
----------------
avl wrote:
> jhenderson wrote:
> > avl wrote:
> > > jhenderson wrote:
> > > > Is the trailing return type needed?
> > > I think it is always useful to type it explicitly. It clearly shows return type. Otherwise you need to compile the lambda by itself.  If it is against of guideline - then would remove it.
> > > Otherwise you need to compile the lambda by itself
> > 
> > I don't understand this comment.
> > 
> > > If it is against of guideline - then would remove it.
> > 
> > It's because it's superfluous noise - I'm not sure there's a specific coding standard on it though. Since this lambda is being passed directly into a function, you don't need to know what its return type is due to how you use it later. Within the function, it's clear that the return type is `Error`, so it doesn't really help with readability of the function either, in my opinion. Consequently, the extra code means it's harder for me to read everything.
> > > Otherwise you need to compile the lambda by itself
> > 
> > I don't understand this comment.
> 
> I mean that the type of returning value is not clear while someone inspecting the code. Someone need to check types of expressions in the "return" clauses, type of argument where lambda is used, etc...
> 
> 
> > 
> > > If it is against of guideline - then would remove it.
> > 
> > It's because it's superfluous noise - I'm not sure there's a specific coding standard on it though. Since this lambda is being passed directly into a function, you don't need to know what its return type is due to how you use it later. Within the function, it's clear that the return type is `Error`, so it doesn't really help with readability of the function either, in my opinion. Consequently, the extra code means it's harder for me to read everything.
> 
> My understanding is opposite. It is not really clear whether return type is Error, or instead it is Expected<something> or something else.. If returning type is hidden it requires less letters but more actions to understand.
> 
> so I think the benefit of using explicit return value is more than created noise.
> 
> 
I've created a Discourse thread (https://discourse.llvm.org/t/coding-standard-for-lambda-trailing-return-types/61569) to discuss the general principle, because I don't agree with you, and I think it's clear we need a general standard on this topic.

> I mean that the type of returning value is not clear while someone inspecting the code. Someone need to check types of expressions in the "return" clauses, type of argument where lambda is used, etc...

In this particular context, when does somebody need to know the return type? How do `return Err;` and `return Error::success();` not express the return type clearly?

> My understanding is opposite. It is not really clear whether return type is Error, or instead it is Expected<something> or something else.

The use of `Error::success()` rules out `Expected<T>` as a return type. Instead it would return a concrete value. Again though, my question is, why do you need to know the return type in this context?

I argue the following for this specific case:
1. The lambda is being passed directly into a function so we don't have to worry about capturing it in a variable and ensuring the variable's type matches that of the lambda return value.
2. The function is small. If someone needs to modify it, they'll need to understand what the function does anyway, and it's obvious from what is being returned: the use of `Err` as a variable name, immediately after assigning it to an `Error` variable, combined with `Error::success()`.
3. The compiler will tell you if you get something wrong when modifying the code (either the function signature changing or somebody modifying the lambda). If developing in an IDE, this error may even appear as you are typing, without the need to explicitly write the compiler.

================
Comment at: llvm/tools/llvm-dwarfutil/llvm-dwarfutil.cpp:416-417
+    } else
+      return createStringError(std::errc::invalid_argument,
+                               "possible broken debug info");
+
----------------
avl wrote:
> jhenderson wrote:
> > jhenderson wrote:
> > > Test case?
> > Not addressed?
> yep, it is not addressed yet. Returning type of linkDebugInfo is kind of artificial at the moment. I will change it in followup review. Currently it is hard to create a testcase when it returns false.
Okay, I take it this isn't the code path that is hit if debug data in the input is malformed?

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D86539/new/

https://reviews.llvm.org/D86539