[PATCH] D86539: [Debuginfo][llvm-dwarfutil] llvm-dwarfutil dsymutil-like tool for ELF.

Thu Apr 7 02:37:42 PDT 2022

jhenderson added inline comments.

================
Comment at: llvm/lib/DWARFLinker/DWARFLinker.cpp:504-505
+  if (Unit.overlapsWithFunctionsRanges(*LowPc, *HighPc)) {
+    reportWarning(formatv("Overlapped address range [{0:X}, {1:X}]. Range will "
+                          "be discarded.\n",
+                          *LowPc, *HighPc),
----------------
avl wrote:
> jhenderson wrote:
> > Noting that this warning message (and indeed others in this file) don't conform to the current LLVM style guide. It also seems like `\n` should be part of the `reportWarning` function itself. I'd certainly expect it to be.
> right. I used the same style like in the rest of that file. It does not look good to have a mix of styles. I think it should be separate commit for DWARFLinker library changing style of error messages.
Agreed

================
Comment at: llvm/test/tools/llvm-dwarfutil/ELF/Inputs/archive.a:1
+!<thin>
+//                                              20        `
----------------
Why can't this file be generated on the fly? It's also a bit misleading calling it `archive.a` when it's a thin archive.

================
Comment at: llvm/test/tools/llvm-dwarfutil/ELF/copy-itself.test:12-27
+# RUN: yaml2obj %s -o %t1
+# RUN: llvm-dwarfutil %t1 --separate-debug-file %t1
+# RUN: llvm-objdump --headers %t1 | FileCheck -check-prefix=CHECK-NON-DEBUG %s
+# RUN: llvm-dwarfdump -a %t1.debug | FileCheck -check-prefix=CHECK-DEBUG %s
+
+# RUN: yaml2obj %s -o %t1
+# RUN: llvm-dwarfutil --no-garbage-collection %t1 %t1
----------------
avl wrote:
> jhenderson wrote:
> > As stated in my previous comment, you don't need these test cases: they duplicate coverage provided elsewhere. There's no need to test that the copy-self behaviour works with every option, because the behaviour is unrelated to those options.
> It actually does not duplicate coverage. Using combination of these options(--garbage-collection, --no-garbage-collection, --separate-debug-file) leads to crossing different code paths:
> 
> llvm-dwwarfutil.cpp:219
> llvm-dwwarfutil.cpp:244
> llvm-dwwarfutil.cpp:331
> llvm-dwwarfutil.cpp:359
> llvm-dwwarfutil.cpp:395
> 
> Thus it looks good if all of them would be covered.
I'm not sure what you mean by "crossing different code paths" here exactly, but `writeToOutput`'s behaviour is already tested, and a single test case is sufficient to show that writing to the input file via this method is sufficient, assuming that code calling that function doesn't do anything odd like open the input file again etc. Given that it doesn't currently, a test isn't needed until someone adds that functionality. Your non self-copy tests are sufficient to show that output is written correctly when those options are used.

================
Comment at: llvm/test/tools/llvm-dwarfutil/ELF/copy.test:11
+
+## Check that the first copy matches with the second.
+# RUN: llvm-dwarfutil --no-garbage-collection %t1 %t2
----------------
Rewording as suggested changes the emphasis slightly to be more natural, namely that the second copy is the thing that is uncertain, and the first is the "correct" version.

================
Comment at: llvm/test/tools/llvm-dwarfutil/ELF/copy.test:16
+
+## Check that debug info is stored in separate file.
+# RUN: llvm-dwarfutil --no-garbage-collection %t.o --separate-debug-file %t3
----------------
Shouldn't these cases be in a dedicated test file for `--separate-debug-file`?

================
Comment at: llvm/test/tools/llvm-dwarfutil/ELF/copy.test:23
+
+## Check that debug info is stored in separate file (last --separate-debug-file wins).
+# RUN: llvm-dwarfutil %t.o --no-garbage-collection --no-separate-debug-file --separate-debug-file %t3
----------------
We already have a test case for the explicit option. The interesting bit here is the last-one-wins behaviour, so just focus on that in this comment.

================
Comment at: llvm/test/tools/llvm-dwarfutil/ELF/copy.test:28-29
+
+## Check that the resulting file contains debug info from source file.
+## (last --no-separate-debug-file wins)
+# RUN: llvm-dwarfutil %t.o --no-garbage-collection --separate-debug-file --no-separate-debug-file %t5
----------------

================
Comment at: llvm/test/tools/llvm-dwarfutil/ELF/copy.test:31
+# RUN: llvm-dwarfutil %t.o --no-garbage-collection --separate-debug-file --no-separate-debug-file %t5
+# RUN: diff %t1 %t5
+
----------------
Use `cmp` rather than `diff` for binary files. Applies throughout.

================
Comment at: llvm/test/tools/llvm-dwarfutil/ELF/copy.test:17-18
+
+# RUN: llvm-dwarfutil %t.o --separate-debug-file --no-separate-debug-file %t1
+# RUN: llvm-dwarfdump -a %t1 | FileCheck -check-prefix=CHECK-DEBUG %s
+
----------------
avl wrote:
> jhenderson wrote:
> > This doesn't show that `--no-separate-debug-file` is the option being processed... If you removed that option the test would still pass!
> It looks like it does not pass:
> 
> 
> ```
> diff --git a/llvm/test/tools/llvm-dwarfutil/ELF/copy.test b/llvm/test/tools/llvm-dwarfutil/ELF/copy.test
> index 9ca78308f5da..9e9eb8b84750 100644
> --- a/llvm/test/tools/llvm-dwarfutil/ELF/copy.test
> +++ b/llvm/test/tools/llvm-dwarfutil/ELF/copy.test
> @@ -14,7 +14,7 @@
>  # RUN: llvm-objdump --headers %t1 | FileCheck -check-prefix=CHECK-NON-DEBUG %s
>  # RUN: llvm-dwarfdump -a %t1.debug | FileCheck -check-prefix=CHECK-DEBUG %s
>  
> -# RUN: llvm-dwarfutil %t.o --separate-debug-file --no-separate-debug-file %t1
> +# RUN: llvm-dwarfutil %t.o --separate-debug-file %t1
>  # RUN: llvm-dwarfdump -a %t1 | FileCheck -check-prefix=CHECK-DEBUG %s
>  
>  # RUN: llvm-dwarfutil --no-garbage-collection %t.o %t1
> ```
> 
> 
> ```
> llvm-lit  copy.test 
> -- Testing: 1 tests, 1 workers --
> FAIL: LLVM :: tools/llvm-dwarfutil/ELF/copy.test (1 of 1)
> ```
> 
> 
I somehow completely misread the test (mistaking `--no-garbage-collection` for `--no-separate-debug-file`).

================
Comment at: llvm/test/tools/llvm-dwarfutil/ELF/error-corrupted-input-file.test:6
+
+# CHECK: archive.a: error: corrupted input file
----------------
The format would usually be `error: 'archive.a': corrupted input file`. See `createFileError` (which you should be using).

That being said, how is this input file "corrupt"? The file itself is a perfectly valid thin archive file. Shouldn't this just be something like "unsupported file format" or similar?

================
Comment at: llvm/test/tools/llvm-dwarfutil/ELF/error-separate-file-stdout.test:7
+
+# CHECK: error: writing to stdout. --separate-debug-file should be disabled
+
----------------
`<stdin>` is typically used for error messages where the file is read from stdin (please add code and a test case that does this). It makes sense to use the same style for stdout (i.e. `<stdout>`) where that's relevant, but actually here, I think it would be clearer to say "unable to write to stdout when --separate-debug-file specified" or something equivalent.

================
Comment at: llvm/test/tools/llvm-dwarfutil/ELF/gc-class.test:9
+
+## Check implicit --garbage-collection and --odr-deduplication.
+# RUN: llvm-dwarfutil %t.o %t1
----------------
I don't think you need to reference "--garbage-collection" in all of these comments in this file, since it's not all that relevant to the test case.

================
Comment at: llvm/test/tools/llvm-dwarfutil/ELF/gc-class.test:13
+
+## Check implicit --garbage-collection and explicit alias for --odr-deduplication.
+# RUN: llvm-dwarfutil --odr %t.o %t2
----------------
Could simplify to "Check --odr alias." No need to say "explicit" here since the use of an alias implies it's explicit anyway.

================
Comment at: llvm/test/tools/llvm-dwarfutil/ELF/gc-class.test:17-19
+## Check implicit --garbage-collection and explicit alias for --no-odr (last option wins).
+# RUN: llvm-dwarfutil --odr --no-odr %t.o %t3
+# RUN: llvm-dwarfdump -a %t3 | FileCheck %s --check-prefixes=CHECK,CHECK-NOODR
----------------
I don't think you need to check the "last one wins" behaviour of aliases. It should be sufficient to check that for the primary versions of the options (and I'd do that before the aliases) and then show that aliases produce identical output to their main version.

================
Comment at: llvm/test/tools/llvm-dwarfutil/ELF/gc-class.test:21
+
+## Check implicit --garbage-collection and explicit --no-odr-deduplication (last option wins).
+# RUN: llvm-dwarfutil --odr-deduplication --no-odr-deduplication %t.o %t4
----------------

================
Comment at: llvm/test/tools/llvm-dwarfutil/ELF/gc-class.test:29
+
+## Check implicit --garbage-collection and explicit --odr-deduplication (last option wins).
+# RUN: llvm-dwarfutil --no-odr-deduplication --odr-deduplication %t.o %t6
----------------

================
Comment at: llvm/test/tools/llvm-dwarfutil/ELF/verbose.test:10
+## Check warning displayed in multi-thread mode.
+# RUN: llvm-dwarfutil %t.o %t1 --verbose 2>&1 | FileCheck %s --check-prefix=CHECK-THREAD-WARNING
+
----------------
I'm not convinced you should have a warning here, because otherwise the --verbose option will always produce a warning unless an explicit --num-threads option is specified. I'm not sure though. Perhaps documenting it in the CommandGuide and help text would be sufficient?

================
Comment at: llvm/test/tools/llvm-dwarfutil/error-invalid-format.test:8
+
+# CHECK: error: The file was not recognized as a valid object file
----------------
jhenderson wrote:
> It would be nice if this had the file's name in the message.
Not addressed?

================
Comment at: llvm/test/tools/llvm-dwarfutil/error-no-input-file.test:6
+
+# CHECK: non-existed: error: No such file or directory
----------------
jhenderson wrote:
> It would be nice if this had the file's name in the message.
Not addressed.

Also, this needs to use a lit substitution for the message format. See the %errc* substitutions.

================
Comment at: llvm/tools/llvm-dwarfutil/llvm-dwarfutil.cpp:174
+    if (ObjectFile *Obj = static_cast<ObjectFile *>(BinOrErr->getBinary())) {
+      verbose("Verifying DWARF...", Opts.Verbose);
+      std::unique_ptr<DWARFContext> DICtx = DWARFContext::create(*Obj);
----------------
avl wrote:
> jhenderson wrote:
> > Test case?
> It is quite tricky to create a test case here. Changing this to asserts does not look right. I think it is safe to leave it as is.
Looks like you made a test case after all :-)

================
Comment at: llvm/tools/llvm-dwarfutil/llvm-dwarfutil.cpp:271
+
+using DebugInfoBits = SmallString<100000>;
+
----------------
avl wrote:
> jhenderson wrote:
> > Is `SmallString` really the right answer here for such a large base size?
> raw_svector_ostream uses SmallString. If that constant looks too large let`s consider to make it smaller?
Yeah, I think smaller may be better. As things stand, this will create a ~100kb string on the stack, which is rather a lot of stack used in one go, but someone with better undersetanding of the code may have a more informed opinion.

================
Comment at: llvm/tools/llvm-dwarfutil/llvm-dwarfutil.cpp:332
+  if (Error Err = writeToOutput(
+          Config.Common.OutputFilename, [&](raw_ostream &OutFile) -> Error {
+            raw_crc_ostream CRCBuffer(OutFile);
----------------
avl wrote:
> jhenderson wrote:
> > Is the trailing return type needed?
> I think it is always useful to type it explicitly. It clearly shows return type. Otherwise you need to compile the lambda by itself.  If it is against of guideline - then would remove it.
> Otherwise you need to compile the lambda by itself

I don't understand this comment.

> If it is against of guideline - then would remove it.

It's because it's superfluous noise - I'm not sure there's a specific coding standard on it though. Since this lambda is being passed directly into a function, you don't need to know what its return type is due to how you use it later. Within the function, it's clear that the return type is `Error`, so it doesn't really help with readability of the function either, in my opinion. Consequently, the extra code means it's harder for me to read everything.

================
Comment at: llvm/tools/llvm-dwarfutil/llvm-dwarfutil.cpp:416-417
+    } else
+      return createStringError(std::errc::invalid_argument,
+                               "possible broken debug info");
+
----------------
jhenderson wrote:
> Test case?
Not addressed?

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D86539/new/

https://reviews.llvm.org/D86539