[PATCH] D104271: llvm-dwarfdump: Print warnings on invalid DWARF
James Henderson via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Fri Jun 18 01:00:37 PDT 2021
jhenderson added a comment.
Not had time to review the test cases yet, but will do that next week.
================
Comment at: llvm/include/llvm/DebugInfo/DWARF/DWARFContext.h:372-375
+ static const char *getAddressSizeSupported() {
+ static const char s[] = "2, 4, 8";
+ return s;
+ }
----------------
I'd rename this function to `getSupportedAddressSizes()`, which reads slightly better.
Also, it can be simplified, as shown in line. String literals don't have lifetime issues, so there's no need for the static local variable.
================
Comment at: llvm/lib/DebugInfo/DWARF/DWARFDebugAbbrev.cpp:74
+ bool Dash;
+ auto flush = [&]() {
+ if (!LastCode)
----------------
Lambdas follow variable naming style, so `flush` -> `Flush`.
================
Comment at: llvm/lib/DebugInfo/DWARF/DWARFDebugAbbrev.cpp:103
+}
+
DWARFDebugAbbrev::DWARFDebugAbbrev() { clear(); }
----------------
jankratochvil wrote:
> This function look to me as too much code for too little benefit but when @dblaikie has requested it then why not.
>
I'm not entirely convinced we need to print all valid abbrev values, even in ranged form, since typically, the abbrevs will go from 1-max in a contiguous set. That being said, I'm not opposed to it.
I think the code could be simplified anyway. Something like the following is more readable to me. It also handles adjacent codes being non-contiguous within the set of abbrevs:
```
std::string DWARFAbbreviationDeclarationSet::getCodeRange() const {
// Create a sorted list of all abbrev codes.
std::vector<uint32_t> Codes;
Codes.reserve(Decls.size());
std::transform(Decls.begin(), Decls.end(), std::back_inserter(Codes),
[](const DWARFAbbreviationDeclaration &Decl){
return Decl.getCode();
});
std::sort(Codes.begin(), Codes.end());
std::string Buffer = "";
raw_string_ostream Stream(Buffer);
// Each iteration through this look represents a single contiguous range in the set of codes.
for(auto Current = Codes.begin(), End = Codes.end(); Current != End;) {
uint32_t RangeStart = *Current;
// Add the current range start.
Stream << *Current;
uint32_t RangeEnd = RangeStart;
// Find the end of the current range.
while(++Current != End && *Current == RangeEnd + 1)
++RangeEnd;
// If there is more than one value in the range, add the range end too.
if (RangeStart != RangeEnd)
Stream << "-" << RangeEnd;
// If there is at least one more range, add a separator.
if (Current != End)
Stream << ", ";
}
return Buffer;
}
```
================
Comment at: llvm/lib/DebugInfo/DWARF/DWARFDebugInfoEntry.cpp:37
+ createStringError(errc::invalid_argument,
+ "DWARF unit [0x%8.8" PRIx64 ", 0x%8.8" PRIx64 ") "
+ "tries to read DIEs at offset 0x%8.8" PRIx64,
----------------
It's not going to be clear to the end user that these two values represent offsets. I'd be more explicit: "DWARF unit from offset x to offset y ..."
Same applies below.
================
Comment at: llvm/lib/DebugInfo/DWARF/DWARFDebugInfoEntry.cpp:55
+ "contains invalid abbreviation set "
+ "offset 0x%" PRIx64 " at offset 0x%8.8" PRIx64,
+ U.getOffset(), U.getNextUnitOffset(),
----------------
It's not clear to me (when reading the message without the code context) what these last two offsets represent. I suspect that the last offset is actually unnecessary, since the OffsetPtr location for this case is going to be fixed within the unit header, if I'm not mistaken.
================
Comment at: llvm/lib/DebugInfo/DWARF/DWARFUnit.cpp:400
DWARFDataExtractor DebugInfoData = getDebugInfoExtractor();
+ // It has been already checked by DWARFUnitHeader::extract.
+ assert(DebugInfoData.isValidOffset(NextCUOffset - 1));
----------------
I think this is a little clearer.
================
Comment at: llvm/test/tools/llvm-dwarfdump/X86/debug-entry-invalid.s:1
+# Test llvm-dwarfdump detects and reports invalid DWARF format of the file.
+
----------------
Nit: I'm trying to encourage new tests to use '##' for comments, to help distinguish them from lit and FileCheck directives.
================
Comment at: llvm/test/tools/llvm-dwarfdump/X86/format-warnings.s:3
+# RUN: | llvm-dwarfdump - 2>&1 | FileCheck --check-prefix=CHECK-CUEND %s
+# CHECK-CUEND: warning: DWARF compile unit extends beyond its bounds cu 0x00000000 at 0x0000001f
+
----------------
jankratochvil wrote:
> jhenderson wrote:
> > It would probably be a good idea if you used a non-zero value for the cu index. That will help flag up any conversion issues (e.g. because you used a 32-bit print format for a 64-bit number).
> >
> > Same goes for the remaining messages.
> >
> > Here and below, you can drop the "CHECK-" bit of the check prefixes, to make them more concise.
> > It would probably be a good idea if you used a non-zero value for the cu index.
>
> Done.
>
> > That will help flag up any conversion issues (e.g. because you used a 32-bit print format for a 64-bit number).
>
> That would not show anything more as on 64-bit platforms variadic function extends all parameters to (at least) 64 bits. Therefore even 32-bit format will still read 64-bit variadic parameter.
>
Right, but not all supported platforms are 64-bit. I've actually seen bugs precisely because of this sort of issue in similar code.
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D104271/new/
https://reviews.llvm.org/D104271
More information about the llvm-commits
mailing list