[PATCH] D27642: DebugInfo: Added support for Checksum debug info feature (LLVM IR part)

Amjad Aboud via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Dec 12 16:43:56 PST 2016


aaboud added a comment.

Thanks for all the reviews and suggestions.
I just need to hear that all of you agree to one of the suggested solutions.



================
Comment at: docs/LangRef.rst:4018
+                     data: "000102030405060708090a0b0c0d0e0f")
+
 .. _DIBasicType:
----------------
mehdi_amini wrote:
> aaboud wrote:
> > aprantl wrote:
> > > aaboud wrote:
> > > > mehdi_amini wrote:
> > > > > rnk wrote:
> > > > > > aaboud wrote:
> > > > > > > mehdi_amini wrote:
> > > > > > > > What are the valid list of "type"? What type is the "type" field? Why the prefix `ChecksumType_` and not just `MD5`?
> > > > > > > > Why not a simpler approach where the type is an MDString? (That already how you parse it in DIChecksum::getChecksumType anyway)
> > > > > > > Right now the valid list of types are:
> > > > > > > ChecksumType_MD5
> > > > > > > ChecksumType_SHA1
> > > > > > > 
> > > > > > > I will add this to the document.
> > > > > > > 
> > > > > > > Answering your other questions:
> > > > > > > 1. The current solution does not handle the type as string, but as an enumeration with a known token (see LLexer.cpp)
> > > > > > > 2. Omitting the prefix ChecksumType_ will not allow us to recognize the token.
> > > > > > > 3. This is similar to what is done with DIFlag...
> > > > > > > 4. Allowing the user to write the type as string "MD5" is easier to implement, however, checking invalid types will not be elegant as this solution.
> > > > > > > 
> > > > > > > Bottom line, I am ready to go with either solutions.
> > > > > > > Let's hear what others think.
> > > > > > > 
> > > > > > > 
> > > > > > > 
> > > > > > > 
> > > > > > > 
> > > > > > I think we could fold DIChecksum into DIFile. It would look like:
> > > > > >   !DIFile(filename: ..., checksumkind: ChecksumType_MD5, checksum: !"deadbeef")
> > > > > > 
> > > > > > This adds a dead integer to DIFile when checksums aren't being used, but it's probably more efficient and less work than having a separate node.
> > > > > I rather have a much simpler implementation, the validation can be done in the verifier easily and it would handle similarly in-memory validation, bitcode and textual IR (the first two are the most important by the way, textual IR is not what we should optimize for).  
> > > > > 
> > > > > With the change that @rnk suggested that would make it quite a smaller patch I believe.
> > > > Fine with me.
> > > I'm in favor of rolling it into DIFile. As Paul noted, DWARF 5 also allows for an MD5 checksum on each file entry in the line table section so I'm expecting this to become the normal case eventually.
> > > 
> > > What's the problem with recognizing MD5 as a context-sensitive token in the parser?
> > It is not only MD5, right? should the LLexer recognize all possible types of checksum, i.e. MD5, SHA1, etc.?
> > Or like it is implemented in this patch, it only needs to recognize the checksum type prefix (whatever it will be).
> > 
> > So, how do you prefer to implement this?
> > 1. Prefix - (if yes, what should be the prefix?)
> > 2. Recognize all types
> > 3. Have a wild string
> Alternative representation:
> 
> `!DIFile(filename: ..., checksum: !"MD5:deadbeef")`
> 
> 
Once again, I do not mind how we implement it, I just need to hear an agreement from most of the participants in this review.
however, I think that with this approach it might complicate the code in CodeViewDebug, as it needs to parse the checksum string to get the type.

I would like to hear Reid opinion on this suggestion before I go and implement it.


================
Comment at: lib/Bitcode/Reader/BitcodeReader.cpp:2759
+        GET_OR_DISTINCT(DIChecksum,
+                        (Context, (DIChecksum::ChecksumType)Record[1],
+                         getMDString(Record[2]))),
----------------
mehdi_amini wrote:
> What if the value in `Record[1]` is unexpected?
Do you suggest to do this instead?

```
static_cast<DIChecksum::ChecksumType>(Record[1])
```



================
Comment at: lib/IR/Verifier.cpp:942
+  AssertDI(N.getTag() == dwarf::DW_TAG_null, "invalid tag", &N);
+  AssertDI(N.getType() != DIChecksum::None, "invalid checksum type", &N);
+}
----------------
mehdi_amini wrote:
> The check could be more strict: what if the bitcode contained an invalid ID?
I am not sure if you are asking about the getTag() or the getType(), so I will answer for both.
1. If we fold the checksum data into the DIFile metadata, then there will be no need for the tag here.
2. Do you suggest that we check for valid range of check sum types? something like this:

```
(N.getType() >= DIChecksum::FirstValidType && N,getType() <= DIChecksum::LastValidType)
```


https://reviews.llvm.org/D27642





More information about the llvm-commits mailing list