[PATCH] D42765: [DebugInfo] Support DWARFv5 source code embedding extension

Scott Linder via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Feb 5 12:29:03 PST 2018


scott.linder marked 4 inline comments as done.
scott.linder added inline comments.


================
Comment at: include/llvm/IR/DIBuilder.h:151
+                       StringRef Checksum = StringRef(),
+                       Optional<StringRef> Source = None);
 
----------------
JDevlieghere wrote:
> scott.linder wrote:
> > JDevlieghere wrote:
> > > I'm not a fan of these two things not being consistent. Either both should be options or both should be empty StringRefs? This seems to be recurring throughout this patch. I think `Optional` better conveys the idea, but would require more API changes. 
> > I agree, but would like feedback on the approach to fixing it.
> > 
> > The new code uses an `Optional` to explicitly support an empty source string, without it being assumed to simply be absent, so that `None != ""`. While it may not have many practical uses, it seems better to be explicit.
> > 
> > The old code is also explicit, but the paired `ChecksumKind` is what encodes whether a checksum is available. I do not want to duplicate this, as it leads to inconsistencies when `CSKind != CSK_None`, but `Checksum == None`, or vice-versa.
> > 
> > The best option I see is to wrap the pair in an optional (`Optional<pair<CSKind, Checksum>>`), remove the `CSK_None` variant, and maybe make the pair an explicit struct with members for the kind and checksum. Would this be an acceptable fix?
> I like the idea of the explicit struct (`DIFile::Checksum`?) wrapped in an optional. If it's not too big of a change we might as well move the `ChecksumKind ` enum in there too?
I have started an implementation of this change, but am unsure how to deal with backwards compatibility.

The `ChecksumKind` values appear to map directly to the constants used in the Bitcode, and more critically to the constants used in CodeView debug info; I feel removing the `CSK_None` variant will obfuscate the relationship, prevent moving these constants to a `.def` file (a TODO in the code), and force some extra translation to retain compatibility.

If we are OK with this, I am still not certain where the relationship is actually defined; the current `DIFile::ChecksumKind` simply gets cast to `unsigned` before being passed to `MCStreamer::EmitCVFileDirective`. There is no mention of `codeview::FileChecksumKind`, which does not explicitly define what unsigned integer values map to each variant anyway. The current solution seems very implicit, but I am not confident enough to say it should be changed.

The issue also comes up with IR, as `CSK_None` will need to be accepted as a valid token, and will need to override any checksum string present.

Assuming these inconsistencies are OK, I don't know how to represent this explicitly in the code; where are "format upgrades" handled? Or should I just accept the old format when parsing (translating things where necessary) and only output the new format (essentially never outputting `CSK_None`).

The alternative is to retain the `CSK_None` variant, and wrap it in the `Optional`, which would allow something of the form of `Some(None, "")` to be represented, which I would like to avoid. Additionally, when encoding `None` the actual encoding would need to be defined for CodeView, and I have not yet found where that is explicitly stated.


================
Comment at: include/llvm/IR/DebugInfoMetadata.h:512
   ChecksumKind CSKind;
+  Optional<MDString *> Source;
 
----------------
aprantl wrote:
> It is probably better to make DIFile variable length so we don't have to pay for the extra pointer when we don't need it.
I'm not sure I understand what you mean by variable length here; can you explain, or give me an example to reference?


================
Comment at: lib/MC/MCParser/AsmParser.cpp:3363
+        return true;
+    } else {
+      return TokError("unexpected token in '.file' directive");
----------------
JDevlieghere wrote:
> `else` can be omitted here too. 
The `else` here is required; the intent is to support any number of named keywords, while still rejecting unknown keywords. The alternative to the `else` is to `continue` at the end of each other branch, which I think is more confusing.


Repository:
  rL LLVM

https://reviews.llvm.org/D42765





More information about the llvm-commits mailing list