[PATCH] D63662: Changing CodeView Debug info Type record output

Mon Jul 1 18:49:31 PDT 2019

nilanjana_basu marked 13 inline comments as done.
nilanjana_basu added inline comments.

================
Comment at: llvm/include/llvm/DebugInfo/CodeView/CodeViewRecordIO.h:38
+
 class CodeViewRecordIO {
   uint32_t getCurrentOffset() const {
----------------
rnk wrote:
> Please add some comments describing the three modes of this class.
Comments will be added in the next patch beside the constructor for each mode.

================
Comment at: llvm/lib/DebugInfo/CodeView/TypeRecordMapping.cpp:48-52
+  if (IO.isStreaming()) {
+    error(IO.mapStringZ(Name));
+    if (HasUniqueName)
+      error(IO.mapStringZ(UniqueName));
+  } else if (IO.isWriting()) {
----------------
rnk wrote:
> We have to be careful here, this could introduce a bug when long class names are involved. The long-type-name.ll test case exists to exercise this code. I believe it doesn't fail because we write the type once, then read it, then stream it, so it gets truncated the first time around.
> 
> Is there anything wrong with using the isWriting() code path here, or should we use the !isReading pattern? It seems the same.
Right. The streaming mode comes after the writing followed by reading mode, so the names should already be truncated by then. 
Yes, there was duplicate code. Fixed it.

================
Comment at: llvm/lib/DebugInfo/CodeView/TypeRecordMapping.cpp:269
   uint16_t Size;
-  if (IO.isWriting()) {
+  if (!IO.isReading()) {
     ArrayRef<VFTableSlotKind> Slots = Record.getSlots();
----------------
dblaikie wrote:
> rnk wrote:
> > nilanjana_basu wrote:
> > > dblaikie wrote:
> > > > What else can you be doing other than reading or writing? (what is the semantic difference in this (& similar) code change?)
> > > There are 3 modes - reading (for deserializing), writing (for serializing), & streaming (which uses MCStreamer to emit directives to the assembly file). 
> > I think this makes some sense, since the new mode is "streaming", which is notionally similar to writing. If we're "not reading" then we are producing some kind of output, either bytes or symbolic structured data.
> Hmm, I couldn't really see (from as much as a glance as I managed) how different the streaming and writing codepaths are - given that DWARF doesn't have this distinction. Could you describe why CodeView does? (what's different about the streaming V writing that motivates the need for both - having conditional code around computing complex comments is fine, but I wouldn't expect many other differences)
The "streaming" code path uses the MCStreamer interface to write meaningful representation of CodeView debug info type records, whereas the "writing" code path uses BinaryStreamWriter interface to serialize CodeView debug info type records to a set of bytes. There is little difference between both these paths at the high level of TypeRecordMapping (mostly taking care of large names by truncating them for "writing" mode, that is not required for streaming mode), & most of the difference is seen during its implementation at the CodeViewRecordIO level, where they both use different interfaces for emitting output.

================
Comment at: llvm/test/DebugInfo/COFF/asm.ll:2
 ; RUN: llc -mcpu=core2 -mtriple=i686-pc-win32 -O0 < %s | FileCheck --check-prefix=X86 %s
+; RUN: llc -mcpu=core2 -mtriple=i686-pc-win32 -o - -O0 -filetype=obj < %s | llvm-readobj -S --sr --codeview | FileCheck --check-prefix=OBJ32 %s
 ; RUN: llc -mcpu=core2 -mtriple=i686-pc-win32 -o - -O0 < %s | llvm-mc -triple=i686-pc-win32 -filetype=obj | llvm-readobj -S --sr --codeview | FileCheck --check-prefix=OBJ32 %s
----------------
nilanjana_basu wrote:
> dblaikie wrote:
> > rnk wrote:
> > > rnk wrote:
> > > > nilanjana_basu wrote:
> > > > > dblaikie wrote:
> > > > > > Seems like overkill to double all these tests, perhaps?
> > > > > These tests emit & check the same end result, but they go through different paths of the code. One creates the code view record from the object file directly, & checks it. The other one creates the assembly file (the current changes get executed in this path), converts it to object file, & then checks the code view record from it. This is to double check that both the newly added code, & the original code path work correctly. Also, when I run all the tests in COFF using llvm-lit, there is a delay of 0.6s added (originally runtime 1.4s, now takes 2s).
> > > > My recommendation was to test both codepaths, direct object emission and via textual assembly and confirm that info comes out the same. But, this test and many others don't exercise type information, so this extra run isn't really testing anything.
> > > > 
> > > > I would recommend reverting changes to test files that don't contain any lines matching `LF_`, since those are an indication of the presence of type records.
> > > I think I would be fine with leaving the extra testing in then if it's short. Or you could remove it. Either way is fine with me.
> > Yeah, I tend to be a bit over-fussy about testing - in this case my premise is usually "could you write a bug that causes one path to fail and not the other & that wouldn't be caught by another test" - I'd have expected the difference between writing and streaming (if there needs to be one - short of skipping computing complex comments) to be at a relatively low layer of the stack - so many choices about the ways types are described in CodeView might happen up in common layers that are shared between streaming and writing. So having lots of tests that retest those same codepaths would be sub-optimal to me.
> > 
> > Not just with regards to the runtime cost of the tests (though that's part of it - moreso than the wall time on our machines when fully parallelized - but the buildbots, sanitizers, etc, and the sort of "death by a thousand cuts" situation of testing takes longer when lots of these choices are made) but also the maintenance of them - if you change the output in some way and have to touch more tests because of this, get screen fulls of failures that are all about the same bug, etc.
> Have reverted redundant tests (files that do not have type records). Will submit in the next patch.
Point taken. According to my understanding, the "streaming" path is used while writing to the assembly file, whereas it is not used while emitting the debug info when directly compiled to the object file. I wanted to check that the code view debug info record retrieved from the object file that was formed directly & from the object file compiled from the assembly, should be the same. This will test that the streamer mode is emitting the correct output. 

Repository:
  rL LLVM

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D63662/new/

https://reviews.llvm.org/D63662