[PATCH] DW_TAG_typedef and DW_TAG_structure_type not emitted when typedef is used

Tue Sep 16 23:40:30 PDT 2014

>>! In D2498#10, @jyoti.yalamanchili wrote:
> Hi David,
> 
>>>! In D2498#9, @dblaikie wrote:
>>>>! In D2498#4, @jyoti.yalamanchili wrote:
>>> This bug appears only when typedef is used.
>> 
>> Doesn't look like it.
> This was my previous analysis which was incorrect. This issue occurs whenever explicit casting occurs. The type to which we are explicitly casting is not emitted in debug info.
>> 
>> Changing the code to cast to (S*) instead of (T) still demonstrates the bug and is still present even with this patch applied.
>> 
> As you mentioned, I verified this for the following code snippet where i am replacing T with struct S* and the result was that DW_TAG_structure_type & DW_TAG_member were emitted but not DW_TAG_typedef because T was not used. I thought this is expected behaviour. Please correct me if i am wrong.

Looks like I was confused (perhaps by the old title of this code review which still mentions typedefs - and perhaps by Phab showing me the newer diff, or maybe I was looking in the wrong place somehow and saw the old code which still special-cased for typedefs). Now that I apply your latest patch properly I see the behavior you describe - this fixes the issue regardless of whether a typedef is used.

> 
> typedef struct S {
>   int i;
> } *T;
> #define M(p) (**(struct S*)**(p))
> 
> void foo(void *p) { M(p)->i++; }
> 
> 
>> The problem is more general than typedefs and casts - arguably we should get this (C++11) case too:
>> 
>>   struct S { int i; };
>>   void func(S);
>>   void func() {
>>     func(S{42});
>>   }
>> 
>> but GCC doesn't get this either, so I'm less concerned.
>> 
>> GCC does get the cast case - with or without a typedef. So perhaps we can just deal with casts for now... 
>> 
> Okay. 
> 
>> The only other thing I can think of is far more invasive/more work - to get this all right, what we really need is a tree of "completeness" - keeping track of any time a type is required to be complete (not just a single bit that says "this has been required to be complete") & then emitting any type for which it was requried to be complete during a function we codegen'd.
>> 
> Sorry, could you please explain this with an example ?

  struct foo { };
  inline void func() { foo f; } // this makes 'f' required to be complete, which means we emit a definition of 'f' if we ever need any reference to 'f'
  foo *f; // this causes us to emit 'f'

Without the second line, we just emit a declaration (in the default -fno-standalone-debug mode). 

> I could mention this in the testcase as a TODO note if you want so we don't loose track of this improvement scope.

That's OK - it's pretty much outside the scope of your change. It wouldn't really belong in this test file.

> 
>> This infrastructure would also allow us to do a better job minimizing debug info for the limited debug info optimization where we currently fail:
>> 
>>   struct foo { ... };
>>   inline void unused() { foo f; } // 'f' is required to be complete here, but the function is never called
>>   foo *f; // but that required completeness is never relied upon for this TU - a declaration would've sufficed
>> 
>> This hurts us in, for example, Clang's own Sema.h - the giant Sema class has a single non-member inline function right after it that has dereferences a Sema pointer, thus requiring Sema to be complete. From then on, if anything else in the TU including Sema.h requires Sema (evern just a pointer to it it) the debug info for the entire class will be emitted.
> 
> If i understand this correctly, point of worry here is about the increase in debug info size, am i right? 

The -fno-standalone-debug default option is an attempt to minimize debug info size, yes. The issue is that it uses /some/ part of the powers necessary to make our general debug info strategy more reliable and to fix this bug you're addressing and possible others in a more general manner. Though I can't site any particular examples similar to yours (places where a type is used, but no named instance or reference to the type is ever needed)... actually I can:

 struct S { };
 void *v = new S;

Clang doesn't emit debug info describing 'S' (but then again, neither does GCC). I imagine maybe there's a few other pieces of syntax that allow one to reference a type without ever naming a variable of that type.

> But, we certainly need the debug info of any referenced variables isn't ?

Sure - in this case there was no variable, but it's still reasonable to describe the type. The rough heuristic is: Any expression written in executing code (global intializers, functions that are actually codegen'd, etc) should be able to be written and executed in a sufficiently advanced debugger given the information Clang produces. 

Any cases where the information is fundamentally not there - are bugs.

================
Comment at: llvm/tools/clang/lib/CodeGen/CGExprScalar.cpp:279
@@ +278,3 @@
+    CGDebugInfo *DI = CGF.getDebugInfo();
+    if (DI &&
+        CGF.CGM.getCodeGenOpts().getDebugInfo() >=
----------------
Roll the DI variable into the if condition and sink the LimitedDebugInfo check into the DI function (at least I think that's how we are/should be doing this - keeping as much of the implementation details down in CGDebugInfo rather than up in the CodeGen callsites)

================
Comment at: llvm/tools/clang/test/CodeGen/Debug-info-explicitcast.c:4
@@ +3,3 @@
+typedef struct S { int i; } *T;
+#define M(p) ((T)(p))
+
----------------
Could you drop the macro usage and the typedef as they're not needed to demonstrate the fundamental bug here?

http://reviews.llvm.org/D2498