[llvm-dev] RFC: Supporting all entities declared in lexical scope in LLVM debug info
Aboud, Amjad via llvm-dev
llvm-dev at lists.llvm.org
Wed Nov 18 06:57:59 PST 2015
Hi,
I would like to implement a fix to how LLVM handles/creates debug info for entities declared inside a basic block.
Below you will find 5 parts:
1. Motivation for this fix.
2. Background explaining the cases that need to be fixed.
3. An example for each case.
4. Proposal on how to represent each case in DWARF.
5. Secondary (workaround) proposal, which might be needed in the short term until a fix is applied in the GDB debugger.
Please let me know if you have any comments or feedback on this approach.
Thanks,
Amjad
Motivation
The current implementation loses debug info, even when optimizations are disabled.
For example:
int foo(bool b) {
  if (b) {
    typedef int A;
    class B { public: int x; };
    B y;
    static A z = 0;
    return y.x + z++;
  } else {
    typedef float A;
    class B { public: float x; };
    B y;
    static A z = 0.0;
    return (int)(y.x + z++);
  }
}
In the above example, the debugger will not be able to resolve the types "A" and "B" or the local static variable "z" correctly; only the local variable "y" works as expected.
The local static variable has an open ticket in LLVM Bugzilla:
https://llvm.org/bugs/show_bug.cgi?id=19238
Background
These are all the entities that can be declared inside a lexical scope (i.e. function or basic block):
1. Local variable
2. Local static variable
3. Imported Entity
a. Imported declaration
b. Imported module
4. Type
a. Record (structure, class, union)
b. Typedef
In the current LLVM implementation only (1), the local variables, are handled correctly.
(2) Local static variables and (4) types are attached by the Clang front end to the function's lexical scope, even when they are declared in an inner basic block.
(3) Imported entities are handled differently depending on where they are declared: in the function's lexical scope or in a basic block scope.
There are two interesting cases in which we should consider the DWARF representation of the above entities:
a. Function was not inlined - has one concrete entry.
b. Function was inlined - has one abstract entry, one or more inlined entries, and possibly one concrete entry.
The goal is to be able to debug all the above entities in case (a), which is the common case when compiling with no optimizations. We would also like to be able to debug as many entities as we can in case (b), the optimized mode.
Example
These are two examples that represent cases (a) and (b) with all the entities mentioned above.
// (Case a)
namespace N {
  class D;
}
int foo() {
  {
    using namespace N;
    using N::D;
    typedef int A;
    class B { public: int x; };
    B y;
    static A z = 0;
    return y.x + z++;
  }
}
// (Case b)
__attribute__((always_inline))
int foo() {
  // same as case (a)
}
int bar() {
  return foo();
}
Proposed representation in DWARF
Case (a) - There is only one concrete function with one lexical block (DW_TAG_lexical_block) entry. Each entity will have a DWARF entry placed under the lexical block scope, mirroring where it appears in the source.
(1) DW_TAG_subprogram (concrete)
      DW_AT_name (= "foo")
      DW_AT_low_pc
      DW_AT_high_pc
  (2) DW_TAG_lexical_block
        DW_AT_low_pc
        DW_AT_high_pc
    (3) DW_TAG_imported_module
          DW_AT_import (=> N)
    (3) DW_TAG_imported_declaration
          DW_AT_import (=> N::D)
    (3) DW_TAG_typedef
          DW_AT_name (= "A")
          DW_AT_type (=> int)
    (3) DW_TAG_class_type
          DW_AT_name (= "B")
      (4) DW_TAG_member
            DW_AT_name (= "x")
            DW_AT_type (=> int)
    (3) DW_TAG_variable
          DW_AT_name (= "y")
          DW_AT_type (=> B)
          DW_AT_location
    (3) DW_TAG_variable
          DW_AT_name (= "z")
          DW_AT_type (=> A)
          DW_AT_location
Case (b) - There is one abstract function, one inlined function (DW_TAG_inlined_subroutine) and one concrete function, each of which has one lexical block entry.
Each entity will have a DWARF entry placed under the lexical block scope of the abstract function. These entries will contain all the debug info attributes for the represented entity, except that the local variable will lack the location attribute (DW_AT_location), because the location is not common to all inline/concrete functions.
In addition, each lexical block entry in the inline/concrete function entry will have:
1. Abstract origin attribute (DW_AT_abstract_origin) pointing to the equivalent lexical block entry in the abstract function.
2. Local variable entry - with abstract origin attribute pointing to the one in the abstract function, and also the location attribute.
(1) DW_TAG_subprogram (abstract)
      DW_AT_name (= "foo")
      DW_AT_inline
  (2) DW_TAG_lexical_block
    (3) DW_TAG_imported_module
          DW_AT_import (=> N)
    (3) DW_TAG_imported_declaration
          DW_AT_import (=> N::D)
    (3) DW_TAG_typedef
          DW_AT_name (= "A")
          DW_AT_type (=> int)
    (3) DW_TAG_class_type
          DW_AT_name (= "B")
      (4) DW_TAG_member
            DW_AT_name (= "x")
            DW_AT_type (=> int)
    (3) DW_TAG_variable
          DW_AT_name (= "y")
          DW_AT_type (=> B)
    (3) DW_TAG_variable
          DW_AT_name (= "z")
          DW_AT_type (=> A)
          DW_AT_location
(1) DW_TAG_subprogram (concrete)
      DW_AT_abstract_origin (=> abstract function "foo")
      DW_AT_low_pc
      DW_AT_high_pc
  (2) DW_TAG_lexical_block
        DW_AT_abstract_origin (=> abstract lexical block)
        DW_AT_low_pc
        DW_AT_high_pc
    (3) DW_TAG_variable
          DW_AT_abstract_origin (=> abstract variable "y")
          DW_AT_location
(1) DW_TAG_subprogram
      DW_AT_name (= "bar")
      DW_AT_low_pc
      DW_AT_high_pc
  (2) DW_TAG_inlined_subroutine (inline)
        DW_AT_abstract_origin (=> abstract function "foo")
        DW_AT_low_pc
        DW_AT_high_pc
    (3) DW_TAG_lexical_block
          DW_AT_abstract_origin (=> abstract lexical block)
          DW_AT_low_pc
          DW_AT_high_pc
      (4) DW_TAG_variable
            DW_AT_abstract_origin (=> abstract variable "y")
            DW_AT_location
This would be the optimal solution if GDB supported the "DW_AT_abstract_origin" attribute on a lexical block.
The idea of having this attribute on the lexical block is to inform the debugger that all attributes and (direct) children of the abstract lexical block are available for the inline/concrete lexical block as well.
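To make the intended debugger behavior concrete, here is a minimal C++ sketch of a name lookup that follows DW_AT_abstract_origin on a lexical block. The Die struct and the effective_name and lookup functions are hypothetical names invented for this illustration; they are not part of GDB or LLVM, and a real debugger would do considerably more.

```cpp
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for a DWARF debugging information entry.
struct Die {
  std::string tag;                       // e.g. "DW_TAG_lexical_block"
  std::string name;                      // DW_AT_name, empty if absent
  bool has_location = false;             // true if DW_AT_location is present
  const Die *abstract_origin = nullptr;  // DW_AT_abstract_origin, if any
  std::vector<const Die *> children;     // direct children only
};

// The entry's own DW_AT_name, or else the name inherited through its
// abstract origin.
std::string effective_name(const Die &d) {
  if (!d.name.empty())
    return d.name;
  return d.abstract_origin ? effective_name(*d.abstract_origin)
                           : std::string();
}

// Look up `name` in `block`. The block's own children are searched first,
// so a concrete variable that carries a location wins over its abstract
// counterpart. The direct children of the abstract lexical block are then
// searched as a fallback.
const Die *lookup(const Die &block, const std::string &name) {
  for (const Die *c : block.children)
    if (effective_name(*c) == name)
      return c;
  if (block.abstract_origin)
    for (const Die *c : block.abstract_origin->children)
      if (effective_name(*c) == name)
        return c;
  return nullptr;
}
```

With the case (b) layout above, looking up "y" in the concrete lexical block would return the concrete entry with its location, while looking up "A" would fall back to the typedef under the abstract lexical block.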
GDB does not support the above proposal, and I suggest opening a bug against GDB to request support for it.
Do you think otherwise?
Workaround solution
In the meantime, until GDB implements the above requirement, we can implement a different approach for case (b) that will allow GDB to provide information on all entities in the inline/concrete function (except for types).
The abstract function will stay the same; however, the inline/concrete functions will be changed as follows:
(1) DW_TAG_subprogram (concrete)
      DW_AT_abstract_origin (=> abstract function "foo")
      DW_AT_low_pc
      DW_AT_high_pc
  (2) DW_TAG_lexical_block
        DW_AT_low_pc
        DW_AT_high_pc
    (3) DW_TAG_imported_module
          DW_AT_import (=> N)
    (3) DW_TAG_imported_declaration
          DW_AT_import (=> N::D)
    (3) DW_TAG_variable
          DW_AT_abstract_origin (=> abstract variable "y")
          DW_AT_location
    (3) DW_TAG_variable
          DW_AT_abstract_origin (=> abstract static variable "z")
(1) DW_TAG_subprogram
      DW_AT_name (= "bar")
      DW_AT_low_pc
      DW_AT_high_pc
  (2) DW_TAG_inlined_subroutine (inline)
        DW_AT_abstract_origin (=> abstract function "foo")
        DW_AT_low_pc
        DW_AT_high_pc
    (3) DW_TAG_lexical_block
          DW_AT_low_pc
          DW_AT_high_pc
      (4) DW_TAG_imported_module
            DW_AT_import (=> N)
      (4) DW_TAG_imported_declaration
            DW_AT_import (=> N::D)
      (4) DW_TAG_variable
            DW_AT_abstract_origin (=> abstract variable "y")
            DW_AT_location
      (4) DW_TAG_variable
            DW_AT_abstract_origin (=> abstract static variable "z")
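The per-entity treatment in the workaround layout can be summarized as a small decision function. This is only an illustrative C++ sketch: the Kind and Emit enums and the workaround_action function are hypothetical names made up here, and they do not reflect the actual implementation.

```cpp
// Kinds of entities that can appear under the abstract lexical block.
enum class Kind {
  ImportedModule,
  ImportedDeclaration,
  Typedef,
  ClassType,
  LocalVariable,
  StaticVariable
};

// What the workaround places under each inline/concrete lexical block.
enum class Emit {
  Copy,                   // duplicate the entry (imported entities)
  ReferenceWithLocation,  // DW_AT_abstract_origin plus DW_AT_location
  ReferenceOnly,          // DW_AT_abstract_origin only
  Omit                    // nothing emitted (types)
};

Emit workaround_action(Kind k) {
  switch (k) {
  case Kind::ImportedModule:
  case Kind::ImportedDeclaration:
    return Emit::Copy;
  case Kind::LocalVariable:
    return Emit::ReferenceWithLocation;
  case Kind::StaticVariable:
    // The location stays on the abstract entry, as in the abstract tree.
    return Emit::ReferenceOnly;
  case Kind::Typedef:
  case Kind::ClassType:
    return Emit::Omit;
  }
  return Emit::Omit;
}
```

Types are the only kind left out, which matches the "except for types" limitation mentioned above.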
I already have an implementation for this proposal, and the difference between the optimal solution, which needs support from GDB, and the workaround solution is minimal.