[cfe-dev] Where do we really need mangled names

Mon Feb 17 12:05:00 PST 2014

On Mon, Feb 17, 2014 at 12:00 PM, David Blaikie <dblaikie at gmail.com> wrote:
>
>
>
> On Mon, Feb 17, 2014 at 11:47 AM, Greg Clayton <gclayton at apple.com> wrote:
>>
>> From a performance standpoint, we do win when things are in the
>> accelerator tables. LLDB doesn't grub around in the DWARF unless our
>> accelerator tables are _not_ there (in which case we must manually create
>> the accelerator tables by reading all DWARF very slowly). If they are there,
>> they are assumed to be valid and we have been trying to hold clang to that
>> standard. clang must put the plain and mangled name in the DWARF for it to
>> be emitted in the accelerator tables, so we have been slowly improving clang
>> as much as we can to make the accelerator tables as accurate as possible. So LLDB won't
>> even look in the DWARF unless it sees something in the accelerator tables
>> (the __apple_XXX accelerator tables that I created and Eric Christopher then
>> implemented in clang and in the DWARF committee).
>
>
> OK - so rather than incrementally improving this by having users file bugs
> about missing data, I'm trying to understand the underlying principles so we
> can strive to implement this correctly. (While also not including data we
> don't really need because people seem to care about debug info size)
>
> Though I wasn't actually suggesting that the accelerator tables were related
> to this, just another example of size/perf tradeoff - if they are related,
> then that's another piece of the puzzle I would like to understand so I can
> better implement these requirements in Clang/LLVM.
>
> So how does the linkage name relate to the accelerator tables? I haven't
> really looked into them at all, but I thought they only contained "public"
> names (externally visible), but maybe that's just a limitation/misfeature of
> the GNU pubnames stuff that will be addressed in the DWARF 5 accelerator
> tables feature/proposal.

The public part isn't really related, however, the content part of the
spec is still listed pretty accurately here:

http://llvm.org/docs/SourceLevelDebugging.html#name-accelerator-tables

which includes parts about linkage names.

>
> Eric - not sure if this is easier just to discuss in person if you have
> enough/all the context here.

Sure, we can do that.

-eric

>
>>
>>
>> Greg
>>
>> On Feb 17, 2014, at 10:24 AM, David Blaikie <dblaikie at gmail.com> wrote:
>>
>> >
>> >
>> >
>> > On Mon, Feb 17, 2014 at 10:02 AM, Greg Clayton <gclayton at apple.com>
>> > wrote:
>> >
>> > On Feb 15, 2014, at 2:57 PM, David Blaikie <dblaikie at gmail.com> wrote:
>> >
>> > > So when comparing Clang's debug info strings to GCC's I came across a
>> > > couple of disparities hinging on the inclusion of linkage names on certain
>> > > functions. Here are a few differences:
>> > >
>> > > * Clang includes linkage names on file-local (static or anon
>> > > namespace) functions. GCC does not.
>> > > * Clang does not include the linkage name of member functions of
>> > > function-local classes. GCC does, if the function is
>> > > non-static/non-anonymous namespace and inline (ie: the member function has
>> > > linkonce-odr linkage, not internal linkage)
>> > > * Clang does not include the linkage name for constructors and
>> > > destructors - this may be necessary due to the difference (GCC duplicates,
>> > > Clang has one version call the other) in implementations, but I doubt it. I
>> > > assume we still emit multiple member functions, one to describe each version
>> > > of the ctor/dtor we're emitting.
>> > >
>> > > It looks like at least for the first case, this may've been deliberate
>> > > ( http://llvm.org/viewvc/llvm-project?view=revision&revision=154570 which
>> > > doesn't explain why and points to rdar://11079003 which Jim Grosbach told me
>> > > didn't have a great deal more context) but I don't have enough context to
>> > > understand why and whether it's just a GCC bug that they don't emit it, or
>> > > something tools should handle better, etc.
>> > >
>> > > So - any thoughts on the disparity and if/why it's necessary?
>> >
>> > It is necessary if you want your debugger to be able to evaluate an
>> > expression that fully qualifies the name of the static:
>> >
>> > (lldb) expr a::b::g_foo
>> >
>> > If you don't have the mangled name, you can only do:
>> >
>> > (lldb) expr g_foo
>> >
>> > A cursory experiment with GDB doesn't seem to exhibit this behavior.
>> > Have I correctly captured the nature of your example here:
>> >
>> > $ cat mang.cpp
>> > namespace x {
>> > static void func() {}
>> > }
>> >
>> > int main() { x::func(); }
>> > $ g++-4.8.1 mang.cpp -g
>> > $ gdb a.out
>> > (gdb) start
>> > Temporary breakpoint 1 at 0x40055a: file mang.cpp, line 5.
>> > Starting program: /tmp/dbginfo/a.out
>> >
>> > Temporary breakpoint 1, main () at mang.cpp:5
>> > 5       int main() { x::func(); }
>> > (gdb) p x::func()
>> > $1 = void
>> > (gdb) exit
>> > $ llvm-dwarfdump a.out | grep linkage_name
>> > $
>> >
>> > No mention of any linkage names yet the debugger appears to have been
>> > able to identify the fully qualified name and call the function.
>> >
>> >
>> > Then you might get many many many different versions if more than one
>> > file contains a static named "g_foo".
>> >
>> > I'm not quite sure what you're saying here - this is true of any static,
>> > even those in namespaces - there may be (will be) multiple across a project
>> > all meaning different things. The mangled name won't be unique to any
>> > particular one.
>> >
>> > Also, users never tend to fully qualify names and might try to execute:
>> >
>> > (lldb) expr b::g_foo
>> >
>> > From within namespace 'a' or just as a partial identifier that the user
>> > expects the debugger to search for? I'm not sure if GDB offers this
>> > functionality (I realize you're talking about LLDB, but I don't have it
>> > set up to experiment with) so I can't compare, but I don't see any reason GDB
>> > wouldn't be able to implement that without the mangled name too.
>> >
>> > expecting to see "a::b::g_foo". LLDB is able to find these variables by
>> > first looking for all the variable base names that match "g_foo", then
>> > removing any whose demangled name doesn't contain "b::g_foo".
>> >
>> > So the mangled names on statics allow debuggers to do the right thing
>> > and be able to correctly display qualified static variables and is very
>> > important for good debugging.
>> >
>> > GDB /seems/ to be getting away without it and providing similar
>> > functionality (I could be wrong - perhaps my examples aren't what you had in
>> > mind).
>> >
>> > Is it just that it's a performance optimization compared to having to
>> > walk the DIE parent chain to build a fully qualified name? If that's the
>> > case, can we quantify that perf/size tradeoff? (though at that point it's a
>> > fair question about why have the mangled name at all - I'm not really sure
>> > what GDB uses it for when it is present (on externally visible functions))
>> >
>> > - Dave
>>
>
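
For reference, the lookup discussed in the quoted text (find every entry whose base name is "g_foo", then drop the ones that don't match "b::g_foo") can be sketched roughly as below. This is only an illustration of the alternative Dave raises, where the qualified name is rebuilt by walking the parent chain instead of demangling a linkage name. Die, NameIndex, qualifiedName, matchesQuery and lookup are all hypothetical names made up for this sketch; none of them are LLDB or GDB APIs, and the matching rule (suffix on a "::" boundary) is an assumption rather than what either debugger actually does.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for a DWARF DIE: a name plus a link to the
// enclosing scope (namespace, class, ...).
struct Die {
    std::string name;
    const Die* parent;  // nullptr when there is no enclosing named scope
};

// Rebuild "a::b::g_foo" by walking the parent chain outwards.
std::string qualifiedName(const Die& die) {
    std::string result = die.name;
    for (const Die* p = die.parent; p != nullptr; p = p->parent)
        result = p->name + "::" + result;
    return result;
}

// Assumed matching rule: the query must be a suffix of the qualified
// name and must start on a scope boundary, so "b::g_foo" matches
// "a::b::g_foo" but not "ab::g_foo".
bool matchesQuery(const std::string& qualified, const std::string& query) {
    if (query.size() > qualified.size()) return false;
    const std::size_t start = qualified.size() - query.size();
    if (qualified.compare(start, query.size(), query) != 0) return false;
    return start == 0 ||
           (start >= 2 && qualified.compare(start - 2, 2, "::") == 0);
}

// Hypothetical accelerator table: base name -> every DIE with that name.
using NameIndex = std::unordered_multimap<std::string, const Die*>;

// Step 1: look up the base name. Step 2: filter by the qualified query.
std::vector<std::string> lookup(const NameIndex& index,
                                const std::string& query) {
    const std::size_t pos = query.rfind("::");
    const std::string base =
        (pos == std::string::npos) ? query : query.substr(pos + 2);

    std::vector<std::string> hits;
    const auto range = index.equal_range(base);
    for (auto it = range.first; it != range.second; ++it) {
        const std::string qualified = qualifiedName(*it->second);
        if (matchesQuery(qualified, query)) hits.push_back(qualified);
    }
    return hits;
}
```

With one g_foo nested in a::b and another in c, looking up "b::g_foo" would keep only the first, while a bare "g_foo" would return both, which is the "many different versions" case mentioned in the quoted text. Whether the parent walk is cheap enough compared to storing the mangled name is exactly the size/perf question raised above; the sketch doesn't answer that.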