[PATCH] D93747: Rename debug linkage name with -funique-internal-linkage-names

Sriraman Tallam via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Tue Jan 5 19:33:42 PST 2021


tmsriram added a comment.

In D93747#2481163 <https://reviews.llvm.org/D93747#2481163>, @dblaikie wrote:

> In D93747#2481095 <https://reviews.llvm.org/D93747#2481095>, @hoy wrote:
>
>> In D93747#2481073 <https://reviews.llvm.org/D93747#2481073>, @dblaikie wrote:
>>
>>> In D93747#2481053 <https://reviews.llvm.org/D93747#2481053>, @tmsriram wrote:
>>>
>>>> In D93747#2480442 <https://reviews.llvm.org/D93747#2480442>, @dblaikie wrote:
>>>>
>>>>> @tmsriram - any ideas what your case/example was like that might've caused degraded debugging experience? Would be good to understand if we're producing some bad DWARF with this patch/if there might be some way to avoid it (as it seems like gdb can handle mangled names/overloading in C in this example I've tried)
>>>>
>>>> I haven't seen anything that caused degraded debugging experience.  I am interested in this as we do look at this field for the purposes of profile attribtution for calls that are inlined.  My comment was that we don't need to create this if it didn't already exist.  I am not fully aware of what would happen if we did it all the time.
>>>
>>> Ah, sorry, I got confused as to who's comment I was reading. I see it was @hoy who said:
>>>
>>>> If set, the gdb debugger will use that field to match the user input and set breakpoints. Therefore, giving DW_AT_linkage_name a uniquefied name prevents the debugger from setting a breakpoint based on source names unless the user specifies a decorated name.
>>>>
>>>> Hence, this approach sounds like a workaround for us when the profile quality matters more than debugging experience.
>>>
>>> So I'm still a bit confused. My test doesn't seem to demonstrate the issue with setting a linkage name preventing the debugger from setting a breakpoint based on the source name?
>>>
>>> Suggesting that gdb isn't using the DW_AT_name at all for "break <function name>" but instead demangling and stripping off the extras from the linkage name, and since it can't demangle this uniquified name (unlike the mangled name used when using the overloadable attribute) that degrades the debugger user experience? I'd have to test that in more detail/with some hand-hacked DWARF.
>>
>> Yes, I think in your case the linage name can be demangled by the debugger. In my previous experiment, the uniquefied names could not be demangled therefore I was not able to breakpoint.
>
> Ah, did some more testing. Seems it's because it isn't a classically mangled name.

Yep, that's the issue

  $ c++filt _Z3foov.__uniq
  foo() [clone .__uniq]
  
  $ c++filt _Z3foov.__uniq.123
  foo() [clone .__uniq.123]
  
  $ c++filt._Z3foov.__uniq.123abc
  _Z3foov.__uniq.123abc

The demangler does not understand a mix of numeric and alpha-numeric. It can be either but not both and md5hashes contain both.  So we might have to process the hash string or do something different to keep it demangler friendly.

> It might be possible for this uniquifying step to check if the name is mangled (does it start with "_Z") and if it isn't mangled, it could mangle it (check the length of the name, then "_Z<length><name>v") and add the unique suffix. That looks to me like it preserves the original debugging behavior?
>
> (has the problem that we don't actually know the mangling scheme at this point - we do know it up in clang (where it might be Itanium mangling or Microsoft mangling), might point again to the possibility this feature is more suitable to implement in the frontend rather than a middle end pass. And also the 'v' in the mangling is 'void return type', which is a lie - not sure if we could do something better there)
>
> I think it's pretty important that we don't break tradeoff debuggability for profile accuracy. It'll make for a difficult tradeoff for our/any users.

Agreed, I didn't expect this.

> (side note: using the original source file name seems problematic - I know in google at least, the same source file is sometimes built into more than one library/form with different preprocessor defines, so this may not produce as unique a name as you are expecting?)

It was a best effort and I think the hashing can be improved by using more signals other than just the module name if needed.  For hard data though, this does significantly improve performance which clearly comes from better profile attribution so it does something.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D93747/new/

https://reviews.llvm.org/D93747



More information about the cfe-commits mailing list