<html>
<head>
<meta http-equiv="Content-Type" content="text/html;
charset=windows-1252">
</head>
<body text="#000000" bgcolor="#FFFFFF">
I've put a WIP patch up here: <a class="moz-txt-link-freetext" href="https://reviews.llvm.org/D44668">https://reviews.llvm.org/D44668</a><br>
Sorry for the delay!<br>
Erik<br>
<br>
<div class="moz-cite-prefix">On 2018-01-26 3:56 PM, Greg Clayton
wrote:<br>
</div>
<blockquote type="cite"
cite="mid:997A2F63-CF9B-4246-A5D2-C32A29CB9FD6@gmail.com">
<meta http-equiv="Content-Type" content="text/html;
charset=windows-1252">
<br class="">
<div>
<blockquote type="cite" class="">
<div class="">On Jan 26, 2018, at 8:38 AM, Erik Pilkington
<<a href="mailto:erik.pilkington@gmail.com" class=""
moz-do-not-send="true">erik.pilkington@gmail.com</a>>
wrote:</div>
<br class="Apple-interchange-newline">
<div class=""><br style="font-family: Menlo-Regular;
font-size: 12px; font-style: normal; font-variant-caps:
normal; font-weight: normal; letter-spacing: normal;
text-align: start; text-indent: 0px; text-transform: none;
white-space: normal; word-spacing: 0px;
-webkit-text-stroke-width: 0px;" class="">
<br style="font-family: Menlo-Regular; font-size: 12px;
font-style: normal; font-variant-caps: normal;
font-weight: normal; letter-spacing: normal; text-align:
start; text-indent: 0px; text-transform: none;
white-space: normal; word-spacing: 0px;
-webkit-text-stroke-width: 0px;" class="">
<span style="font-family: Menlo-Regular; font-size: 12px;
font-style: normal; font-variant-caps: normal;
font-weight: normal; letter-spacing: normal; text-align:
start; text-indent: 0px; text-transform: none;
white-space: normal; word-spacing: 0px;
-webkit-text-stroke-width: 0px; float: none; display:
inline !important;" class="">On 2018-01-25 1:58 PM, Greg
Clayton wrote:</span><br style="font-family:
Menlo-Regular; font-size: 12px; font-style: normal;
font-variant-caps: normal; font-weight: normal;
letter-spacing: normal; text-align: start; text-indent:
0px; text-transform: none; white-space: normal;
word-spacing: 0px; -webkit-text-stroke-width: 0px;"
class="">
<blockquote type="cite" style="font-family: Menlo-Regular;
font-size: 12px; font-style: normal; font-variant-caps:
normal; font-weight: normal; letter-spacing: normal;
orphans: auto; text-align: start; text-indent: 0px;
text-transform: none; white-space: normal; widows: auto;
word-spacing: 0px; -webkit-text-size-adjust: auto;
-webkit-text-stroke-width: 0px;" class="">
<blockquote type="cite" class="">On Jan 25, 2018, at 10:25
AM, Erik Pilkington <<a
href="mailto:erik.pilkington@gmail.com" class=""
moz-do-not-send="true">erik.pilkington@gmail.com</a>>
wrote:<br class="">
<br class="">
Hi,<br class="">
I'm not at all familiar with LLDB, but I've been doing
some work on the demangler in libcxxabi. It's still a
work in progress and I haven't yet copied the changes
over to ItaniumDemangle, which AFAIK is what lldb uses.
The demangler in libcxxabi now demangles the symbol you
attached in 3.31 seconds, instead of 223.54 seconds, on my
machine. I posted an RFC on my work here (<a
href="http://lists.llvm.org/pipermail/llvm-dev/2017-June/114448.html"
class="" moz-do-not-send="true">http://lists.llvm.org/pipermail/llvm-dev/2017-June/114448.html</a>),
but basically the new demangler just produces an AST
then traverses it to print the demangled name.<br
class="">
</blockquote>
Great to hear the huge speedup in demangling! LLDB
actually has two demanglers: a fast one that can demangle
99% of names, and we fall back to ItaniumDemangle which
can do all names but is really slow. It would be fun to
compare your new demangler with the fast one and see if we
can get rid of the fast demangler now.<br class="">
<blockquote type="cite" class=""><br class="">
I think a good way of making this even faster is to have
LLDB consume the AST the demangler produces directly.
The AST is a better representation of the information
that LLDB wants, and finishing the demangle and then
fishing out that information from the output string is
unfortunate. From the AST, it would be really
straightforward to just individually print all the
components of the name that LLDB wants.<br class="">
</blockquote>
This would help us to grab the important bits out of the
mangled name as well. We chop up a demangled name to find
the base name ("string" for std::string) and the containing
context ("std::" for std::string), and we check whether the
function is a method (by looking for a trailing "const"
modifier on the function) or a top-level function. The
mangling doesn't fully specify what is a namespace and
what is a class: in "foo::bar::baz()" we don't know
whether "foo" or "bar" are classes or namespaces. So the AST
would be great as long as it is fast.<br class="">
<br class="">
<blockquote type="cite" class="">Most of the time it takes
to demangle these "symbols from hell" is during the
printing, after the AST has been parsed, because the
demangler has to flatten out all the potentially nested
back references. Just parsing to an AST should be about
proportional to the strlen of the mangled name. Since
(AFAIK) LLDB doesn't use some sections of the demangled
name often (such as parameters), from the AST LLDB could
lazily decide not to even bother fully demangling some
sections of the name, then if it ever needs them it
could parse a new AST and get them from there. I think
this would largely fix the issue, as most of the time
these crazy expansions don't occur in the name itself,
but in the parameters or return type. Even when they do
appear in the name, it would be possible to do some
simple name classification (ie, does this symbol refer
to a function) or pull out the basename quickly without
expanding anything at all.<br class="">
<br class="">
Any thoughts? I'm really not at all familiar with LLDB,
so I could have this all wrong!<br class="">
</blockquote>
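<br class="">
To illustrate why printing, rather than parsing, dominates for these symbols, here is a minimal sketch using a hypothetical toy AST (the <code>Node</code> type and helper names below are assumptions for illustration, not the libcxxabi node classes). Each level refers to the previous level twice, the way a back reference would, so the node count grows linearly while the printed string roughly doubles per level. Asking for the base name never touches the arguments at all.<br class="">

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

// Hypothetical toy AST node; shared children stand in for back references.
struct Node {
  std::string name;                // e.g. "P" or "int"
  std::vector<const Node *> args;  // template arguments, possibly shared
};

// Full print: every shared child is expanded again at each use.
void print(const Node &n, std::string &out) {
  out += n.name;
  if (n.args.empty())
    return;
  out += '<';
  for (std::size_t i = 0; i < n.args.size(); ++i) {
    if (i != 0)
      out += ", ";
    print(*n.args[i], out);
  }
  out += '>';
}

// Cheap query: the base name is available without expanding any argument.
const std::string &baseName(const Node &n) { return n.name; }

// Builds a chain in which level k refers to level k-1 twice.
// The vector owns the nodes; the last element is the root.
std::vector<std::unique_ptr<Node>> buildChain(int depth) {
  assert(depth >= 0);
  std::vector<std::unique_ptr<Node>> nodes;
  nodes.push_back(std::make_unique<Node>(Node{"int", {}}));
  for (int i = 0; i < depth; ++i) {
    const Node *prev = nodes.back().get();
    nodes.push_back(std::make_unique<Node>(Node{"P", {prev, prev}}));
  }
  return nodes;
}
```

With <code>buildChain(10)</code> the AST has 11 nodes but the fully printed name is 8187 characters long; each extra level adds one node and roughly doubles the output.<br class="">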
AST sounds great. We can put this into the class we use to
chop up C++ names, as that is really our goal.<br class="">
<br class="">
So it would be great to do a speed comparison between our
fast demangler in LLDB (in FastDemangle.cpp/.h) and your
updated libcxxabi version. If yours is faster, remove
FastDemangle and then update the llvm::ItaniumDemangle()
to use your new code.<br class="">
<br class="">
ASTs would be great for the C++ name parser.<br class="">
<br class="">
Let us know what you are thinking,<br class="">
</blockquote>
<br style="font-family: Menlo-Regular; font-size: 12px;
font-style: normal; font-variant-caps: normal;
font-weight: normal; letter-spacing: normal; text-align:
start; text-indent: 0px; text-transform: none;
white-space: normal; word-spacing: 0px;
-webkit-text-stroke-width: 0px;" class="">
<span style="font-family: Menlo-Regular; font-size: 12px;
font-style: normal; font-variant-caps: normal;
font-weight: normal; letter-spacing: normal; text-align:
start; text-indent: 0px; text-transform: none;
white-space: normal; word-spacing: 0px;
-webkit-text-stroke-width: 0px; float: none; display:
inline !important;" class="">Hi Greg,</span><br
style="font-family: Menlo-Regular; font-size: 12px;
font-style: normal; font-variant-caps: normal;
font-weight: normal; letter-spacing: normal; text-align:
start; text-indent: 0px; text-transform: none;
white-space: normal; word-spacing: 0px;
-webkit-text-stroke-width: 0px;" class="">
<br style="font-family: Menlo-Regular; font-size: 12px;
font-style: normal; font-variant-caps: normal;
font-weight: normal; letter-spacing: normal; text-align:
start; text-indent: 0px; text-transform: none;
white-space: normal; word-spacing: 0px;
-webkit-text-stroke-width: 0px;" class="">
<span style="font-family: Menlo-Regular; font-size: 12px;
font-style: normal; font-variant-caps: normal;
font-weight: normal; letter-spacing: normal; text-align:
start; text-indent: 0px; text-transform: none;
white-space: normal; word-spacing: 0px;
-webkit-text-stroke-width: 0px; float: none; display:
inline !important;" class="">I'm almost finished with my
work on the demangler; hopefully I'll be done within a few
weeks. Once that's all finished I'll look into exporting
the AST and comparing it to FastDemangle. I was thinking
about adding a version of llvm::itaniumDemangle() that
returns an opaque handle to the AST and defining some
functions on the LLVM side that take that handle and
return some extra information. I'd be happy to help out
with the LLDB side of things too, although it might be
better if someone more experienced with LLDB did this.</span><br
style="font-family: Menlo-Regular; font-size: 12px;
font-style: normal; font-variant-caps: normal;
font-weight: normal; letter-spacing: normal; text-align:
start; text-indent: 0px; text-transform: none;
white-space: normal; word-spacing: 0px;
-webkit-text-stroke-width: 0px;" class="">
<br style="font-family: Menlo-Regular; font-size: 12px;
font-style: normal; font-variant-caps: normal;
font-weight: normal; letter-spacing: normal; text-align:
start; text-indent: 0px; text-transform: none;
white-space: normal; word-spacing: 0px;
-webkit-text-stroke-width: 0px;" class="">
</div>
</blockquote>
<div><br class="">
</div>
Can't wait! The only reason we switched away from the libcxxabi
demangler in the first place was the poor performance. GDB's
demangler was 3x faster. Our FastDemangler got us back to the
speed of the GDB demangler. But it will be great to get back to
one fast demangler. </div>
<div><br class="">
</div>
<div>It would be great if there was some way to implement a
demangled name size cutoff in the demangler, so that if the
demangled name goes over some max size we can just stop
demangling. No one needs to see a 72MB string, nor would anyone
ever type in that name.</div>
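<div>One way such a cutoff might look is a bounded output buffer that the printer writes into and that refuses to grow past a limit. This is only a sketch under assumptions: the <code>BoundedOutput</code> class and its methods are hypothetical names, not part of the libcxxabi or LLVM demangler.</div>

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical output buffer with a hard size limit. A printer that checks
// the return value of append() can stop demangling as soon as it fails.
class BoundedOutput {
public:
  explicit BoundedOutput(std::size_t maxSize) : maxSize_(maxSize) {}

  // Appends text if it fits. Returns false, and stays in the overflowed
  // state, once the limit would be exceeded.
  bool append(const std::string &text) {
    assert(buffer_.size() <= maxSize_);
    if (overflowed_ || text.size() > maxSize_ - buffer_.size()) {
      overflowed_ = true;
      return false;
    }
    buffer_ += text;
    return true;
  }

  bool overflowed() const { return overflowed_; }
  const std::string &str() const { return buffer_; }

private:
  std::size_t maxSize_;
  std::string buffer_;
  bool overflowed_ = false;
};
```

<div>A caller would treat an overflowed buffer as "too long to be useful" and keep the mangled name instead of the partial output.</div>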
<div><br class="">
</div>
<div>If you can get the new demangler features (AST + demangling)
into <span style="font-family: Menlo-Regular;" class="">llvm::itaniumDemangle,
I will be happy to do the LLDB side of the work.</span></div>
<div><span style="font-family: Menlo-Regular;" class=""><br
class="">
</span></div>
<div>
<blockquote type="cite" class="">
<div class=""><span style="font-family: Menlo-Regular;
font-size: 12px; font-style: normal; font-variant-caps:
normal; font-weight: normal; letter-spacing: normal;
text-align: start; text-indent: 0px; text-transform: none;
white-space: normal; word-spacing: 0px;
-webkit-text-stroke-width: 0px; float: none; display:
inline !important;" class="">I'll ping this thread when
I'm finished with the demangler, then we can hopefully
work out what a good API for LLDB would be.</span><br
style="font-family: Menlo-Regular; font-size: 12px;
font-style: normal; font-variant-caps: normal;
font-weight: normal; letter-spacing: normal; text-align:
start; text-indent: 0px; text-transform: none;
white-space: normal; word-spacing: 0px;
-webkit-text-stroke-width: 0px;" class="">
</div>
</blockquote>
<div><br class="">
</div>
It would be great to put all the functionality into LLVM and
test the functionality in llvm tests. Then I will port over to
LLDB as needed. As Jim said, we want to know the function
basename and whether a function is a C++ method, just a top-level
function, or possibly both. We often don't know just from the
mangling whether foo::bar() is a method or a function, since we don't
know if "foo" is a namespace, but if we have "foo::bar() const",
then we know it is a method.</div>
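<div>The string-chopping approach described here can be sketched roughly as follows. This is a deliberately simplified illustration, not LLDB's actual C++ name parser: <code>NameParts</code> and <code>chopName</code> are hypothetical names, and the code assumes no templates, operators, or function-pointer parameters in the demangled name.</div>

```cpp
#include <cstddef>
#include <string>

// Hypothetical result of chopping up a demangled name.
struct NameParts {
  std::string context;    // "foo" for "foo::bar() const"
  std::string basename;   // "bar" for "foo::bar() const"
  bool definitelyMethod;  // true only when a trailing "const" is present
};

// Simplified sketch: everything before the first '(' is the qualified name,
// and a "const" after the last ')' marks the function as a method.
NameParts chopName(const std::string &demangled) {
  NameParts parts{"", "", false};
  const std::size_t open = demangled.find('(');
  const std::size_t close = demangled.rfind(')');
  const std::string qualified = demangled.substr(0, open);
  const std::size_t sep = qualified.rfind("::");
  if (sep == std::string::npos) {
    parts.basename = qualified;
  } else {
    parts.context = qualified.substr(0, sep);
    parts.basename = qualified.substr(sep + 2);
  }
  if (close != std::string::npos)
    parts.definitelyMethod =
        demangled.find("const", close) != std::string::npos;
  return parts;
}
```

<div>For "foo::bar() const" this yields context "foo", basename "bar", and a definite method; for "foo::bar::baz()" it yields context "foo::bar" and basename "baz", but cannot say whether baz is a method, which is exactly the ambiguity an AST could help with.</div>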
<div><br class="">
</div>
<div>Look forward to seeing what you come up with!</div>
<div><br class="">
</div>
<div>Greg</div>
<div><br class="">
</div>
<div>
<blockquote type="cite" class="">
<div class=""><br style="font-family: Menlo-Regular;
font-size: 12px; font-style: normal; font-variant-caps:
normal; font-weight: normal; letter-spacing: normal;
text-align: start; text-indent: 0px; text-transform: none;
white-space: normal; word-spacing: 0px;
-webkit-text-stroke-width: 0px;" class="">
<span style="font-family: Menlo-Regular; font-size: 12px;
font-style: normal; font-variant-caps: normal;
font-weight: normal; letter-spacing: normal; text-align:
start; text-indent: 0px; text-transform: none;
white-space: normal; word-spacing: 0px;
-webkit-text-stroke-width: 0px; float: none; display:
inline !important;" class="">Thanks,</span><br
style="font-family: Menlo-Regular; font-size: 12px;
font-style: normal; font-variant-caps: normal;
font-weight: normal; letter-spacing: normal; text-align:
start; text-indent: 0px; text-transform: none;
white-space: normal; word-spacing: 0px;
-webkit-text-stroke-width: 0px;" class="">
<span style="font-family: Menlo-Regular; font-size: 12px;
font-style: normal; font-variant-caps: normal;
font-weight: normal; letter-spacing: normal; text-align:
start; text-indent: 0px; text-transform: none;
white-space: normal; word-spacing: 0px;
-webkit-text-stroke-width: 0px; float: none; display:
inline !important;" class="">Erik</span><br
style="font-family: Menlo-Regular; font-size: 12px;
font-style: normal; font-variant-caps: normal;
font-weight: normal; letter-spacing: normal; text-align:
start; text-indent: 0px; text-transform: none;
white-space: normal; word-spacing: 0px;
-webkit-text-stroke-width: 0px;" class="">
<br style="font-family: Menlo-Regular; font-size: 12px;
font-style: normal; font-variant-caps: normal;
font-weight: normal; letter-spacing: normal; text-align:
start; text-indent: 0px; text-transform: none;
white-space: normal; word-spacing: 0px;
-webkit-text-stroke-width: 0px;" class="">
<blockquote type="cite" style="font-family: Menlo-Regular;
font-size: 12px; font-style: normal; font-variant-caps:
normal; font-weight: normal; letter-spacing: normal;
orphans: auto; text-align: start; text-indent: 0px;
text-transform: none; white-space: normal; widows: auto;
word-spacing: 0px; -webkit-text-size-adjust: auto;
-webkit-text-stroke-width: 0px;" class="">Greg<br class="">
<br class="">
<blockquote type="cite" class="">Thanks,<br class="">
Erik<br class="">
<br class="">
<br class="">
On 2018-01-24 6:48 PM, Greg Clayton via lldb-dev wrote:<br
class="">
<blockquote type="cite" class="">I have an issue where I
am debugging a C++ binary that is around 250MB in
size. It contains some mangled names that are crazy:<br
class="">
<br class="">
_ZNK3shk6detail17CallbackPublisherIZNS_5ThrowERKNSt15__exception_ptr13exception_ptrEEUlOT_E_E9SubscribeINS0_9ConcatMapINS0_18CallbackSubscriberIZNS_6GetAllIiNS1_IZZNS_9ConcatMapIZNS_6ConcatIJNS1_IZZNS_3MapIZZNS_7IfEmptyIS9_EEDaS7_ENKUlS6_E_clINS1_IZZNS_4TakeIiEESI_S7_ENKUlS6_E_clINS1_IZZNS_6FilterIZNS_9ElementAtEmEUlS7_E_EESI_S7_ENKUlS6_E_clINS1_IZZNSL_ImEESI_S7_ENKUlS6_E_clINS1_IZNS_4FromINS0_22InfiniteRangeContainerIiEEEESI_S7_EUlS7_E_EEEESI_S6_EUlS7_E_EEEESI_S6_EUlS7_E_EEEESI_S6_EUlS7_E_EEEESI_S6_EUlS7_E_EESI_S7_ENKUlS6_E_clIS14_EESI_S6_EUlS7_E_EERNS1_IZZNSH_IS9_EESI_S7_ENKSK_IS14_EESI_S6_EUlS7_E0_EEEEESI_DpOT_EUlS7_E_EESI_S7_ENKUlS6_E_clINS1_IZNS_5StartIJZNS_4JustIJS19_S1C_EEESI_S1F_EUlvE_ZNS1K_IJS19_S1C_EEESI_S1F_EUlvE0_EEESI_S1F_EUlS7_E_EEEESI_S6_EUlS7_E_EEEESt6vectorIS6_SaIS6_EERKT0_NS_12ElementCountEbEUlS7_E_ZNSD_IiS1Q_EES1T_S1W_S1X_bEUlOS3_E_ZNSD_IiS1Q_EES1T_S1W_S1X_bEUlvE_EES1G_S1O_E25ConcatMapValuesSubscriberEEEDaS7_<br
class="">
<br class="">
This de-mangles to something that is 72MB in size and
takes 280 seconds (try running "time c++filt -n" on
the above string).<br class="">
<br class="">
There are probably many symbols like this in this
binary. Currently lldb will de-mangle all names in the
symbol table so that we can chop up the names so we
know function base names and we might be able to
classify a base name as a method or function for
breakpoint categorization.<br class="">
<br class="">
My question is: how do we work around such issues in
LLDB? A few solutions I can think of:<br class="">
1 - time each name demangle and if it takes too long
somehow stop de-mangling similar symbols or symbols
over a certain length?<br class="">
2 - allow a setting that says "don't de-mangle names
that start with..." and the setting has a list of
prefixes.<br class="">
3 - have a setting that turns off de-mangling symbols
over a certain length all of the time with a default
of something like 256 or 512<br class="">
4 - modify our FastDemangler to abort if the
de-mangled string goes over a certain limit to avoid
bad cases like this...<br class="">
<br class="">
#1 would still mean we get a huge delay (like 280
seconds) when starting to debug this binary, but might
prevent multiple symbols from adding to that delay...<br
class="">
<br class="">
#2 would require debugging once and then
knowing which symbols took a while to de-mangle. If we
time each de-mangle, we can warn that there are large
mangled names and print the mangled name so the user
might know?<br class="">
<br class="">
#3 would disable de-mangling of long names at the risk
of not de-mangling names that are close to the limit<br
class="">
<br class="">
#4 requires that our FastDemangle code can decode the
mangled string. The fast de-mangler currently
aborts on tricky de-mangling and we fall back onto
cxa_demangle from the C++ library which does not
have a cutoff on length...<br class="">
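<br class="">
As a rough sketch of option #3, the check on the mangled length can sit in front of the existing demangler call. The helper name <code>demangleWithCutoff</code> and its default of 512 are assumptions for illustration; <code>abi::__cxa_demangle</code> is the standard Itanium C++ ABI entry point from <code>&lt;cxxabi.h&gt;</code>, which is available with GCC and Clang toolchains but not everywhere.<br class="">

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <cxxabi.h>
#include <string>

// Hypothetical helper: returns the demangled name, or the mangled name
// unchanged if it is longer than maxMangledLength or cannot be demangled.
std::string demangleWithCutoff(const char *mangled,
                               std::size_t maxMangledLength = 512) {
  assert(mangled != nullptr);
  if (std::strlen(mangled) > maxMangledLength)
    return mangled;  // too long: skip demangling entirely

  int status = 0;
  // With a null output buffer, __cxa_demangle mallocs the result,
  // which the caller must release with free().
  char *result = abi::__cxa_demangle(mangled, nullptr, nullptr, &status);
  if (status != 0 || result == nullptr)
    return mangled;

  std::string demangled(result);
  std::free(result);
  return demangled;
}
```

For example, <code>demangleWithCutoff("_Z3foov")</code> returns "foo()", while the same call with a limit of 4 returns the mangled string untouched. As noted above, the cost is that names just over the limit are never demangled.<br class="">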
<br class="">
Can anyone else think of any other solutions?<br
class="">
<br class="">
Greg Clayton<br class="">
<br class="">
</blockquote>
</blockquote>
</blockquote>
</div>
</blockquote>
</div>
<br class="">
</blockquote>
<br>
</body>
</html>