<html>

  <head>

    <meta http-equiv="Content-Type" content="text/html;

      charset=windows-1252">

  </head>

  <body text="#000000" bgcolor="#FFFFFF">

    I've put a WIP patch up here: <a class="moz-txt-link-freetext" href="https://reviews.llvm.org/D44668">https://reviews.llvm.org/D44668</a><br>

    Sorry for the delay!<br>

    Erik<br>

    <br>

    <div class="moz-cite-prefix">On 2018-01-26 3:56 PM, Greg Clayton

      wrote:<br>

    </div>

    <blockquote type="cite"

      cite="mid:997A2F63-CF9B-4246-A5D2-C32A29CB9FD6@gmail.com">

      <meta http-equiv="Content-Type" content="text/html;

        charset=windows-1252">

      <br class="">

      <div>

        <blockquote type="cite" class="">

          <div class="">On Jan 26, 2018, at 8:38 AM, Erik Pilkington

            <<a href="mailto:erik.pilkington@gmail.com" class=""

              moz-do-not-send="true">erik.pilkington@gmail.com</a>>

            wrote:</div>

          <br class="Apple-interchange-newline">

          <div class=""><br style="font-family: Menlo-Regular;

              font-size: 12px; font-style: normal; font-variant-caps:

              normal; font-weight: normal; letter-spacing: normal;

              text-align: start; text-indent: 0px; text-transform: none;

              white-space: normal; word-spacing: 0px;

              -webkit-text-stroke-width: 0px;" class="">

            <br style="font-family: Menlo-Regular; font-size: 12px;

              font-style: normal; font-variant-caps: normal;

              font-weight: normal; letter-spacing: normal; text-align:

              start; text-indent: 0px; text-transform: none;

              white-space: normal; word-spacing: 0px;

              -webkit-text-stroke-width: 0px;" class="">

            <span style="font-family: Menlo-Regular; font-size: 12px;

              font-style: normal; font-variant-caps: normal;

              font-weight: normal; letter-spacing: normal; text-align:

              start; text-indent: 0px; text-transform: none;

              white-space: normal; word-spacing: 0px;

              -webkit-text-stroke-width: 0px; float: none; display:

              inline !important;" class="">On 2018-01-25 1:58 PM, Greg

              Clayton wrote:</span><br style="font-family:

              Menlo-Regular; font-size: 12px; font-style: normal;

              font-variant-caps: normal; font-weight: normal;

              letter-spacing: normal; text-align: start; text-indent:

              0px; text-transform: none; white-space: normal;

              word-spacing: 0px; -webkit-text-stroke-width: 0px;"

              class="">

            <blockquote type="cite" style="font-family: Menlo-Regular;

              font-size: 12px; font-style: normal; font-variant-caps:

              normal; font-weight: normal; letter-spacing: normal;

              orphans: auto; text-align: start; text-indent: 0px;

              text-transform: none; white-space: normal; widows: auto;

              word-spacing: 0px; -webkit-text-size-adjust: auto;

              -webkit-text-stroke-width: 0px;" class="">

              <blockquote type="cite" class="">On Jan 25, 2018, at 10:25

                AM, Erik Pilkington <<a

                  href="mailto:erik.pilkington@gmail.com" class=""

                  moz-do-not-send="true">erik.pilkington@gmail.com</a>>

                wrote:<br class="">

                <br class="">

                Hi,<br class="">

                I'm not at all familiar with LLDB, but I've been doing

                some work on the demangler in libcxxabi. It's still a

                work in progress and I haven't yet copied the changes

                over to ItaniumDemangle, which AFAIK is what lldb uses.

                The demangler in libcxxabi now demangles the symbol you

                attached in 3.31 seconds, instead of 223.54 on my

                machine. I posted an RFC on my work here (<a

                  href="http://lists.llvm.org/pipermail/llvm-dev/2017-June/114448.html"

                  class="" moz-do-not-send="true">http://lists.llvm.org/pipermail/llvm-dev/2017-June/114448.html</a>),

                but basically the new demangler just produces an AST

                then traverses it to print the demangled name.<br

                  class="">

              </blockquote>

              Great to hear the huge speedup in demangling! LLDB

              actually has two demanglers: a fast one that can demangle

              99% of names, and we fall back to ItaniumDemangle which

              can do all names but is really slow. It would be fun to

              compare your new demangler with the fast one and see if we

              can get rid of the fast demangler now.<br class="">

              <blockquote type="cite" class=""><br class="">

                I think a good way of making this even faster is to have

                LLDB consume the AST the demangler produces directly.

                The AST is a better representation of the information

                that LLDB wants, and finishing the demangle and then

                fishing out that information from the output string is

                unfortunate. From the AST, it would be really

                straightforward to just individually print all the

                components of the name that LLDB wants.<br class="">

              </blockquote>

              This would help us to grab the important bits out of the

              mangled name as well. We chop up a demangled name to find

              the base name ("string" for std::string) and the containing

              context ("std::" for std::string), and we check whether we

              can tell if the function is a method (by looking for a

              trailing "const" modifier on the function) or a top-level

              function. The mangling doesn't fully specify what is a

              namespace and what is a class: in "foo::bar::baz()" we

              don't know whether "foo" and "bar" are classes or

              namespaces. So the AST would be great as long as it is

              fast.<br class="">

              <br class="">

              <blockquote type="cite" class="">Most of the time it takes

                to demangle these "symbols from hell" is during the

                printing, after the AST has been parsed, because the

                demangler has to flatten out all the potentially nested

                back references. Just parsing to an AST should be about

                proportional to the strlen of the mangled name. Since

                (AFAIK) LLDB doesn't use some sections of the demangled

                name often (such as parameters), from the AST LLDB could

                lazily decide not to even bother fully demangling some

                sections of the name, then if it ever needs them it

                could parse a new AST and get them from there. I think

                this would largely fix the issue, as most of the time

                these crazy expansions don't occur in the name itself,

                but in the parameters or return type. Even when they do

                appear in the name, it would be possible to do some

                simple name classification (i.e., does this symbol refer

                to a function) or pull out the basename quickly without

                expanding anything at all.<br class="">

                <br class="">

                Any thoughts? I'm really not at all familiar with LLDB,

                so I could have this all wrong!<br class="">

              </blockquote>
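              To illustrate the point above with a toy model (this is not the libcxxabi data structure; the tuple-based AST and the helper names are made up for the example): when back references share subtrees, the AST stays small while the printed name grows exponentially, which is why parsing can be cheap even when printing is not.<br class="">
<pre>
```python
def build_shared(depth):
    """Build a toy AST where each level refers to the previous level twice.

    A node is a (name, children) tuple. Both children are the same object,
    which stands in for a back reference to an earlier part of the name.
    """
    node = ("x", [])
    for _ in range(depth):
        node = ("f", [node, node])
    return node


def count_unique_nodes(root):
    """Count distinct node objects, i.e. the size of the parsed AST."""
    seen = set()
    stack = [root]
    while stack:
        node = stack.pop()
        if id(node) in seen:
            continue
        seen.add(id(node))
        stack.extend(node[1])
    return len(seen)


def printed_length(node, cache=None):
    """Length of the fully expanded text, computed without building it."""
    if cache is None:
        cache = {}
    key = id(node)
    if key not in cache:
        name, children = node
        length = len(name)
        if children:
            # Two parentheses plus one ", " separator between children.
            length += 2 + 2 * (len(children) - 1)
            length += sum(printed_length(child, cache) for child in children)
        cache[key] = length
    return cache[key]
```
</pre>
              In this model a depth of 40 gives an AST of 41 distinct nodes, while the printed form would be 6 * 2**40 - 5 characters long.<br class="">
              <br class="">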

              AST sounds great. We can put this into the class we use to

              chop up C++ names, as that is really our goal.<br class="">

              <br class="">

              So it would be great to do a speed comparison between our

              fast demangler in LLDB (in FastDemangle.cpp/.h) and your

              updated libcxxabi version. If yours is faster, we can remove

              FastDemangle and then update llvm::itaniumDemangle() to use

              your new code.<br class="">

              <br class="">

              ASTs would be great for the C++ name parser.<br class="">

              <br class="">

              Let us know what you are thinking,<br class="">

            </blockquote>

            <br style="font-family: Menlo-Regular; font-size: 12px;

              font-style: normal; font-variant-caps: normal;

              font-weight: normal; letter-spacing: normal; text-align:

              start; text-indent: 0px; text-transform: none;

              white-space: normal; word-spacing: 0px;

              -webkit-text-stroke-width: 0px;" class="">

            <span style="font-family: Menlo-Regular; font-size: 12px;

              font-style: normal; font-variant-caps: normal;

              font-weight: normal; letter-spacing: normal; text-align:

              start; text-indent: 0px; text-transform: none;

              white-space: normal; word-spacing: 0px;

              -webkit-text-stroke-width: 0px; float: none; display:

              inline !important;" class="">Hi Greg,</span><br

              style="font-family: Menlo-Regular; font-size: 12px;

              font-style: normal; font-variant-caps: normal;

              font-weight: normal; letter-spacing: normal; text-align:

              start; text-indent: 0px; text-transform: none;

              white-space: normal; word-spacing: 0px;

              -webkit-text-stroke-width: 0px;" class="">

            <br style="font-family: Menlo-Regular; font-size: 12px;

              font-style: normal; font-variant-caps: normal;

              font-weight: normal; letter-spacing: normal; text-align:

              start; text-indent: 0px; text-transform: none;

              white-space: normal; word-spacing: 0px;

              -webkit-text-stroke-width: 0px;" class="">

            <span style="font-family: Menlo-Regular; font-size: 12px;

              font-style: normal; font-variant-caps: normal;

              font-weight: normal; letter-spacing: normal; text-align:

              start; text-indent: 0px; text-transform: none;

              white-space: normal; word-spacing: 0px;

              -webkit-text-stroke-width: 0px; float: none; display:

              inline !important;" class="">I'm almost finished with my

              work on the demangler; hopefully I'll be done within a few

              weeks. Once that's all finished I'll look into exporting

              the AST and comparing it to FastDemangle. I was thinking

              about adding a version of llvm::itaniumDemangle() that

              returns an opaque handle to the AST and defining some

              functions on the LLVM side that take that handle and

              return some extra information. I'd be happy to help out

              with the LLDB side of things too, although it might be

              better if someone more experienced with LLDB did this.</span><br

              style="font-family: Menlo-Regular; font-size: 12px;

              font-style: normal; font-variant-caps: normal;

              font-weight: normal; letter-spacing: normal; text-align:

              start; text-indent: 0px; text-transform: none;

              white-space: normal; word-spacing: 0px;

              -webkit-text-stroke-width: 0px;" class="">

            <br style="font-family: Menlo-Regular; font-size: 12px;

              font-style: normal; font-variant-caps: normal;

              font-weight: normal; letter-spacing: normal; text-align:

              start; text-indent: 0px; text-transform: none;

              white-space: normal; word-spacing: 0px;

              -webkit-text-stroke-width: 0px;" class="">

          </div>

        </blockquote>

        <div><br class="">

        </div>

        Can't wait! The only reason we switched away from the libcxxabi

        demangler in the first place was the poor performance. GDB's

        demangler was 3x faster. Our FastDemangler got us back to the

        speed of the GDB demangler, but it will be great to get back to

        one fast demangler.</div>

      <div><br class="">

      </div>

      <div>It would be great if there were some way to implement a

        demangled-name size cutoff in the demangler, so that if the

        demangled name goes over some maximum size we can just stop

        demangling. No one needs to see a 72MB string, nor would anyone

        ever type in that name.</div>
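      <div><br class="">
      </div>
      <div>A minimal sketch of such a cutoff, written in Python rather than against the real demangler: the toy AST of (name, children) tuples and the names BoundedBuffer, print_node and print_with_cutoff are all hypothetical, not libcxxabi or LLVM API. The idea is only that the printer checks a size budget on every write and abandons the name once the budget is exceeded.</div>
<pre>
```python
class OutputLimitExceeded(Exception):
    """Raised when the printed name would exceed the configured limit."""


class BoundedBuffer:
    """Accumulates output text and refuses to grow past max_size characters."""

    def __init__(self, max_size):
        self.max_size = max_size
        self.parts = []
        self.size = 0

    def write(self, text):
        if self.size + len(text) > self.max_size:
            raise OutputLimitExceeded()
        self.parts.append(text)
        self.size += len(text)

    def value(self):
        return "".join(self.parts)


def print_node(node, out):
    """Print a toy AST node, a (name, children) tuple, as name(child, child)."""
    name, children = node
    out.write(name)
    if children:
        out.write("(")
        for index, child in enumerate(children):
            if index:
                out.write(", ")
            print_node(child, out)
        out.write(")")


def print_with_cutoff(root, max_size):
    """Return the printed name, or None if it would exceed max_size."""
    out = BoundedBuffer(max_size)
    try:
        print_node(root, out)
    except OutputLimitExceeded:
        return None
    return out.value()


def build_nested(depth):
    """Toy 'symbol from hell': each level repeats the previous level twice."""
    node = ("x", [])
    for _ in range(depth):
        node = ("f", [node, node])
    return node
```
</pre>
      <div>Because the check happens on each write, a name whose full expansion would be enormous is given up on once roughly max_size characters have been produced, instead of after the whole string has been built.</div>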

      <div><br class="">

      </div>

      <div>If you can get the new demangler features (AST + demangling)

        into <span style="font-family: Menlo-Regular;" class="">llvm::itaniumDemangle,

          I will be happy to do the LLDB side of the work.</span></div>

      <div><span style="font-family: Menlo-Regular;" class=""><br

            class="">

        </span></div>

      <div>

        <blockquote type="cite" class="">

          <div class=""><span style="font-family: Menlo-Regular;

              font-size: 12px; font-style: normal; font-variant-caps:

              normal; font-weight: normal; letter-spacing: normal;

              text-align: start; text-indent: 0px; text-transform: none;

              white-space: normal; word-spacing: 0px;

              -webkit-text-stroke-width: 0px; float: none; display:

              inline !important;" class="">I'll ping this thread when

              I'm finished with the demangler, then we can hopefully

              work out what a good API for LLDB would be.</span><br

              style="font-family: Menlo-Regular; font-size: 12px;

              font-style: normal; font-variant-caps: normal;

              font-weight: normal; letter-spacing: normal; text-align:

              start; text-indent: 0px; text-transform: none;

              white-space: normal; word-spacing: 0px;

              -webkit-text-stroke-width: 0px;" class="">

          </div>

        </blockquote>

        <div><br class="">

        </div>

        It would be great to put all the functionality into LLVM and

        test it in the LLVM tests. Then I will port it over to LLDB as

        needed. As Jim said, we want to know the function basename and

        whether a function is a C++ method, just a top-level function,

        or possibly either. We often can't tell from the mangling alone

        whether foo::bar() is a method or a function, since we don't

        know if "foo" is a namespace, but if we have "foo::bar() const",

        then we know it is a method.</div>
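      <div><br class="">
      </div>
      <div>A rough sketch of the queries described above, assuming a hypothetical AST node (FunctionName and its fields are invented for illustration and are not the demangler's real node types): once the parser records the scopes, basename and const qualifier separately, the basename, the context and the method/function classification become simple lookups instead of string chopping.</div>
<pre>
```python
from dataclasses import dataclass, field


@dataclass
class FunctionName:
    """Hypothetical AST root for a function symbol."""
    scopes: list                                  # e.g. ["foo", "bar"]
    basename: str                                 # e.g. "baz"
    params: list = field(default_factory=list)    # left unexpanded here
    is_const: bool = False                        # trailing "const" qualifier


def context(ast):
    """Containing context, e.g. "foo::bar" for foo::bar::baz()."""
    return "::".join(ast.scopes)


def classify(ast):
    """Classify using only what the mangling can tell us."""
    if ast.is_const:
        # Only member functions can carry a trailing const qualifier.
        return "method"
    if not ast.scopes:
        # No enclosing scope at all, so it cannot be a member function.
        return "function"
    # The enclosing scope could be a namespace or a class.
    return "method-or-function"
```
</pre>
      <div>None of these queries touch the parameter list, so the expensive part of the name never has to be printed to answer them.</div>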

      <div><br class="">

      </div>

      <div>Look forward to seeing what you come up with!</div>

      <div><br class="">

      </div>

      <div>Greg</div>

      <div><br class="">

      </div>

      <div>

        <blockquote type="cite" class="">

          <div class=""><br style="font-family: Menlo-Regular;

              font-size: 12px; font-style: normal; font-variant-caps:

              normal; font-weight: normal; letter-spacing: normal;

              text-align: start; text-indent: 0px; text-transform: none;

              white-space: normal; word-spacing: 0px;

              -webkit-text-stroke-width: 0px;" class="">

            <span style="font-family: Menlo-Regular; font-size: 12px;

              font-style: normal; font-variant-caps: normal;

              font-weight: normal; letter-spacing: normal; text-align:

              start; text-indent: 0px; text-transform: none;

              white-space: normal; word-spacing: 0px;

              -webkit-text-stroke-width: 0px; float: none; display:

              inline !important;" class="">Thanks,</span><br

              style="font-family: Menlo-Regular; font-size: 12px;

              font-style: normal; font-variant-caps: normal;

              font-weight: normal; letter-spacing: normal; text-align:

              start; text-indent: 0px; text-transform: none;

              white-space: normal; word-spacing: 0px;

              -webkit-text-stroke-width: 0px;" class="">

            <span style="font-family: Menlo-Regular; font-size: 12px;

              font-style: normal; font-variant-caps: normal;

              font-weight: normal; letter-spacing: normal; text-align:

              start; text-indent: 0px; text-transform: none;

              white-space: normal; word-spacing: 0px;

              -webkit-text-stroke-width: 0px; float: none; display:

              inline !important;" class="">Erik</span><br

              style="font-family: Menlo-Regular; font-size: 12px;

              font-style: normal; font-variant-caps: normal;

              font-weight: normal; letter-spacing: normal; text-align:

              start; text-indent: 0px; text-transform: none;

              white-space: normal; word-spacing: 0px;

              -webkit-text-stroke-width: 0px;" class="">

            <br style="font-family: Menlo-Regular; font-size: 12px;

              font-style: normal; font-variant-caps: normal;

              font-weight: normal; letter-spacing: normal; text-align:

              start; text-indent: 0px; text-transform: none;

              white-space: normal; word-spacing: 0px;

              -webkit-text-stroke-width: 0px;" class="">

            <blockquote type="cite" style="font-family: Menlo-Regular;

              font-size: 12px; font-style: normal; font-variant-caps:

              normal; font-weight: normal; letter-spacing: normal;

              orphans: auto; text-align: start; text-indent: 0px;

              text-transform: none; white-space: normal; widows: auto;

              word-spacing: 0px; -webkit-text-size-adjust: auto;

              -webkit-text-stroke-width: 0px;" class="">Greg<br class="">

              <br class="">

              <blockquote type="cite" class="">Thanks,<br class="">

                Erik<br class="">

                <br class="">

                <br class="">

                On 2018-01-24 6:48 PM, Greg Clayton via lldb-dev wrote:<br

                  class="">

                <blockquote type="cite" class="">I have an issue where I

                  am debugging a C++ binary that is around 250MB in

                  size. It contains some mangled names that are crazy:<br

                    class="">

                  <br class="">

_ZNK3shk6detail17CallbackPublisherIZNS_5ThrowERKNSt15__exception_ptr13exception_ptrEEUlOT_E_E9SubscribeINS0_9ConcatMapINS0_18CallbackSubscriberIZNS_6GetAllIiNS1_IZZNS_9ConcatMapIZNS_6ConcatIJNS1_IZZNS_3MapIZZNS_7IfEmptyIS9_EEDaS7_ENKUlS6_E_clINS1_IZZNS_4TakeIiEESI_S7_ENKUlS6_E_clINS1_IZZNS_6FilterIZNS_9ElementAtEmEUlS7_E_EESI_S7_ENKUlS6_E_clINS1_IZZNSL_ImEESI_S7_ENKUlS6_E_clINS1_IZNS_4FromINS0_22InfiniteRangeContainerIiEEEESI_S7_EUlS7_E_EEEESI_S6_EUlS7_E_EEEESI_S6_EUlS7_E_EEEESI_S6_EUlS7_E_EEEESI_S6_EUlS7_E_EESI_S7_ENKUlS6_E_clIS14_EESI_S6_EUlS7_E_EERNS1_IZZNSH_IS9_EESI_S7_ENKSK_IS14_EESI_S6_EUlS7_E0_EEEEESI_DpOT_EUlS7_E_EESI_S7_ENKUlS6_E_clINS1_IZNS_5StartIJZNS_4JustIJS19_S1C_EEESI_S1F_EUlvE_ZNS1K_IJS19_S1C_EEESI_S1F_EUlvE0_EEESI_S1F_EUlS7_E_EEEESI_S6_EUlS7_E_EEEESt6vectorIS6_SaIS6_EERKT0_NS_12ElementCountEbEUlS7_E_ZNSD_IiS1Q_EES1T_S1W_S1X_bEUlOS3_E_ZNSD_IiS1Q_EES1T_S1W_S1X_bEUlvE_EES1G_S1O_E25ConcatMapValuesSubscriberEEEDaS7_<br

                    class="">

                  <br class="">

                  This de-mangles to something that is 72MB in size and

                  takes 280 seconds (try running "time c++filt -n" on

                  the above string).<br class="">

                  <br class="">

                  There are probably many symbols like this in this

                  binary. Currently lldb will de-mangle all names in the

                  symbol table so that we can chop up the names so we

                  know function base names and we might be able to

                  classify a base name as a method or function for

                  breakpoint categorization.<br class="">

                  <br class="">

                  My question is: how do we work around such issues in

                  LLDB? A few solutions I can think of:<br class="">

                  1 - time each name demangle and if it takes too long

                  somehow stop de-mangling similar symbols or symbols

                  over a certain length?<br class="">

                  2 - allow a setting that says "don't de-mangle names

                  that start with..." and the setting has a list of

                  prefixes.<br class="">

                  3 - have a setting that turns off de-mangling symbols

                  over a certain length all of the time with a default

                  of something like 256 or 512<br class="">

                  4 - modify our FastDemangler to abort if the

                  de-mangled string goes over a certain limit to avoid

                  bad cases like this...<br class="">

                  <br class="">

                  #1 would still mean we get a huge delay (like 280

                  seconds) when starting to debug this binary, but might

                  prevent multiple symbols from adding to that delay...<br

                    class="">

                  <br class="">

                  #2 would require debugging once and then

                  knowing which symbols took a while to de-mangle. If we

                  time each de-mangle, we can warn that there are large

                  mangled names and print the mangled name so the user

                  might know?<br class="">

                  <br class="">

                  #3 would disable de-mangling of long names at the risk

                  of not de-mangling names that are close to the limit<br

                    class="">

                  <br class="">

                  #4 requires that our FastDemangle code can decode the

                  mangled string. The fast de-mangler currently aborts

                  on tricky de-mangling and we fall back onto

                  cxa_demangle from the C++ library, which does not

                  have a cutoff on length...<br class="">
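                  <br class="">
                  Options 2 and 3 above amount to a cheap pre-filter applied before any demangling is attempted. A minimal sketch, assuming a hypothetical should_demangle helper and the 512-character default suggested above (neither is an existing LLDB setting):<br class="">
<pre>
```python
def should_demangle(mangled, max_length=512, skip_prefixes=()):
    """Decide whether a mangled name is worth handing to the demangler.

    max_length models option 3 (skip very long mangled names) and
    skip_prefixes models option 2 (a user-supplied list of prefixes).
    """
    if len(mangled) > max_length:
        return False
    return not any(mangled.startswith(prefix) for prefix in skip_prefixes)
```
</pre>
                  As noted for #3, a fixed length limit also skips long names that would have demangled quickly, so the threshold is a trade-off.<br class="">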

                  <br class="">

                  Can anyone else think of any other solutions?<br

                    class="">

                  <br class="">

                  Greg Clayton<br class="">

                  <br class="">

                </blockquote>

              </blockquote>

            </blockquote>

          </div>

        </blockquote>

      </div>

      <br class="">

    </blockquote>

    <br>

  </body>

</html>