[cfe-dev] [lldb-dev] C++ method declaration parsing

Eugene Zemtsov via cfe-dev cfe-dev at lists.llvm.org
Wed Mar 15 19:42:39 PDT 2017


Yes, it's a good idea to add cfe-dev.
It is totally possible that I overlooked something and clang can help with
this kind of superficial parsing.

As far as I can see, even clang-format does its own parsing
(UnwrappedLineParser.cpp), and clang-format has a very similar need: a rough
understanding of code without knowing any context.

> are you certain that clang's parser would be unacceptably slow?

I don't have any perf numbers to back it up, but it does look like a lot of
clang infrastructure needs to be set up before actual parsing begins (see
lldb_private::ClangExpressionParser). It's not important though, as at this
stage I don't see how we can reuse clang's parser at all.



On Wed, Mar 15, 2017 at 5:03 PM, Zachary Turner <zturner at google.com> wrote:

> If there is any way to re-use clang parser for this, it would be
> wonderful.  Even if it means adding support to clang for whatever you need
> in order to make it possible.  You mention performance, are you certain
> that clang's parser would be unacceptably slow?
>
> +cfe-dev as they may have some more input on what it would take to extend
> clang to make this possible.
>
> On Wed, Mar 15, 2017 at 4:48 PM Eugene Zemtsov via lldb-dev <
> lldb-dev at lists.llvm.org> wrote:
>
>> Hi, Everyone.
>>
>> The current implementation of CPlusPlusLanguage::MethodName::Parse() doesn't
>> cover the full extent of possible function declarations,
>> or even the declarations returned by abi::__cxa_demangle.
>>
>> Consider this code:
>> --------------------------------------------------
>>
>> #include <stdio.h>
>> #include <functional>
>> #include <vector>
>>
>> void func() {
>>   printf("func() was called\n");
>> }
>>
>> struct Class
>> {
>>   Class() {
>>     printf("ctor was called\n");
>>   }
>>
>>   Class(const Class& c) {
>>     printf("copy ctor was called\n");
>>   }
>>
>>   ~Class() {
>>     printf("dtor was called\n");
>>   }
>> };
>>
>>
>> int main() {
>>   std::function<void()> f = func;
>>   f();
>>
>>   Class c;
>>   std::vector<Class> v;
>>   v.push_back(c);
>>
>>   return 0;
>> }
>>
>> --------------------------------------------------
>>
>> When compiled, it has at least two symbols that currently cannot be
>> correctly parsed by MethodName::Parse().
>>
>> void std::vector<Class, std::allocator<Class> >::_M_emplace_back_aux<Class const&>(Class const&)
>> void (* const&std::_Any_data::_M_access<void (*)()>() const)() - a template function that returns a reference to a function pointer.
>>
>> This causes incorrect behavior in avoid-stepping and sometimes messes up
>> the printing of thread backtraces.
>>
>> I would like to solve this issue, but the current implementation of method
>> name parsing doesn't seem sustainable.
>> Clever substrings and regexes are fine for trivial cases, but they become
>> a nightmare once we consider more complex cases.
>> That's why I'd like to have code that follows some kind of grammar
>> describing function declarations.
>>
>> As I see it, the choices for a new implementation of MethodName::Parse() are:
>> 1. Reuse clang parsing code.
>> 2. Parser generated by bison.
>> 3. Handwritten recursive descent parser.
>>
>> I looked at option #1, and it appears to be impossible to reuse the clang
>> parser for this kind of zero-context parsing,
>> especially given that we care about the performance of this code. The clang
>> C++ lexer, on the other hand, can be reused.
>>
>> Option #2. Using bison is tempting, but it would require introducing a
>> new compile-time dependency.
>> That might be especially inconvenient on Windows.
>>
>> That's why I think option #3 is the way to go: a recursive descent parser
>> that reuses the C++ lexer from clang.
>>
>> LLDB doesn't need to parse everything (e.g. we don't care about details
>> of function arguments), but it needs to be able to handle tricky return
>> types and base names.
>> Eventually the new implementation should be able to parse the signature of
>> every method generated by the STL.
>>
>> Before starting the implementation, I'd love to get some feedback. It might
>> be that I'm overlooking something important.
>>
>> --
>> Thanks,
>> Eugene Zemtsov.
>>
>


-- 
Thanks,
Eugene Zemtsov.