[cfe-dev] Recursive Descent Parser

Nico Weber thakis at chromium.org
Wed Sep 24 22:09:48 PDT 2014


If you only care about C: long ago, clang's Parser talked to an abstract
"Action" interface, and Sema was only one possible implementation of it.
There also used to be a ParserPrintActions that could be requested via
-parse-print-callbacks. It was deleted in this commit:
http://lists.cs.uiuc.edu/pipermail/cfe-commits/Week-of-Mon-20100719/032534.html
If you check out anything older than r109391, you can play with it.
It might do what you want.
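
To make the Parser/Action split concrete, here is a minimal sketch in C of
the same idea: a toy parser reports what it sees through a table of
callbacks, and two interchangeable implementations consume them. All names
here (Actions, parse_declarators, print_declarator, count_declarator) are
hypothetical and are not clang's actual interface, which was a C++ class.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical callback table standing in for an abstract "Action" interface. */
typedef struct {
    void *ctx; /* state owned by the implementation */
    void (*act_on_declarator)(void *ctx, const char *name, size_t len);
} Actions;

/* Toy "parser": splits a comma-separated identifier list and reports each
   identifier through the callbacks. It knows nothing about what they do. */
void parse_declarators(const char *src, const Actions *actions) {
    assert(src != NULL && actions != NULL);
    while (*src != '\0') {
        size_t len = strcspn(src, ",");
        if (len > 0)
            actions->act_on_declarator(actions->ctx, src, len);
        src += len;
        if (*src == ',')
            src++;
    }
}

/* Implementation 1: print every callback, in the spirit of a tracing action. */
void print_declarator(void *ctx, const char *name, size_t len) {
    (void)ctx;
    printf("act_on_declarator: %.*s\n", (int)len, name);
}

/* Implementation 2: a stand-in for semantic analysis that only counts names. */
void count_declarator(void *ctx, const char *name, size_t len) {
    (void)name;
    (void)len;
    ++*(int *)ctx;
}
```

Passing { NULL, print_declarator } to parse_declarators prints one line per
identifier, while { &count, count_declarator } leaves the number of
identifiers in count. The parser code is the same in both cases.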

C++ made it necessary to get rid of this separation.
http://llvm.org/viewvc/llvm-project/cfe/trunk/include/clang/Sema/Action.h?view=log&pathrev=112243
shows how the Action class was slowly whittled away, finally getting deleted
in r112244 with a peculiar commit message.


On Wed, Sep 24, 2014 at 8:25 PM, Raghavan <raghavan97 at yahoo.co.in> wrote:

> Hi,
>
> I just wanted to see how the recursive descent parser functions get
> activated. This is more for showing students how a recursive descent parser
> works in a production compiler rather than checking correctness.
>
> I had done a similar thing for the 'cc1' in gcc by compiling 'cc1' using
> the '-finstrument-functions' option. This gives me control each time a
> function gets called or exits. Using that I was able to generate something
> like this.
>
>
>
> # The input C source file
> $ cat -n test3.c
>      1  int var1,var2;
>
> # The output from my instrumented 'cc1' compiler when I compile the above file.
>
> { enter c_parser_translation_unit
>    { enter c_parser_external_declaration
>       { enter c_parser_declaration_or_fndef
>          { enter c_parser_declspecs
>             { enter c_parser_consume_token
>                Token No:1 Lexeme:'int' Type:CPP_NAME
>             } exit c_parser_consume_token
>          } exit c_parser_declspecs
>          { enter c_parser_declarator
>             { enter c_parser_direct_declarator
>                { enter c_parser_consume_token
>                   Token No:2 Lexeme:'var1' Type:CPP_NAME
>                } exit c_parser_consume_token
>                { enter c_parser_direct_declarator_inner
>                } exit c_parser_direct_declarator_inner
>             } exit c_parser_direct_declarator
>          } exit c_parser_declarator
>          { enter c_parser_consume_token
>             Token No:3 Lexeme:',' Type:CPP_COMMA
>          } exit c_parser_consume_token
>          { enter c_parser_declarator
>             { enter c_parser_direct_declarator
>                { enter c_parser_consume_token
>                   Token No:4 Lexeme:'var2' Type:CPP_NAME
>                } exit c_parser_consume_token
>                { enter c_parser_direct_declarator_inner
>                } exit c_parser_direct_declarator_inner
>             } exit c_parser_direct_declarator
>          } exit c_parser_declarator
>          { enter c_parser_consume_token
>             Token No:5 Lexeme:';' Type:CPP_SEMICOLO
>          } exit c_parser_consume_token
>       } exit c_parser_declaration_or_fndef
>    } exit c_parser_external_declaration
> } exit c_parser_translation_unit
>
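
For readers who want to try the -finstrument-functions technique described
above, the following is a minimal sketch of the two hooks that gcc calls on
function entry and exit when that option is used. It is not Raghavan's
actual code: the trace_depth counter is a made-up name, and this version
prints raw function addresses rather than names.

```c
#include <assert.h>
#include <stdio.h>

/* Current nesting depth of the trace (hypothetical name), used for indentation. */
static int trace_depth = 0;

/* The hooks themselves must not be instrumented, or they would recurse. */
void __cyg_profile_func_enter(void *this_fn, void *call_site)
    __attribute__((no_instrument_function));
void __cyg_profile_func_exit(void *this_fn, void *call_site)
    __attribute__((no_instrument_function));

/* Called on entry to every instrumented function. */
void __cyg_profile_func_enter(void *this_fn, void *call_site) {
    (void)call_site;
    assert(trace_depth >= 0);
    fprintf(stderr, "%*s{ enter %p\n", trace_depth * 3, "", this_fn);
    trace_depth++;
}

/* Called just before every instrumented function returns. */
void __cyg_profile_func_exit(void *this_fn, void *call_site) {
    (void)call_site;
    if (trace_depth > 0)
        trace_depth--;
    fprintf(stderr, "%*s} exit %p\n", trace_depth * 3, "", this_fn);
}
```

Compile these hooks together with the code to be traced, passing
-finstrument-functions. Turning the printed addresses into function names
(for example with addr2line or dladdr) is left out of this sketch.
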
> It maps closely to the following productions from the standard at
> http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1570.pdf. The section
> numbers given in the table below correspond to sections of the same
> document.
>
>
>
> Production                                                                   Section
> translation-unit       : external-declaration                                6.9
> external-declaration   : declaration                                         6.9
> external-declaration   : function-definition                                 6.9
> declaration            : declaration-specifiers init-declarator-list(opt) ;  6.7
> declaration-specifiers : type-specifier declaration-specifiers(opt)          6.7
> type-specifier         : int                                                 6.7.2
> init-declarator-list   : init-declarator                                     6.7
>                        : init-declarator-list , init-declarator              6.7
> init-declarator        : declarator                                          6.7
> declarator             : pointer(opt) direct-declarator                      6.7.6
> direct-declarator      : identifier                                          6.7.6
>
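
A hand-written parser for just these productions shows where a trace like
the one quoted earlier comes from. The following is a minimal sketch in C,
not code from gcc or clang. Every name in it is made up for illustration.
Each function corresponds to one production, the left-recursive
init-declarator-list becomes a loop, and the optional pointer part and
function definitions are left out.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical toy parser state, not taken from gcc or clang. */
typedef struct {
    const char *p;  /* next unread character of the input */
    int depth;      /* nesting depth, used to indent the trace */
    int declared;   /* identifiers declared so far */
} Parser;

static void enter(Parser *ps, const char *rule) {
    printf("%*s{ enter %s\n", ps->depth * 3, "", rule);
    ps->depth++;
}

static int leave(Parser *ps, const char *rule, int ok) {
    assert(ps->depth > 0);
    ps->depth--;
    printf("%*s} exit %s\n", ps->depth * 3, "", rule);
    return ok;
}

static int is_ident_char(char c) {
    return isalnum((unsigned char)c) || c == '_';
}

static void skip_space(Parser *ps) {
    while (isspace((unsigned char)*ps->p)) ps->p++;
}

/* Consume the single punctuation character c if it comes next. */
static int match(Parser *ps, char c) {
    skip_space(ps);
    if (*ps->p != c) return 0;
    ps->p++;
    return 1;
}

/* type-specifier : int */
static int type_specifier(Parser *ps) {
    enter(ps, "type-specifier");
    skip_space(ps);
    int ok = strncmp(ps->p, "int", 3) == 0 && !is_ident_char(ps->p[3]);
    if (ok) ps->p += 3;
    return leave(ps, "type-specifier", ok);
}

/* direct-declarator : identifier */
static int direct_declarator(Parser *ps) {
    enter(ps, "direct-declarator");
    skip_space(ps);
    int ok = isalpha((unsigned char)*ps->p) || *ps->p == '_';
    if (ok) {
        while (is_ident_char(*ps->p)) ps->p++;
        ps->declared++;
    }
    return leave(ps, "direct-declarator", ok);
}

/* declarator : direct-declarator (the optional pointer is omitted here) */
static int declarator(Parser *ps) {
    enter(ps, "declarator");
    int ok = direct_declarator(ps);
    return leave(ps, "declarator", ok);
}

/* init-declarator-list : declarator { , declarator } */
static int init_declarator_list(Parser *ps) {
    enter(ps, "init-declarator-list");
    int ok = declarator(ps);
    while (ok && match(ps, ','))
        ok = declarator(ps);
    return leave(ps, "init-declarator-list", ok);
}

/* declaration : type-specifier [ init-declarator-list ] ; */
static int declaration(Parser *ps) {
    enter(ps, "declaration");
    int ok = type_specifier(ps);
    if (ok && !match(ps, ';'))
        ok = init_declarator_list(ps) && match(ps, ';');
    return leave(ps, "declaration", ok);
}

/* translation-unit : one or more declarations; returns 1 if all input parses. */
int parse_translation_unit(Parser *ps) {
    enter(ps, "translation-unit");
    int ok;
    do {
        ok = declaration(ps);
        skip_space(ps);
    } while (ok && *ps->p != '\0');
    return leave(ps, "translation-unit", ok);
}
```

Initializing a Parser with { "int var1,var2;", 0, 0 } and calling
parse_translation_unit on it prints a nested enter/exit trace and leaves 2
in the declared field. Because the functions follow the grammar rather than
gcc's internal structure, the trace uses production names instead of
c_parser_* function names.
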
>
>
> Thanks.
>
> Bye,
> Raghavan V
>
>
>
> From: Nikola Smiljanic [mailto:popizdeh at gmail.com]
> Sent: Thursday, September 25, 2014 3:33 AM
> To: Daniel Dilts
> Cc: Raghavan; cfe-dev Developers
> Subject: Re: [cfe-dev] Recursive Descent Parser
>
> I'm not sure what Raghavan is trying to achieve, but I don't think the graph
> would be very helpful. Grammar productions are available in Annex A of the
> standard (if that's all he's after), but AFAIK clang's parser doesn't map
> 1:1 to them.
>
> On Thu, Sep 25, 2014 at 7:47 AM, Daniel Dilts <diltsman at gmail.com> wrote:
> I suppose that someone could write a utility using Clang to dump the call
> graph starting at the base rule.
>
>
> On Wed, Sep 24, 2014 at 2:31 PM, Nikola Smiljanic <popizdeh at gmail.com>
> wrote:
> I don't think there is. Have a look at clang::ParseAST. It keeps track of
> the call stack in case of a crash. It also keeps track of some statistics, but
> I think that's all there is.
>
> On Sun, Sep 21, 2014 at 9:55 AM, Raghavan <raghavan97 at yahoo.co.in> wrote:
> Hi,
>
> I am a newbie to clang and LLVM.
>
> I realize that clang uses recursive descent parsing when compiling a C
> file.
>
> I was wondering if there is any way to print the production rules or any
> kind of parser-related details while clang parses a C file.
>
>
> Thanks.
>
> Bye,
> Raghavan V
>
>
> _______________________________________________
> cfe-dev mailing list
> cfe-dev at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
>