[cfe-dev] No clang::Action no party

Wed Sep 22 13:41:53 PDT 2010

  On 09/21/2010 03:50 PM, Douglas Gregor wrote:
> On Sep 21, 2010, at 6:16 AM, Simone Pellegrini wrote:
>> I actually do not agree with this statement. Clang actually does a very poor job (no offense) when it comes to pragmas. I actually spent some time in designing a framework to handle pragmas on top of clang and there are a couple of core changes which I need to do and for that I had to patch clang.
>>
>> For example in OpenMP the standard allows C expressions to be written as part of the pragmas. For example someone can write:
>>
>> #pragma omp parallel omp_threads(3*2+1)
>>
>> For anyone interested in implement the full OpenMP standard this would require to re-implement a parser for C expressions (among other things), and this seems a duplication of work since the Clang parser already does it pretty well. In order to do that I had to make my pragma handler class friend of the clang::Parser; in this way I can directly use the private ParseExpression() method.
> OpenMP is a somewhat extreme example, because it *does* involve full expressions and many deep tie-ins with the AST, ultimately affecting IR generation as well. I think it would be great if the pragma interface could be extended to handle OpenMP, because that means that many other pragma handlers would also be possible.

To make pragma processing more useful in clang, there are two main
aspects which need to be improved.

The first is to give the user a way to specify new pragmas without
reinventing the wheel every time, and to let them call the Clang parser
directly to parse complex expressions without having to deal with
low-level implementation details of the lexer/parser. To address this
we developed a mechanism which allows a new pragma to be defined the
way Boost.Spirit does it. We borrowed some of its concepts, but we wrote
from scratch a parser generator which works closely with clang's lexer
(and parser).

The main idea is the following. Suppose you want to define a new pragma
with this grammar:
#pragma mypragma ((awesomeness = (yes | no)) | (digit (',' digit)*))

so that you can write things like this in your code:
#pragma mypragma awesomeness = yes
or
#pragma mypragma 2,3,4

We let the user specify the grammar declaratively, building up a parse
tree simply by concatenating expressions:
auto matcher = (
                   ( kwd("awesomeness") >> equal >> ( kwd("yes") | kwd("no") ) )
                 |
                   ( numeric_constant >> *(comma >> numeric_constant) )
               ) >> eom;

This object is passed to a pragma handler. When the parser calls the
handler, the Preprocessor is passed to the object, which tries to match
the rule by consuming tokens. Things can get more complicated than
this, since there are several other operators (!, +, ~). If the pragma
is matched, the object creates a map for you where you can find all the
parsed information in the form of strings or Clang AST nodes.
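
To make the combinator idea concrete, here is a minimal sketch of how
such operators can be built. It is not our actual implementation: the
names Tokens, Matcher, tok and matches are made up for illustration, the
token stream is a plain vector of strings instead of clang's
Preprocessor, and no result map is filled in. Alternatives are tried in
order, without backtracking into an earlier alternative.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the token stream; the real framework pulls
// tokens from clang's Preprocessor instead.
using Tokens = std::vector<std::string>;

// A Matcher tries to match tokens starting at `pos`. On success it
// advances `pos` and returns true; on failure `pos` is left unchanged.
struct Matcher {
    std::function<bool(const Tokens&, std::size_t&)> run;
};

// Match one exact token (keyword or punctuation).
inline Matcher tok(std::string text) {
    return {[text = std::move(text)](const Tokens& ts, std::size_t& pos) {
        if (pos < ts.size() && ts[pos] == text) { ++pos; return true; }
        return false;
    }};
}

// Match a token consisting only of decimal digits.
inline Matcher numeric_constant() {
    return {[](const Tokens& ts, std::size_t& pos) {
        if (pos >= ts.size() || ts[pos].empty()) return false;
        for (char c : ts[pos])
            if (c < '0' || c > '9') return false;
        ++pos;
        return true;
    }};
}

// Match the end of the pragma line (every token has been consumed).
inline Matcher eom() {
    return {[](const Tokens& ts, std::size_t& pos) {
        return pos == ts.size();
    }};
}

// Sequence: a followed by b; restore the position if either part fails.
inline Matcher operator>>(Matcher a, Matcher b) {
    return {[a, b](const Tokens& ts, std::size_t& pos) {
        std::size_t save = pos;
        if (a.run(ts, pos) && b.run(ts, pos)) return true;
        pos = save;
        return false;
    }};
}

// Ordered choice: try a, and only if it fails try b.
inline Matcher operator|(Matcher a, Matcher b) {
    return {[a, b](const Tokens& ts, std::size_t& pos) {
        return a.run(ts, pos) || b.run(ts, pos);
    }};
}

// Kleene star: zero or more repetitions of a (always succeeds).
inline Matcher operator*(Matcher a) {
    return {[a](const Tokens& ts, std::size_t& pos) {
        std::size_t before = pos;
        while (a.run(ts, pos) && pos != before) before = pos;
        return true;
    }};
}

// Run a matcher from the first token.
inline bool matches(const Matcher& m, const Tokens& ts) {
    std::size_t pos = 0;
    return m.run(ts, pos);
}

// The grammar of "#pragma mypragma" from the example above.
inline Matcher my_pragma() {
    return ( (tok("awesomeness") >> tok("=") >> (tok("yes") | tok("no")))
           | (numeric_constant() >> *(tok(",") >> numeric_constant()))
           ) >> eom();
}
```

With this sketch, matches(my_pragma(), {"2", ",", "3", ",", "4"})
succeeds, while a trailing comma or an unknown keyword makes it fail.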

The second part of the problem is the association of pragmas with AST
nodes (statements or declarations). We solved it by having the pragma
handler call a method we added to the (old) Action interface:

template <class PragmaTy>
ActOnPragma(SourceLocation start, SourceLocation end, MatcherMap mmap);

When this method is called, Sema creates an object of type PragmaTy
and stores it internally in a list of pending pragmas, i.e. pragmas
which have not yet found their correct placement. Here there is a
tricky issue related to the way Sema creates the AST nodes: it is not
always true that the next statement created by Sema is the one that
has to be attached to the most recent pending pragma. For example, in
the following case:

#pragma omp parallel
{
     int a = 0;
}

Sema will parse the pragma, then create a DeclStmt for int a... and
only at the end will it build the CompoundStmt, which is the one we
want to attach to the pragma.
In order to solve the problem we overloaded a couple of methods in Sema
(like ActOnCompoundStmt or ActOnForStmt...). What we basically do is
filter the list of pending pragmas for those which are within the range
of the statement, and then match those pragmas against the statements
inside that range. Quite easy, though.
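
The filtering step can be sketched roughly as follows. This is only an
illustration under simplifying assumptions: SourceRange here is a pair
of plain offsets rather than clang's SourceLocation pair, and
PendingPragmas, actOnPragma and takeInside are hypothetical names, not
the code in our patch.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins: plain offsets replace clang::SourceLocation.
struct SourceRange { unsigned begin, end; };
struct Pragma { std::string name; SourceRange range; };

struct PendingPragmas {
    std::vector<Pragma> pending;

    // Called by the pragma handler: remember the pragma until a
    // statement that can own it has been built.
    void actOnPragma(Pragma p) { pending.push_back(std::move(p)); }

    // Called when a statement covering `stmt` has just been built:
    // remove and return the pending pragmas lying inside its range.
    // Pragmas outside the range stay pending for an enclosing statement.
    std::vector<Pragma> takeInside(SourceRange stmt) {
        std::vector<Pragma> inside, rest;
        for (auto& p : pending) {
            bool in = p.range.begin >= stmt.begin &&
                      p.range.end <= stmt.end;
            (in ? inside : rest).push_back(std::move(p));
        }
        pending = std::move(rest);
        return inside;
    }
};
```

Only the pragmas returned by takeInside need to be matched against the
children of the statement, which keeps the number of checks small.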

The only tricky part is dealing with situations like the following:

{
     int a;
     #pragma omp barrier
}

where we modify the structure of the CompoundStmt by adding a NullStmt,
so that we can match the pragma with it:

{
     int a;
     #pragma omp barrier
     ;
}
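
A rough sketch of the matching inside one CompoundStmt, including the
NullStmt case, could look like this. The types and the function
matchInCompound are invented for the example, positions are plain
offsets, and the rule "attach to the first child starting after the
pragma, otherwise append a NullStmt" is my simplification of what the
patch does. It assumes children and pragmas are sorted by position.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical, simplified AST pieces; positions are plain offsets.
struct Stmt { std::string kind; unsigned begin, end; };
struct Pragma { std::string name; unsigned begin, end; };
struct Attachment { std::string pragma; std::size_t child; };

// Attach each pragma to the first child statement that starts after
// the pragma ends. If there is no such child, append a NullStmt right
// after the pragma and attach the pragma to it.
std::vector<Attachment> matchInCompound(std::vector<Stmt>& children,
                                        const std::vector<Pragma>& pragmas) {
    std::vector<Attachment> out;
    for (const auto& p : pragmas) {
        std::size_t i = 0;
        while (i < children.size() && children[i].begin < p.end) ++i;
        if (i == children.size())
            children.push_back({"NullStmt", p.end, p.end});
        out.push_back({p.name, i});
    }
    return out;
}
```

For the barrier example above, the only child is the DeclStmt for
int a, which ends before the pragma, so a NullStmt is appended and the
pragma is attached to it.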

The matching algorithm is actually not that complex; the only question
is whether it is computationally too expensive to satisfy clang's
requirements. I think this was the best solution I could come up with
that is efficient and keeps the impact on the Clang code base to a
minimum (it only requires making one class a friend of the parser).

cheers, Simone P.
>> So, for short, custom pragma handlers need low level access to the Parser! (or some methods in the Parser should be made public).
>>
>> Secondly when the pragma handler is called the attached statement has not yet being created by the parser so the association cannot be done by the handler but someone else has to take care of it (that's the reason why the Action interface was useful).
>> Either someone keeps a list of pragmas and do the matching at the end in the ASTConsumer, or for a more efficient solution which minimize the number of checks Sema should take care of it.
> I'd love to see a general mechanism for this in Sema; most of the time, people want to attach pragmas to statements/expressions/declarations, and having a general way to do that would be great.
>
>> Anyway, if there is an interest in making pragma handling more flexible and powerful in clang I could contribute with some ideas and code (of course).
> I think I better pragma handling mechanism, which makes it easier to tie in with the parser, would be a great benefit to Clang and to developers who want to extend Clang with new pragmas.
>
> 	- Doug