[cfe-dev] No clang::Action no party
Simone Pellegrini
spellegrini at dps.uibk.ac.at
Wed Sep 22 13:41:53 PDT 2010
On 09/21/2010 03:50 PM, Douglas Gregor wrote:
> On Sep 21, 2010, at 6:16 AM, Simone Pellegrini wrote:
>> I actually do not agree with this statement. Clang actually does a very poor job (no offense) when it comes to pragmas. I actually spent some time in designing a framework to handle pragmas on top of clang and there are a couple of core changes which I need to do and for that I had to patch clang.
>>
>> For example in OpenMP the standard allows C expressions to be written as part of the pragmas. For example someone can write:
>>
>> #pragma omp parallel num_threads(3*2+1)
>>
>> For anyone interested in implementing the full OpenMP standard, this would require re-implementing a parser for C expressions (among other things), which seems a duplication of work since the Clang parser already does it pretty well. To do that I had to make my pragma handler class a friend of clang::Parser; this way I can directly use the private ParseExpression() method.
> OpenMP is a somewhat extreme example, because it *does* involve full expressions and many deep tie-ins with the AST, ultimately affecting IR generation as well. I think it would be great if the pragma interface could be extended to handle OpenMP, because that means that many other pragma handlers would also be possible.
To make pragma processing more useful in clang, there are two main
aspects that need to be improved.
The first is to give users a way to specify new pragmas without
reinventing the wheel every time, and to let them call the Clang parser
directly to parse complex expressions, without having to mess around
with low-level implementation details of the lexer/parser. To address
this we developed a mechanism that allows new pragmas to be defined the
way Boost::Spirit does it. We took some of its concepts, but we wrote
from scratch a parser generator that works very closely with clang's
lexer (and parser).
The main idea is the following. Suppose you want to define a new pragma,
let's say:
#pragma mypragma ((awesomeness = (yes | no)) | (digit (',' digit)*))
so that you can write things like this in your code:
#pragma mypragma awesomeness = yes
or
#pragma mypragma 2,3,4
What we do is let the user specify the grammar declaratively, building
up a parse tree simply by combining expressions:
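// The parentheses in the grammar above are grouping only, so the matcher
// does not expect '(' or ')' tokens in the source.
auto matcher = (
        (kwd("awesomeness") >> equal >> (kwd("yes") | kwd("no")))
    |
        (numeric_constant >> *(comma >> numeric_constant))
    ) >> eom;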
This object is passed to a pragma handler. When the parser calls the
handler, the Preprocessor is passed to the object, which tries to match
the rule by consuming tokens. Things can get more complicated than
this, since there are several other operators (!, +, ~). If the pragma
is matched, the object creates a map for you in which you can find all
the parsed information in the form of strings or Clang AST nodes.
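The combinator idea described above can be sketched in a few lines of
standalone C++. This is only an illustration under loud assumptions: it
works on a plain vector of token strings instead of clang's
Preprocessor, the map holds strings only, and every name in it (Matcher,
tag, match_mypragma, the "digits" key) is hypothetical and not part of
our framework or of Clang.

```cpp
#include <cstddef>
#include <functional>
#include <map>
#include <string>
#include <vector>

using Tokens = std::vector<std::string>;
using MatcherMap = std::map<std::string, std::vector<std::string>>;

// A matcher consumes tokens starting at pos. On failure it must leave
// pos and the map exactly as it found them.
struct Matcher {
    std::function<bool(const Tokens&, std::size_t&, MatcherMap&)> fn;
    bool operator()(const Tokens& t, std::size_t& pos, MatcherMap& m) const {
        return fn(t, pos, m);
    }
};

// Matches one token spelled exactly like `word`.
Matcher kwd(std::string word) {
    return {[word](const Tokens& t, std::size_t& pos, MatcherMap&) {
        if (pos < t.size() && t[pos] == word) { ++pos; return true; }
        return false;
    }};
}

// Hypothetical helper: records the tokens consumed by `inner` under `name`.
Matcher tag(std::string name, Matcher inner) {
    return {[name, inner](const Tokens& t, std::size_t& pos, MatcherMap& m) {
        std::size_t start = pos;
        if (!inner(t, pos, m)) return false;
        for (std::size_t i = start; i < pos; ++i) m[name].push_back(t[i]);
        return true;
    }};
}

// Sequence: a then b; restores position and map if either part fails.
Matcher operator>>(Matcher a, Matcher b) {
    return {[a, b](const Tokens& t, std::size_t& pos, MatcherMap& m) {
        std::size_t savedPos = pos;
        MatcherMap savedMap = m;  // copying is wasteful but keeps the sketch short
        if (a(t, pos, m) && b(t, pos, m)) return true;
        pos = savedPos;
        m = savedMap;
        return false;
    }};
}

// Alternative: try a, and only if it fails try b.
Matcher operator|(Matcher a, Matcher b) {
    return {[a, b](const Tokens& t, std::size_t& pos, MatcherMap& m) {
        return a(t, pos, m) || b(t, pos, m);
    }};
}

// Repetition: zero or more occurrences of a; never fails.
Matcher operator*(Matcher a) {
    return {[a](const Tokens& t, std::size_t& pos, MatcherMap& m) {
        for (;;) {
            std::size_t before = pos;
            if (!a(t, pos, m) || pos == before) break;  // stop on no progress
        }
        return true;
    }};
}

const Matcher equal = kwd("=");
const Matcher comma = kwd(",");
// Matches a token made only of decimal digits.
const Matcher numeric_constant{[](const Tokens& t, std::size_t& pos, MatcherMap&) {
    if (pos >= t.size() || t[pos].empty()) return false;
    for (char c : t[pos]) if (c < '0' || c > '9') return false;
    ++pos;
    return true;
}};
// End of the pragma: all tokens have been consumed.
const Matcher eom{[](const Tokens& t, std::size_t& pos, MatcherMap&) {
    return pos == t.size();
}};

// The example grammar for "#pragma mypragma".
const Matcher mypragma =
    ((kwd("awesomeness") >> equal >> tag("awesomeness", kwd("yes") | kwd("no")))
     | (tag("digits", numeric_constant)
            >> *(comma >> tag("digits", numeric_constant))))
    >> eom;

// Runs the grammar over the tokens that follow "#pragma mypragma".
bool match_mypragma(const Tokens& tokens, MatcherMap& out) {
    std::size_t pos = 0;
    return mypragma(tokens, pos, out);
}
```

Calling match_mypragma with the tokens of "awesomeness = yes" fills the
"awesomeness" entry of the map; with the tokens of "2,3,4" it fills the
"digits" entry instead. A failed match leaves the map untouched.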
The second part of the problem is the association of pragmas with AST
nodes (statements or declarations). We solved it by having the pragma
handler call a method we added to the (old) Action interface:
template <class PragmaTy>
void ActOnPragma(SourceLocation start, SourceLocation end, MatcherMap mmap);
When this method is called, Sema creates an object of type PragmaTy and
stores it internally in a list of pending pragmas, i.e. pragmas that
have not yet found their correct placement. Here a tricky issue arises
from the way Sema creates the AST nodes: it is not always true that the
next statement created by Sema is the one that has to be attached to the
most recent pending pragma. For example, in the following case:
#pragma omp parallel
{
int a = 0;
}
Sema will parse the pragma, then create a DeclStmt for int a..., and
only at the end build the CompoundStmt, which is the one we want to
attach the pragma to.
To solve the problem we overloaded a couple of methods in Sema (like
ActOnCompoundStmt or ActOnForStmt...). What we basically do is filter
the list of pending pragmas for those that lie within the range of the
statement, and then match those pragmas against the statements inside
that range. It is fairly simple.
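The pending-list idea can be illustrated with a toy model. This is a
sketch under stated assumptions, not the actual patch: source locations
are plain unsigned offsets instead of clang::SourceLocation, Stmt is a
made-up struct, and PendingPragmas is a hypothetical class whose method
names merely echo the ones mentioned above. The matching rule assumed
here is "attach the pragma to the first child statement that starts
after the pragma ends".

```cpp
#include <algorithm>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

using Loc = unsigned;  // stand-in for clang::SourceLocation (an offset)

// Toy AST node: a kind name, a source range and child statements.
struct Stmt {
    std::string kind;
    Loc begin, end;
    std::vector<Stmt> children;
};

// Base class for pragma objects; target is set once the pragma is placed.
struct Pragma {
    Loc begin, end;
    const Stmt* target = nullptr;
    Pragma(Loc b, Loc e) : begin(b), end(e) {}
    virtual ~Pragma() = default;
};
struct OmpParallel : Pragma { using Pragma::Pragma; };

// Hypothetical stand-in for the bookkeeping done inside Sema.
class PendingPragmas {
public:
    // Called by the pragma handler: create the object and park it.
    template <class PragmaTy>
    void ActOnPragma(Loc start, Loc end) {
        pending_.push_back(std::make_unique<PragmaTy>(start, end));
    }

    // Called once a compound statement has been fully built.
    void ActOnCompoundStmt(const Stmt& compound) {
        for (auto it = pending_.begin(); it != pending_.end();) {
            const Pragma& p = **it;
            bool inRange = p.begin >= compound.begin && p.end <= compound.end;
            if (!inRange) { ++it; continue; }  // belongs to an outer statement
            const Loc after = p.end;
            auto child = std::find_if(
                compound.children.begin(), compound.children.end(),
                [after](const Stmt& s) { return s.begin >= after; });
            if (child == compound.children.end()) { ++it; continue; }
            (*it)->target = &*child;           // pragma found its statement
            attached_.push_back(std::move(*it));
            it = pending_.erase(it);
        }
    }

    std::size_t pendingCount() const { return pending_.size(); }
    const std::vector<std::unique_ptr<Pragma>>& attached() const {
        return attached_;
    }

private:
    std::vector<std::unique_ptr<Pragma>> pending_;
    std::vector<std::unique_ptr<Pragma>> attached_;
};
```

In the "#pragma omp parallel { int a = 0; }" case, the inner CompoundStmt
is built first, but the pragma lies outside its range and stays pending.
Only when the enclosing compound is built does the pragma fall inside
the range, and it is then attached to the inner CompoundStmt.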
The only tricky part is dealing with situations like the following:
{
int a;
#pragma omp barrier
}
where we modify the structure of the CompoundStmt by adding a NullStmt
so that we can match the pragma with it:
{
int a;
#pragma omp barrier
;
}
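The NullStmt trick can be sketched on a toy list of child statements.
Again this is only an illustration with assumed names: Loc is a plain
offset, Stmt and Pragma are made-up structs, and attachOrInsertNull is a
hypothetical helper. It covers only the case shown above, where no
statement follows the pragma inside the block.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

using Loc = unsigned;  // stand-in for clang::SourceLocation (an offset)

struct Stmt { std::string kind; Loc begin, end; };
struct Pragma { std::string name; Loc begin, end; };

// Returns the index of the child the pragma is matched with. If no child
// starts after the pragma, an empty NullStmt is appended right after the
// pragma's location so that there is something to attach to.
std::size_t attachOrInsertNull(std::vector<Stmt>& children, const Pragma& p) {
    auto it = std::find_if(children.begin(), children.end(),
                           [&p](const Stmt& s) { return s.begin >= p.end; });
    if (it == children.end()) {
        it = children.insert(it, Stmt{"NullStmt", p.end, p.end});
    }
    return static_cast<std::size_t>(it - children.begin());
}
```

For a block holding only "int a;" followed by "#pragma omp barrier", the
helper appends a NullStmt and returns its index, which mirrors the
rewritten block with the extra ";".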
The matching algorithm is actually not that complex; the only question
is whether it is computationally too expensive to satisfy clang's
requirements. I think this was the best solution I could come up with
that is efficient and keeps the impact on the Clang code base to a
minimum (it only requires making 1 class a friend of the parser).
cheers, Simone P.
>> So, in short, custom pragma handlers need low-level access to the Parser! (or some methods in the Parser should be made public).
>>
>> Secondly, when the pragma handler is called the attached statement has not yet been created by the parser, so the association cannot be done by the handler and someone else has to take care of it (that's the reason why the Action interface was useful).
>> Either someone keeps a list of pragmas and does the matching at the end in the ASTConsumer, or, for a more efficient solution which minimizes the number of checks, Sema should take care of it.
> I'd love to see a general mechanism for this in Sema; most of the time, people want to attach pragmas to statements/expressions/declarations, and having a general way to do that would be great.
>
>> Anyway, if there is an interest in making pragma handling more flexible and powerful in clang I could contribute with some ideas and code (of course).
> I think a better pragma handling mechanism, which makes it easier to tie in with the parser, would be a great benefit to Clang and to developers who want to extend Clang with new pragmas.
>
> - Doug