[cfe-dev] Question about ParseAST()

Sun Aug 19 15:58:47 PDT 2012

> From the documentation, I guess this is one case where clang follows
> the history of gcc: the compiler doesn't have to fold like this
> (emitting an error would be enough), but clang folds it because gcc is
> now the de facto standard and most people expect a compiler to compile
> this kind of non-standard code.

As John also mentioned, evaluating the array bound is actually required
by the standard, and is not a GCC extension. If you look again at the
3rd paragraph on
<http://clang.llvm.org/docs/InternalsManual.html#Constants>, the
non-standard behavior is not that GCC evaluates these expressions. It
is that GCC uses its own definition of "integer constant expression",
which accepts any expression that it *can* fold to an integer constant,
even if that expression isn't really an "integer constant expression"
as defined in the Standard.

> I wonder if this case contradicts your explanation - "but it doesn't
> really "fold" them".

Sorry for not being clear. Clang *does* compute the numerical value of
such expressions. However, unlike GCC, Clang doesn't "forget about"
the original expression. To expand a bit on John's rather terse
description, the types of

char a[4 + 4];
char b[8];

are both a ConstantArrayType
<http://clang.llvm.org/doxygen/classclang_1_1ConstantArrayType.html>
whose getSize() method returns 8. However, Clang maintains enough
information in its AST to also give you an ArrayTypeLoc
<http://clang.llvm.org/doxygen/classclang_1_1ArrayTypeLoc.html>, which
has a getSizeExpr() method which will give you the original
expression.
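
The first half of that claim can be checked at the language level,
without touching the Clang libraries. The following is a minimal C++11
sketch (the original declarations are plain C; the static_assert checks
are my addition). It only shows that the two declarations have the same
type, and it says nothing about how Clang's AST classes represent them.

```cpp
#include <type_traits>

char a[4 + 4];  // array bound written as an expression
char b[8];      // array bound written as a literal

// Both declarations have the type char[8]; the way the bound was
// spelled in the source is not part of the type.
static_assert(std::is_same<decltype(a), decltype(b)>::value,
              "char[4 + 4] and char[8] are the same type");

// The bound has been evaluated: both arrays occupy 8 bytes.
static_assert(sizeof(a) == 8 && sizeof(b) == 8,
              "both arrays hold 8 chars");
```

If this compiles, the compiler agrees that `a` and `b` share one type,
which matches getSize() returning 8 for both.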

One example of where you could use this is that if you have two
DeclaratorDecl*'s A and B corresponding to the declarations of the
char arrays `a` and `b` above, then

    A->getTypeSourceInfo()->getType() == B->getTypeSourceInfo()->getType()

would be true, but

    cast<ArrayTypeLoc>(A->getTypeSourceInfo()->getTypeLoc()).getSizeExpr()

would _not_ be the same Expr* as

    cast<ArrayTypeLoc>(B->getTypeSourceInfo()->getTypeLoc()).getSizeExpr()
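
To make the distinction concrete without depending on the Clang
libraries, here is a toy model in C++11. Every name in it (ToyExpr,
ToyArrayType, ToyArrayTypeLoc, ToyContext, declareArray) is a
hypothetical stand-in, not one of Clang's classes, and the
"one type object per size" scheme is an assumption made for
illustration only. It is a sketch of the idea that the type is shared
while the size expression stays per-declaration.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for an expression node: every parsed
// expression is its own object, even if two evaluate to the same value.
struct ToyExpr {
    std::string Spelling;  // source text as written, e.g. "4 + 4"
    std::uint64_t Value;   // result of evaluating the expression
};

// Hypothetical stand-in for a uniqued array type: one object per size.
struct ToyArrayType {
    std::uint64_t Size;
};

// Hypothetical stand-in for per-declaration source information:
// the shared type plus the size expression as it was written.
struct ToyArrayTypeLoc {
    const ToyArrayType *Type;
    const ToyExpr *SizeExpr;
};

class ToyContext {
    std::map<std::uint64_t, std::unique_ptr<ToyArrayType>> Types;
    std::vector<std::unique_ptr<ToyExpr>> Exprs;

public:
    // Records a new size expression and returns it together with the
    // (possibly already existing) type object for that size.
    ToyArrayTypeLoc declareArray(std::string Spelling, std::uint64_t Value) {
        Exprs.push_back(std::unique_ptr<ToyExpr>(
            new ToyExpr{std::move(Spelling), Value}));
        std::unique_ptr<ToyArrayType> &Slot = Types[Value];
        if (!Slot)
            Slot.reset(new ToyArrayType{Value});
        assert(Slot->Size == Value);
        return ToyArrayTypeLoc{Slot.get(), Exprs.back().get()};
    }
};
```

In this model, calling declareArray("4 + 4", 8) and
declareArray("8", 8) on the same ToyContext yields two results whose
Type pointers are equal and whose SizeExpr pointers differ, mirroring
the comparison above.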

--Sean Silva

On Sun, Aug 19, 2012 at 5:42 PM, Journeyer J. Joh
<oosaprogrammer at gmail.com> wrote:
> Hello Sean Silva,
>
>> I believe this is primarily a locality optimization so that codegen
>> operates on the AST while it is still hot in cache; you could just as
>> easily perform all codegen based on the final AST.
>
> I think this is a very valuable explanation. Thank you.
>
>> Clang does not do constant folding; its AST exactly represents the
>> expression. Clang has facilities for evaluating constant expressions
>> when necessary, but it doesn't really "fold" them.
>
> When I tested the declaration below, it seems to be folded in the AST.
>
> char y[2 + sizeof(struct dummy)];
>
> It is placed in the AST as shown below.
>
> "char y[6]"
>
> And there is documentation about constant folding.
>
> http://clang.llvm.org/docs/InternalsManual.html#Constants
>
> From the documentation, I guess this is one case where clang follows
> the history of gcc: the compiler doesn't have to fold like this
> (emitting an error would be enough), but clang folds it because gcc is
> now the de facto standard and most people expect a compiler to compile
> this kind of non-standard code.
>
> I wonder if this case contradicts your explanation - "but it doesn't
> really "fold" them".
>
> Thank you very much for your kind concern!
>
> Have a good day.
>
> Journeyer
>
>
> 2012/8/20 Sean Silva <silvas at purdue.edu>:
>>> In my investigation, 2 and 3 loop for each top-level declaration.
>>
>> I think you're getting confused by a lot of the other stuff that is
>> happening in ParseAST that isn't really all that important. ParseAST
>> is basically 3 steps:
>>
>> 1. prepare for parsing
>> 2. repeatedly call P.ParseTopLevelDecl() until the whole source file
>> has been parsed
>> 3. do other stuff that has to happen after parsing
>>
>>> Especially, I wonder why CodeGenAction works before the complete AST is built.
>>
>> Clang's AST is immutable. Once the top-level decl is finished, it
>> never changes, hence it is safe to pass to codegen at that point.
>>
>> I believe this is primarily a locality optimization so that codegen
>> operates on the AST while it is still hot in cache; you could just as
>> easily perform all codegen based on the final AST.
>>
>>> And constant folding occurs in 3.
>>
>> Clang does not do constant folding; its AST exactly represents the
>> expression. Clang has facilities for evaluating constant expressions
>> when necessary, but it doesn't really "fold" them.
>>
>> -- Sean Silva
>>
>> On Sun, Aug 19, 2012 at 1:42 AM, Journeyer J. Joh
>> <oosaprogrammer at gmail.com> wrote:
>>> Hello list,
>>>
>>> http://www.opencpp.kr/ParseAST.jpg
>>>
>>> I made a diagram about the global function ParseAST() in ParseAST.cpp
>>>
>>> Could someone explain steps 1, 2, 3, and 4 above?
>>>
>>> In my investigation, 2 and 3 loop for each top-level declaration.
>>> And the AST parse tree is completed before number 4 starts.
>>> And constant folding occurs in 3.
>>>
>>> Especially, I wonder why CodeGenAction works before the complete AST is built.
>>>
>>> Thank you very much in advance.
>>>
>>> Journeyer J. Joh
>>>
>>>
>>>
>>>
>>> --
>>> ----------------------------------------
>>> Journeyer J. Joh
>>> o o s a p r o g r a m m e r
>>> a t
>>> g m a i l  d o t  c o m
>>> ----------------------------------------
>>> _______________________________________________
>>> cfe-dev mailing list
>>> cfe-dev at cs.uiuc.edu
>>> http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev