Implicit casts disappeared from syntactic init list expressions in C++

Richard Smith via cfe-commits cfe-commits at lists.llvm.org
Thu Oct 8 14:36:01 PDT 2015


There are some other open problems in this area:

- RecursiveASTVisitor on nested InitListExprs is currently worst-case
exponential time because it walks the syntactic and semantic forms
separately
- Tools such as "find all references to this function" need the semantic
form of every initializer, whether or not that initializer is actually used
for the initialization (it might be overridden through the use of a
designator)
- ...
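
As a rough illustration of the first point, here is a toy model (not
clang code; ToyInitList, makeNested, visitBothForms and visitOneForm are
hypothetical names invented for this sketch). It assumes that each nested
list is reachable once through the syntactic form and once through the
semantic form of its parent, which is a simplification of what the real
visitor does:

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <utility>

// Toy stand-in for a nested initializer list; NOT clang's InitListExpr.
struct ToyInitList {
    // Null for the innermost list. In this toy, the syntactic and semantic
    // forms of a list both lead to the same single child.
    std::unique_ptr<ToyInitList> child;
};

// Builds a chain of `depth` nested lists, e.g. depth 3 models {{{ }}}.
std::unique_ptr<ToyInitList> makeNested(int depth) {
    assert(depth >= 1);
    std::unique_ptr<ToyInitList> node = std::make_unique<ToyInitList>();
    for (int i = 1; i < depth; ++i) {
        std::unique_ptr<ToyInitList> outer = std::make_unique<ToyInitList>();
        outer->child = std::move(node);
        node = std::move(outer);
    }
    return node;
}

// Walks both forms at every level: each list recurses into its child twice,
// so the visit count satisfies V(d) = 1 + 2 * V(d - 1).
std::uint64_t visitBothForms(const ToyInitList& list) {
    std::uint64_t visits = 1;
    if (list.child) {
        visits += visitBothForms(*list.child);  // reached via the syntactic form
        visits += visitBothForms(*list.child);  // reached via the semantic form
    }
    return visits;
}

// Walks a single form: each list recurses into its child once.
std::uint64_t visitOneForm(const ToyInitList& list) {
    std::uint64_t visits = 1;
    if (list.child) {
        visits += visitOneForm(*list.child);
    }
    return visits;
}
```

In this model the two-form walk grows as 2^depth - 1 visits while the
single-form walk grows linearly with depth.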

Having thought about this for a while, I think the right answer is this:

The only difference between the syntactic and semantic forms of an
InitListExpr should be designated initializers and brace elision. In all
other respects, the syntactic and semantic forms should be identical -- in
particular, both should contain the results of performing the relevant
initialization sequences on the list elements, and we should extend the
syntactic form to include the implicit initializations for trailing
elements.

With that in hand, we should make RecursiveASTVisitor visit /only/ the
syntactic form. The semantic ("simplified") form should only be used in
places where we want to know the semantic effect of the initialization
(after applying the designated initialization overriding rules and
inserting the elided braces), and is always derivable in a fairly
straightforward fashion from the syntactic form.
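
To make "derivable from the syntactic form" concrete, here is a minimal
sketch for the simplest case, an array of int initialized with C99-style
array designators. It is a toy model, not clang code: SyntacticElem and
deriveSemantic are hypothetical names, the values are assumed to have
already been converted to the element type, and brace elision is not
modeled at all.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// One element of a toy "syntactic" initializer list for an int array.
// Hypothetical type; not part of clang.
struct SyntacticElem {
    bool hasDesignator;   // true for `[index] = value`, false for plain `value`
    std::size_t index;    // only meaningful when hasDesignator is true
    int value;            // initializer, assumed already converted to int
};

// Derives the "semantic" contents of an array of `size` ints:
//  - a designator repositions the cursor,
//  - a later initializer overrides an earlier one for the same element,
//  - elements that are never mentioned are value-initialized to 0.
std::vector<int> deriveSemantic(const std::vector<SyntacticElem>& syntactic,
                                std::size_t size) {
    std::vector<int> semantic(size, 0);
    std::size_t next = 0;  // element the next undesignated initializer targets
    for (const SyntacticElem& elem : syntactic) {
        if (elem.hasDesignator) {
            next = elem.index;
        }
        assert(next < size && "initializer is out of bounds");
        semantic[next] = elem.value;
        ++next;
    }
    return semantic;
}
```

For example, the list { 1, [3] = 4, 5, [0] = 9 } for an array of six ints
would be modeled as {{false, 0, 1}, {true, 3, 4}, {false, 0, 5},
{true, 0, 9}}; the derived contents are 9, 0, 0, 4, 5, 0, with the
leading 1 overridden by the final designator.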

I think that's largely what you were suggesting below. Do you agree?

On Wed, Oct 7, 2015 at 11:54 PM, Abramo Bagnara <abramo.bagnara at bugseng.com>
wrote:

> Ping^2
>
> On 12/09/2015 09:40, Abramo Bagnara wrote:
> > Ping...
> >
> > On 29/08/2015 10:01, Abramo Bagnara wrote:
> >> On 28/08/2015 23:27, Richard Smith wrote:
> >>> On Tue, Aug 25, 2015 at 10:27 AM, Abramo Bagnara
> >>> <abramo.bagnara at bugseng.com> wrote:
> >>>
> >>>     Comparing the result of InitListExpr::getSyntacticForm between
> >>>     r224986 and r245836, I've discovered that the integer-to-char
> >>>     implicit cast for the integer literal 3 is no longer added to the
> >>>     AST for C++ (while it is present in C).
> >>>
> >>>     This is the source used to test:
> >>>
> >>>     char v[10] = { 3 };
> >>>
> >>>     Taking into account that:
> >>>
> >>>     - implicit casts (and other conversions, constructor calls, etc.)
> >>>     are also very important for anyone who needs to visit the
> >>>     syntactic form (obviously in *both* C and C++)
> >>>
> >>>     - generating them for the syntactic form improves efficiency and
> >>>     node sharing when designated range extensions are used (as the
> >>>     conversion chain doesn't need to be replicated for each entry)
> >>>
> >>>     I think it is a regression. Am I missing something?
> >>>
> >>>
> >>> Why do you expect this semantic information to appear in the syntactic
> >>> form of the initializer?
> >>
> >> Compare:
> >>
> >> int x = 2.0;
> >>
> >> with
> >>
> >> struct s {
> >>   int x;
> >> } v = { .x = 2.0 };
> >>
> >> For the first declaration I get non-syntactic nodes (namely
> >> ImplicitCastExpr) along with the syntactic nodes, while for the second
> >> I don't (in C++). This is an obstacle to writing semi-syntactic
> >> checkers that aim to find, e.g., an implicit cast from double to int in
> >> its syntactic context.
> >> Note that although we could visit the semantic form instead, we would
> >> lose the designators (which are not present in the semantic form).
> >>
> >> To summarize, the reasons why I would expect this are:
> >>
> >> 1) this is how it has always worked for C (and fortunately it still
> >> works this way)
> >>
> >>
> >> 2) this is how it has always worked for C++ (although only partially,
> >> as there were some bugs). In the past we have had patches to fix the
> >> areas where this invariant was not respected (see commit 3146766,
> >> http://lists.llvm.org/pipermail/cfe-commits/Week-of-Mon-20111212/050339.html
> >> as an example).
> >>
> >> This behavior changed rather recently (if you think it is useful, I
> >> can find the commit that removed the implicit casts from the syntactic
> >> form in C++).
> >>
> >> Before such commit(s), the only bug I was aware of where the AST was
> >> missing the conversion chain was the following:
> >>
> >> struct R2 {
> >>   R2(int) {
> >>   }
> >> };
> >> R2 v2[] = { 1.0 };
> >>
> >>
> >> 3) this way it would be consistent with other areas of the AST where
> >> we have non-syntactic nodes along with syntactic ones
> >>
> >>
> >> 4) it would allow more nodes to be shared in the semantic form (and
> >> avoid rebuilding the same conversion chain many times).
> >>
> >> Looking at the following terminal transcript, you can see that the
> >> ImplicitCastExpr is shared for C but not for C++. I've initialized only
> >> two entries, but it might be 1000 or 10000000.
> >>
> >> $ cat p.c
> >> int x[1000000] = { [0 ... 1] = 3.0 };
> >> $ clang-3.8 -cc1 -ast-dump -x c p.c
> >> TranslationUnitDecl 0x272de40 <<invalid sloc>> <invalid sloc>
> >> |-TypedefDecl 0x272e338 <<invalid sloc>> <invalid sloc> implicit
> >> __int128_t '__int128'
> >> |-TypedefDecl 0x272e398 <<invalid sloc>> <invalid sloc> implicit
> >> __uint128_t 'unsigned __int128'
> >> |-TypedefDecl 0x272e648 <<invalid sloc>> <invalid sloc> implicit
> >> __builtin_va_list 'struct __va_list_tag [1]'
> >> `-VarDecl 0x272e718 <p.c:1:1, col:36> col:5 x 'int [1000000]' cinit
> >>   `-InitListExpr 0x272e8b0 <col:18, col:36> 'int [1000000]'
> >>     |-array filler
> >>     | `-ImplicitValueInitExpr 0x272e918 <<invalid sloc>> 'int'
> >>     |-ImplicitCastExpr 0x272e900 <col:32> 'int' <FloatingToIntegral>
> >>     | `-FloatingLiteral 0x272e7f8 <col:32> 'double' 3.000000e+00
> >>     `-ImplicitCastExpr 0x272e900 <col:32> 'int' <FloatingToIntegral>
> >>       `-FloatingLiteral 0x272e7f8 <col:32> 'double' 3.000000e+00
> >> $ clang-3.8 -cc1 -ast-dump -x c++ p.c
> >> TranslationUnitDecl 0x3300e60 <<invalid sloc>> <invalid sloc>
> >> |-TypedefDecl 0x3301398 <<invalid sloc>> <invalid sloc> implicit
> >> __int128_t '__int128'
> >> |-TypedefDecl 0x33013f8 <<invalid sloc>> <invalid sloc> implicit
> >> __uint128_t 'unsigned __int128'
> >> |-TypedefDecl 0x3301718 <<invalid sloc>> <invalid sloc> implicit
> >> __builtin_va_list 'struct __va_list_tag [1]'
> >> `-VarDecl 0x33017e8 <p.c:1:1, col:36> col:5 x 'int [1000000]' cinit
> >>   `-InitListExpr 0x3301980 <col:18, col:36> 'int [1000000]'
> >>     |-array filler
> >>     | `-ImplicitValueInitExpr 0x3301a00 <<invalid sloc>> 'int'
> >>     |-ImplicitCastExpr 0x33019d0 <col:32> 'int' <FloatingToIntegral>
> >>     | `-FloatingLiteral 0x33018c8 <col:32> 'double' 3.000000e+00
> >>     `-ImplicitCastExpr 0x33019e8 <col:32> 'int' <FloatingToIntegral>
> >>       `-FloatingLiteral 0x33018c8 <col:32> 'double' 3.000000e+00
> >>
> >>
> >> 5) if we visited the semantic form in a checker searching for implicit
> >> casts from floating point to int in the C source above, we would find
> >> *two* of them, while syntactically there is only one. This means that
> >> we would be forced to resort to dirty tricks to avoid double reporting.
> >>
> >
> >
>
>
> --
> Abramo Bagnara
>
> BUGSENG srl - http://bugseng.com
> mailto:abramo.bagnara at bugseng.com
>