[cfe-dev] fix for Clang PR 8419

Zhanyong Wan (λx.x x) wan at google.com
Sat Nov 20 00:51:23 PST 2010


Hi Ted,

I've implemented the handling of unary operators as you and Zhongxing
suggested, and uploaded a new patch to
http://codereview.appspot.com/2920041/ -- all tests are passing.

Note:

1. I tried to move the test cases to misc-ps.m as you suggested, but
that resulted in errors like 'class' is not a valid type name.  I
don't know Objective-C and Objective-C++, so I have no idea how to fix
that.  Therefore I'm leaving the test cases in
misc-ps-region-store.cpp.

2. I also made some cosmetic changes to VisitBinaryOperator() to make
it conform to the LLVM coding standards ("no 'else' after a
'return'").  I can take them out or move them to a new patch if you
prefer.

Would you please review this?  Thanks!

On Fri, Nov 19, 2010 at 11:35 PM, Zhongxing Xu <xuzhongxing at gmail.com> wrote:
> While fix the lvalue-rvalue conversion issue in the AST seems to bring
> benefits in the long run, I suggest we fix these cases one by one in the CFG
> construction, until we see enough cases that warrants an overhaul.
>
> In this case, we can only mark the block-level expr in the unary operator as
> StatementAsLValue. This is a simple and local change.
>
> 2010/11/20 Zhanyong Wan (λx.x x) <wan at google.com>
>>
>> Thanks for thinking through this and sharing your analysis, Ted!  I'm
>> all for fixing this in a principled way.  However, the crash I
>> reported is blocking some experiments I'm doing.  Therefore I'd really
>> like to put in a fix for the symptom first to unblock me, and then we
>> can take the time necessary to fix the whole thing the right way.
>> Would you oppose to that?  Thanks,
>>
>> On Fri, Nov 19, 2010 at 11:07 PM, Ted Kremenek <kremenek at apple.com> wrote:
>> > On Nov 19, 2010, at 5:24 PM, Zhanyong Wan (λx.x x) wrote:
>> >
>> >> I'll document this in my new patch.  Also, how would you suggest to
>> >> add a test to catch violation of #2?
>> >
>> > Stepping back, I think we may need some extra design here, either in the
>> > ASTs or in the analyzer, to model lvalue-to-rvalue conversions correctly for
>> > C++.
>> >
>> > With C, lvalue-to-rvalue conversions happen in specific cases, e.g.:
>> >
>> > - Loading from a variable
>> >
>> > - Dereferencing a pointer
>> >
>> > Semantically such conversions result in a "load" of a value from a
>> > memory address.  This is what 'EvalLoad()' does in GRExprEngine.  Because
>> > the places where such conversions occur in C are fairly limited, up to now
>> > we have been able to model lvalue-to-rvalue conversions without much effort.
>> >
>> > With C++ (especially with the use of references) there is a lot more
>> > cases to reason about.  It raises the question of whether we should try to
>> > recreate the logic in codegen in GRExprEngine to infer all of these cases,
>> > or to try and represent them explicitly in the AST.
>> >
>> > There are advantages to representing such conversions explicitly in the
>> > AST.  For one, the distinction of an lvalue versus an rvalue in the CFG
>> > probably evaporates.  Any place where we see an lvalue-to-rvalue conversion
>> > simply means the subexpression was a pointer (or reference), and the
>> > conversion represents a load.  Moreover, different analyses (not just
>> > GRExprEngine) will be able to more adeptly reason about the semantics of
>> > analyzed code without having the burden to re-infer where all the
>> > lvalue-to-rvalue conversions take place.
>> >
>> > The other advantage is that it takes care of discrepanancies between C
>> > and C++ like the following:
>> >
>> > $ cat t.c
>> > void foo() {
>> >  int a = 0;
>> >  ++(++a);
>> > }
>> >
>> > $ clang -fsyntax-only t.c
>> > t.c:3:3: error: expression is not assignable
>> >  ++(++a);
>> >  ^ ~~~~~
>> > 1 error generated.
>> >
>> > $ clang -fsyntax-only -x c++ t.c
>> > $
>> >
>> > Here C and C++ differ on the semantics of '++'.  When the file is
>> > compiled as C code, we get an error because the expression '++a' evaluates
>> > to an rvalue.  When we compile the code as C++, however, the expression
>> > evaluate to an lvalue.  This difference isn't represented in the AST at all.
>> >
>> > NOTE: This is also another reason why GRExprEngine might crash in some
>> > cases with '++' on C++ code, because it assumes the C semantics.
>> >
>> > To explain the difference some more, consider:
>> >
>> >  int a = 0;
>> >  int b = ++a;
>> >
>> > With C, the '++a' evaluates to an rvalue immediately, which is assigned
>> > to 'b'.  In C++, the expression '++a' first evaluates to an lvalue, and then
>> > there is an implicit lvalue-to-rvalue conversion that loads the new value.
>> >  Of course the compiler doesn't need to generate more IR to represent this
>> > at the execution level, but this is what is conceptually happening in the
>> > language.
>> >
>> > So to summarize, these discrepancies of how lvalues and rvalues are
>> > reasoned about in C versus C++ is really something we need to solve
>> > systematically in the analyzer (and/or ASTs) with a cohesive solution.
>> >  That's why I think this individual case that you reported is only
>> > symptomatic of a bigger issue that needs a well-designed solution.
>> >
>> > Cheers,
>> > Ted
>>
>>
>>
>> --
>> Zhanyong
>>
>> _______________________________________________
>> cfe-dev mailing list
>> cfe-dev at cs.uiuc.edu
>> http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
>
>



-- 
Zhanyong




More information about the cfe-dev mailing list