[cfe-commits] [PATCH] Expressions have lvalues and rvalues

Wed Oct 15 00:27:38 PDT 2008

On Oct 12, 2008, at 2:12 AM, Zhongxing Xu wrote:
>
>
> I don't see these are redundant. Values are raw bits interpreted  
> within some context. We make a fundamental distinction between two  
> types of values: value that represents some address and value that  
> not. So I prefer we have two kinds of ConcreteInt: lval::ConcreteInt  
> and nonlval::ConcreteInt. And they are seldom same.  
> lval::ConcreteInt usually is very large.

You're right.  I originally wrote this code this way for this very  
reason, and in the course of this discussion I confused myself.  I was  
just trying to think of whether or not an integer value was really an  
lvalue.  Yes it can represent a location, but is it really an lvalue.   
The current way things are "typed" with lval and nonlval, however,  
makes the analyzer readily understand code like the following:

int foo() {
   return *&*((int*) 0xa0000);
}

In this example, the pointer cast causes the integer literal to be  
treated as an lval::ConcreteInt.  This isn't really an lvalue though;  
it's really an rvalue that represents the location of something in  
memory (at address 0xa0000).  The '*' operator, however, expects an  
rvalue (per the wording in the C++ standard).  So while nonlval and  
lval do reason about locations, they aren't really lvalues or rvalues  
at all; just something that approximates them.  Hence the motivation  
to change their names.
>
>
> One thing about this is that it makes the transfer function  
> structure basically fall out from what's in the C/C++ standards.   
> For example, the '*' operator essentially has the following type  
> signature:
>
> * : rval -> lval
>
> Similarly, references to variables and the '&' operator could be  
> represented as follows:
>
> variable reference:  (declrefexpr) -> lval
> & : lval -> rval  (with the rval being an rval::MemoryRegion)
>
> In contexts where an lval is used as an rval, we have an implicit  
> conversion (as stated in the C++ standard).  Such implicit  
> conversations would be represented by a transfer function, which  
> cause a new state and ExplodedNode to be created to represent the  
> effect of this conversion.  For example, an implicit conversion from  
> an lval to rval could result in a value load (e.g., EvalLoad, which  
> would have the type signature lval -> rval).
>
> I don't understand your meaning very clearly. For * operator, we  
> just get its operand's rvalue, which is a location value. If we are  
> at the LHS of an assignment, this location value is what we want. If  
> we are at the RHS of an assignment, we do another EvalLoad with this  
> location value. For & operator, we get its operand's lvalue, and  
> this location value is the rvalue of the whole expression.

My pedantic point was that lval:: and nonlval:: classes are not  
lvalues and rvalues.  We could change them to be as such; in this case  
the transfer function of operator '*' would always return an lvalue,  
and then whatever used that lvalue would then perform the implicit  
conversion.  We invert this in the static analyzer right now; the  
transfer function logic for '*' does the implicit lvalue->rvalue  
conversion based on context (i.e., if the asLVal flag is not set).   
The current approach makes sense since we want to associate with a  
given expression the value the expression evaluates to; this includes  
the result of implicit conversions.

At the end of the day, however, lval:: and nonlval:: classes are not  
lvalues/rvalues respectively (in the C++ parlance).  My question was  
whether or not we should change the use of these classes so that they  
EXACTLY map to lvalues/rvalues.  After looking at the code, reviewing  
the patch, thinking about the overall design of GRExprEngine, and all  
the comments made here, I'm not in favor of this idea anymore.  I  
think it is more useful to reason about locations versus non-locations  
than lvalues versus rvalues.

>
> I am not letting the distinction between rvalues and lvalues happen  
> in the Store. They do happen in GRExprEngine in my patch. Notice  
> that I only added a getLValue() to StoreManager. The intention is to  
> let the Store to return the lvalue of an expression.

To me that is the point of MemRegion.  Shouldn't MemRegion be able to  
represent the location in all cases?  In my mind, an implementation of  
Store should not depend on RValues.h.  If a Store just reasons about  
regions, it doesn't need to think about lval objects, or is this not  
the case?

I know I'm the one who came up with lval::FieldOffset,  
lval::ArrayOffset, etc.  I'm questioning this decision.  It just seems  
to be hard-coding a particular Store's conception of memory into the  
lval classes.  These concepts can easily be represented (far more  
elegantly) as regions.  Once you are just dealing with regions,  
StoreManager::getLValue() only needs to return a region type.

> Because for different stores, we may have different representation  
> of location values for the same expression. For example, in  
> BasicStore, we may have a different location value for the lvalue of  
> expression a[3] than in RegionStore

To me Stores can return different regions for a[3].  However, having  
the Store return an lval:: object has one distinct advantage that I  
see: we don't have to define "Undefined" or "Unknown" for regions.

> . So I let StoreManager to determine the concrete representation of  
> an expression's lvalue. Whether we want the lvalue or rvalue of an  
> expression is decided by the GRExprEngine according to the context,  
> e.g., the position of the expression in the parent expression.

Right.

> Yeah, we should limit this rvalue/lvalue distinction within  
> GRExprEngine. And the intention of my patch is this! Let me  
> summarize my patch in the following:
>
> In GRExprEngine, we know when we want the rvalue of an expression  
> and when we want the lvalue of an expression. If we want the lvalue  
> of an expression, we first evaluate all of its sub-exprs, then we  
> ask the Store what the concrete form of its lvalue. The Store return  
> us a location value.

This all makes much more sense to me now.

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-commits/attachments/20081015/4c96bf9c/attachment.html>