[cfe-commits] [PATCH] Expressions have lvalues and rvalues

Sat Oct 11 20:58:28 PDT 2008

On Oct 11, 2008, at 8:16 PM, Zhongxing Xu wrote:

> On Sun, Oct 12, 2008 at 11:14 AM, Zhongxing Xu  
> <xuzhongxing at gmail.com> wrote:
> So how about calling them Address and NonAddress? This would get rid  
> of the dangerous similarity between LVal and LValue.
>
> Sebastian
>
> What about LocVal and NonLocVal?
>
>
> How about:
>
> RVal => AVal (abstract value)
> LVal => LocVal (location value)
> NonLVal => NonLocVal (non-location value)

Sounds good, other than "AVal" should probably be something like  
"AbstractVal", since we don't use it that much and the "A" doesn't  
really mean anything unless you look at the comments (can't use "Abs"  
because someone might think it means "absolute").  ExprVal is also a  
possibility (see later in this email for more comments).

Getting back to the points both you and Sebastian made, I'm wondering  
if there is any advantage with changing the implementation of RVal/ 
LVal to be something much closer to the way C++ reasons about these  
concepts.  This might in the long run clean up a whole bunch of  
things, especially as we add more support for C++.

I don't have any concrete ideas, but there exists a lot of redundancy  
between the lval and nonlval classes.  For example, lval::ConcreteInt  
and nonlval::ConcreteInt are essentially the same class representing  
the same data.  The same goes for lval::SymbolVal and  
nonlval::SymbolVal.  The distinction between the lval:: and nonlval::  
versions is purely context, which is often redundant since one often  
(always?) knows the current context and how these values should be  
interpreted.  If we modeled lvalues in the sense that Sebastian  
articulated, we would probably only have a handful of classes:  
lval::MemoryRegion, lval::Undefined, lval::Unknown, and that's  
probably it.  We would have an nonlval::MemoryRegion (or rather  
rval::MemoryRegion if we want nonlval to represent rvalues), but  
symbols, integers, etc., would always be rvalues.

One thing about this is that it makes the transfer function structure  
basically fall out from what's in the C/C++ standards.  For example,  
the '*' operator essentially has the following type signature:

* : rval -> lval

Similarly, references to variables and the '&' operator could be  
represented as follows:

variable reference:  (declrefexpr) -> lval
& : lval -> rval  (with the rval being an rval::MemoryRegion)

In contexts where an lval is used as an rval, we have an implicit  
conversion (as stated in the C++ standard).  Such implicit  
conversations would be represented by a transfer function, which cause  
a new state and ExplodedNode to be created to represent the effect of  
this conversion.  For example, an implicit conversion from an lval to  
rval could result in a value load (e.g., EvalLoad, which would have  
the type signature lval -> rval).

Zhongxing: In a roundabout way, I think your patch (which I have now  
looked at) is trying to resolve some of the confusion between rvalues  
and lvalues, but I'm wondering if the distinction between rvalues and  
lvalues really should happen in the Store.  This of course is a  
naturally progression of the current design, but after Sebastian's  
points I'm wondering if that is the right direction.  In my mind, all  
the store should care about is the mapping between regions and  
abstract values (a notion that has nothing to do with LVal and  
NonLval) and the mapping between variable names and the their  
regions.  These are pretty clean interfaces that don't have to get  
tangled with the notion of rvalues and lvalues, which seem more  
related to how GRExprEngine should handle individual expressions.

In a nutshell, I'm wondering if we should do the following:

1) Change "RVal" to "ExprVal" (representing the value of an  
expression).  ExprVal makes it clear that these objects represent the  
evaluation of an expression.

2) Change 'nonlval::" to "rval::", and remove most of the lval classes  
such as lval::SymbolicInt and lval::ConcreteInt.  Add an  
"rval::MemoryRegion" class that mirrors "lval::MemoryRegion".

3) Change GRExprEngine to reason about rvals (rvalues) and lvals  
(lvalues) in the same way the C++ standard does.  This is a big task,  
but I think it is mainly code restructuring.  Loads would be handled  
as transfer functions (i.e., EvalLoad) doing the implicit conversions  
between lvals -> rvals.

I see three advantages of these changes:

1) The structure of GRExpEngine more closely reflects the semantics of  
C/C++ as given by the respective standards.  This will make the  
analyzer much easier to extend potential C++ 0X features, such as  
rvalue references, as the design of how to implement these features in  
GRExprEngine should be more obvious and intuitive.  This should also  
address Sebastian's original concerns (and the concerns of those with  
a similar perspective).  It will also make it much clearer for those  
new to the libAnalysis/GRExprEngine about what is going on.

2) There will probably be a significant code reduction/simplification  
in many cases, as the code duplication for lval::SymbolicVal and  
nonlval::SymbolicVal, lval::ConcreteInt and nonlval::ConcreteInt, gets  
eradicated.

3) The meaning/purpose of LVal/RVal/NonLval (whatever we call them)  
will be made clear: they exist purely for GRExprEngine to correctly  
implement/call the right transfer functions.  The Environment class  
will be a mapping from "Expr* -> ExprVal", and implementations of the  
Store won't  care about ExprVals at all; just regions and the values  
they map to.  I see this as a being a huge conceptually cleanup;  
implementations of Store won't have to reason about rvalues and  
lvalues at all, just locations and the values stored at those  
locations.  The distinction between rvalues and lvalues is something  
that C and C++ language lawyers should care about, but people wishing  
to model the heap in a particular implementation of store shouldn't  
have to care about these things at all.

What do you think?  It seems like important changes like this (if we  
make them) should get in before the system gets any larger and more  
complex.  Note that I don't think these changes are necessarily the  
best ones; I'm just very much interested in discussing whether or not  
the overall design of RVals/Lvals needs to be changed.

Ted
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-commits/attachments/20081011/f30e57f2/attachment.html>