<div dir="ltr"><div class="gmail_quote"><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;"><div><div>Getting back to the points both you and Sebastian made, I'm wondering if there is any advantage with changing the implementation of RVal/LVal to be something much closer to the way C++ reasons about these concepts. This might in the long run clean up a whole bunch of things, especially as we add more support for C++.</div>
</div></blockquote><div><br>Agree.<br> </div><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;"><div><div></div><div><br></div><div>I don't have any concrete ideas, but there exists a lot of redundancy between the lval and nonlval classes. For example, lval::ConcreteInt and nonlval::ConcreteInt are essentially the same class representing the same data. The same goes for lval::SymbolVal and nonlval::SymbolVal. The distinction between the lval:: and nonlval:: versions is purely context, which is often redundant since one often (always?) knows the current context and how these values should be interpreted. If we modeled lvalues in the sense that Sebastian articulated, we would probably only have a handful of classes: lval::MemoryRegion, lval::Undefined, lval::Unknown, and that's probably it. We would have an nonlval::MemoryRegion (or rather rval::MemoryRegion if we want nonlval to represent rvalues), but symbols, integers, etc., would always be rvalues.</div>
</div></blockquote><div><br>I don't see these are redundant. Values are raw bits interpreted within some context. We make a fundamental distinction between two types of values: value that represents some address and value that not. So I prefer we have two kinds of ConcreteInt: lval::ConcreteInt and nonlval::ConcreteInt. And they are seldom same. lval::ConcreteInt usually is very large.<br>
</div><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">
<div><div></div><div><br></div><div>One thing about this is that it makes the transfer function structure basically fall out from what's in the C/C++ standards. For example, the '*' operator essentially has the following type signature:</div>
<div><br></div><div>* : rval -> lval</div><div><br></div><div>Similarly, references to variables and the '&' operator could be represented as follows:</div><div><br></div><div>variable reference: (declrefexpr) -> lval</div>
<div>& : lval -> rval (with the rval being an rval::MemoryRegion)</div><div><br></div><div>In contexts where an lval is used as an rval, we have an implicit conversion (as stated in the C++ standard). Such implicit conversations would be represented by a transfer function, which cause a new state and ExplodedNode to be created to represent the effect of this conversion. For example, an implicit conversion from an lval to rval could result in a value load (e.g., EvalLoad, which would have the type signature lval -> rval).</div>
</div></blockquote><div><br>I don't understand your meaning very clearly. For * operator, we just get its operand's rvalue, which is a location value. If we are at the LHS of an assignment, this location value is what we want. If we are at the RHS of an assignment, we do another EvalLoad with this location value. For & operator, we get its operand's lvalue, and this location value is the rvalue of the whole expression.<br>
</div><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;"><div><div></div>
<div><br></div><div>Zhongxing: In a roundabout way, I think your patch (which I have now looked at) is trying to resolve some of the confusion between rvalues and lvalues, but I'm wondering if the distinction between rvalues and lvalues really should happen in the Store. This of course is a naturally progression of the current design, but after Sebastian's points I'm wondering if that is the right direction. In my mind, all the store should care about is the mapping between regions and abstract values (a notion that has nothing to do with LVal and NonLval) and the mapping between variable names and the their regions. These are pretty clean interfaces that don't have to get tangled with the notion of rvalues and lvalues, which seem more related to how GRExprEngine should handle individual expressions.</div>
</div></blockquote><div><br>I am not letting the distinction between rvalues and lvalues happen in the Store. They do happen in GRExprEngine in my patch. Notice that I only added a getLValue() to StoreManager. The intention is to let the Store to return the lvalue of an expression. Because for different stores, we may have different representation of location values for the same expression. For example, in BasicStore, we may have a different location value for the lvalue of expression a[3] than in RegionStore. So I let StoreManager to determine the concrete representation of an expression's lvalue. Whether we want the lvalue or rvalue of an expression is decided by the GRExprEngine according to the context, e.g., the position of the expression in the parent expression.<br>
</div><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;"><div><div></div>
<div><br></div><div>In a nutshell, I'm wondering if we should do the following:</div><div><br></div><div>1) Change "RVal" to "ExprVal" (representing the value of an expression). ExprVal makes it clear that these objects represent the evaluation of an expression.</div>
<div><br></div><div>2) Change 'nonlval::" to "rval::", and remove most of the lval classes such as lval::SymbolicInt and lval::ConcreteInt. Add an "rval::MemoryRegion" class that mirrors "lval::MemoryRegion".</div>
<div><br></div><div>3) Change GRExprEngine to reason about rvals (rvalues) and lvals (lvalues) in the same way the C++ standard does. This is a big task, but I think it is mainly code restructuring. Loads would be handled as transfer functions (i.e., EvalLoad) doing the implicit conversions between lvals -> rvals.</div>
</div></blockquote><div><br>My feeling is that this is the right direction. But I need do some perusal of the C++ standard first.<br> </div><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">
<div><div></div>
<div><br></div><div>I see three advantages of these changes:</div><div><br></div><div>1) The structure of GRExpEngine more closely reflects the semantics of C/C++ as given by the respective standards. This will make the analyzer much easier to extend potential C++ 0X features, such as rvalue references, as the design of how to implement these features in GRExprEngine should be more obvious and intuitive. This should also address Sebastian's original concerns (and the concerns of those with a similar perspective). It will also make it much clearer for those new to the libAnalysis/GRExprEngine about what is going on.</div>
<div><br></div><div>2) There will probably be a significant code reduction/simplification in many cases, as the code duplication for lval::SymbolicVal and nonlval::SymbolicVal, lval::ConcreteInt and nonlval::ConcreteInt, gets eradicated.</div>
<div><br></div><div>3) The meaning/purpose of LVal/RVal/NonLval (whatever we call them) will be made clear: they exist purely for GRExprEngine to correctly implement/call the right transfer functions. The Environment class will be a mapping from "Expr* -> ExprVal", and implementations of the Store won't care about ExprVals at all; just regions and the values they map to. I see this as a being a huge conceptually cleanup; implementations of Store won't have to reason about rvalues and lvalues at all, just locations and the values stored at those locations. The distinction between rvalues and lvalues is something that C and C++ language lawyers should care about, </div>
</div></blockquote><div><br>Yeah, we should limit this rvalue/lvalue distinction within GRExprEngine. And the intention of my patch is this! Let me summarize my patch in the following:<br><br>In GRExprEngine, we know when we want the rvalue of an expression and when we want the lvalue of an expression. If we want the lvalue of an expression, we first evaluate all of its sub-exprs, then we ask the Store what the concrete form of its lvalue. The Store return us a location value.<br>
<br>If we want the rvalue of an expression, if the expression has lvalue, we get its lvalue, then ask Store what this location value maps to. If the expression does not have lvalue, then we can evaluate its rvalue directly.<br>
<br>Currently I don't know if this interpretation model is consistent or sufficient for C++. I'll do some homework for this.<br> </div><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">
<div><div>but people wishing to model the heap in a particular implementation of store shouldn't have to care about these things at all.</div>
<div><br></div><div>What do you think? It seems like important changes like this (if we make them) should get in before the system gets any larger and more complex. Note that I don't think these changes are necessarily the best ones; I'm just very much interested in discussing whether or not the overall design of RVals/Lvals needs to be changed.</div>
<div><br></div><font color="#888888"><div>Ted</div></font></div></blockquote></div><br></div>