<div dir="ltr"><br><br><div class="gmail_quote">On Wed, Oct 15, 2008 at 3:27 PM, Ted Kremenek <span dir="ltr"><<a href="mailto:kremenek@apple.com">kremenek@apple.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">
<div style=""><br><div><div class="Ih2E3d"><div>On Oct 12, 2008, at 2:12 AM, Zhongxing Xu wrote:</div><blockquote type="cite"><div dir="ltr"><div class="gmail_quote"><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">
<div><font color="#000000"><br></font> </div></blockquote><div><br>I don't see these are redundant. Values are raw bits interpreted within some context. We make a fundamental distinction between two types of values: value that represents some address and value that not. So I prefer we have two kinds of ConcreteInt: lval::ConcreteInt and nonlval::ConcreteInt. And they are seldom same. lval::ConcreteInt usually is very large.</div>
</div></div></blockquote><div><br></div></div><div>You're right. I originally wrote this code this way for this very reason, and in the course of this discussion I confused myself. I was just trying to think of whether or not an integer value was really an lvalue. Yes it can represent a location, but is it really an lvalue. The current way things are "typed" with lval and nonlval, however, makes the analyzer readily understand code like the following:</div>
<div><div><br></div><div>int foo() {</div><div> return *&*((int*) 0xa0000);</div><div>}</div></div><div><br></div><div>In this example, the pointer cast causes the integer literal to be treated as an lval::ConcreteInt. This isn't really an lvalue though; it's really an rvalue that represents the location of something in memory (at address 0xa0000). The '*' operator, however, expects an rvalue (per the wording in the C++ standard). So while nonlval and lval do reason about locations, they aren't really lvalues or rvalues at all; just something that approximates them. Hence the motivation to change their names.</div>
</div></div></blockquote><div><br>Right. Let me describe how we evaluate this statement from another direction. As in 'return' expression, we expect the rvalue of *&*((int*) 0xa0000). To get *expr's rvalue we should first get expr's rvalue, which is a location value, then retrieve the value stored there. So we evaluate &*((int*) 0xa0000)'s rvalue. &expr's rvalue is expr's lvalue. And *((int*) 0xa0000) is really an lvalue expression. We evaluate its lvalue. *expr's lvalue is expr's rvalue. So we evaluate ((int*) 0xa0000)'s rvalue. ((int*) 0xa0000) is a cast expression. 0xa0000 is an non-location integer value, (int*) casts it to location value 0xa0000, which is what we expect. Everything works just fine.<br>
</div><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;"><div style=""><div><div></div><div class="Ih2E3d"><blockquote type="cite"><div dir="ltr">
<div class="gmail_quote"><div> </div><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;"> <div><div></div><div><br></div><div>One thing about this is that it makes the transfer function structure basically fall out from what's in the C/C++ standards. For example, the '*' operator essentially has the following type signature:</div>
<div><br></div><div>* : rval -> lval</div><div><br></div><div>Similarly, references to variables and the '&' operator could be represented as follows:</div><div><br></div><div>variable reference: (declrefexpr) -> lval</div>
<div>& : lval -> rval (with the rval being an rval::MemoryRegion)</div><div><br></div><div>In contexts where an lval is used as an rval, we have an implicit conversion (as stated in the C++ standard). Such implicit conversations would be represented by a transfer function, which cause a new state and ExplodedNode to be created to represent the effect of this conversion. For example, an implicit conversion from an lval to rval could result in a value load (e.g., EvalLoad, which would have the type signature lval -> rval).</div>
</div></blockquote><div><br>I don't understand your meaning very clearly. For * operator, we just get its operand's rvalue, which is a location value. If we are at the LHS of an assignment, this location value is what we want. If we are at the RHS of an assignment, we do another EvalLoad with this location value. For & operator, we get its operand's lvalue, and this location value is the rvalue of the whole expression.</div>
</div></div></blockquote><div><br></div></div><div>My pedantic point was that lval:: and nonlval:: classes are not lvalues and rvalues. We could change them to be as such; in this case the transfer function of operator '*' would always return an lvalue, and then whatever used that lvalue would then perform the implicit conversion. We invert this in the static analyzer right now; the transfer function logic for '*' does the implicit lvalue->rvalue conversion based on context (i.e., if the asLVal flag is not set). The current approach makes sense since we want to associate with a given expression the value the expression evaluates to; this includes the result of implicit conversions.</div>
<div><br></div><div>At the end of the day, however, lval:: and nonlval:: classes are not lvalues/rvalues respectively (in the C++ parlance). My question was whether or not we should change the use of these classes so that they EXACTLY map to lvalues/rvalues. After looking at the code, reviewing the patch, thinking about the overall design of GRExprEngine, and all the comments made here, I'm not in favor of this idea anymore. I think it is more useful to reason about locations versus non-locations than lvalues versus rvalues.</div>
</div></div></blockquote><div><br>Right.<br> </div><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;"><div style=""><div><div></div><div class="Ih2E3d">
<div><br></div><blockquote type="cite"><div dir="ltr"><div class="gmail_quote"><div><br>I am not letting the distinction between rvalues and lvalues happen in the Store. They do happen in GRExprEngine in my patch. Notice that I only added a getLValue() to StoreManager. The intention is to let the Store to return the lvalue of an expression.</div>
</div></div></blockquote><div><br></div></div><div>To me that is the point of MemRegion. Shouldn't MemRegion be able to represent the location in all cases? In my mind, an implementation of Store should not depend on RValues.h. If a Store just reasons about regions, it doesn't need to think about lval objects, or is this not the case?</div>
</div></div></blockquote><div><br>Yes. Store should not depend on RValues.h. Store should just reasons about regions.<br> </div><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">
<div style=""><div><div></div><div><br></div><div>I know I'm the one who came up with lval::FieldOffset, lval::ArrayOffset, etc. I'm questioning this decision. It just seems to be hard-coding a particular Store's conception of memory into the lval classes. These concepts can easily be represented (far more elegantly) as regions. Once you are just dealing with regions, StoreManager::getLValue() only needs to return a region type.</div>
</div></div></blockquote><div><br>Yes, I also feel that here needs some redesign. But different store will still return different region values for an expression, because of their different granularity of modeling. I have no clear idea about this currently. I expect to do this in a separate patch.<br>
</div><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;"><div style=""><div><div></div><div class="Ih2E3d"><br><blockquote type="cite"><div dir="ltr">
<div class="gmail_quote"><div> Because for different stores, we may have different representation of location values for the same expression. For example, in BasicStore, we may have a different location value for the lvalue of expression a[3] than in RegionStore</div>
</div></div></blockquote><div><br></div></div><div>To me Stores can return different regions for a[3]. However, having the Store return an lval:: object has one distinct advantage that I see: we don't have to define "Undefined" or "Unknown" for regions.</div>
</div></div></blockquote><div><br>When we query Store for the lvalue of an expression, Store should return us an abstract value, which may be a MemRegionVal or UnknownVal or UndefinedVal.<br>This may be clear when we are replacing lval::FieldOffset and lval::ArrayOffset with something else.<br>
<br></div><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;"><div style=""><div><div></div><div class="Ih2E3d"><br><blockquote type="cite"><div dir="ltr">
<div class="gmail_quote"><div>. So I let StoreManager to determine the concrete representation of an expression's lvalue. Whether we want the lvalue or rvalue of an expression is decided by the GRExprEngine according to the context, e.g., the position of the expression in the parent expression.</div>
</div></div></blockquote><div><br></div></div>Right.</div><div><br></div><div><div class="Ih2E3d"><blockquote type="cite"><div dir="ltr"><div class="gmail_quote"><div>Yeah, we should limit this rvalue/lvalue distinction within GRExprEngine. And the intention of my patch is this! Let me summarize my patch in the following:<br>
<br>In GRExprEngine, we know when we want the rvalue of an expression and when we want the lvalue of an expression. If we want the lvalue of an expression, we first evaluate all of its sub-exprs, then we ask the Store what the concrete form of its lvalue. The Store return us a location value.</div>
</div></div></blockquote><div><br></div></div>This all makes much more sense to me now.</div><div><br></div></div></blockquote></div><br></div>