[cfe-commits] [PATCH] Expressions have lvalues and rvalues

Zhongxing Xu xuzhongxing at gmail.com
Fri Oct 10 19:09:11 PDT 2008


Hi Sebastian,

I know where the confusion comes from: we are from two different points of
view. You are from pure programming language and compiler point of view. I
am from a mathematical interpretation of programs point of view.

In PL/compiler view, lvalue/rvalue is equivalent to an expression. So we
might be more explicit to directly say an lvalue expression or rvalue
expression. An lvalue expression is an expression refering to an object in
the memory (per C standard). So an lvalue expression is an real object
exists in memory. And the rest expressions are rvalue expressions. This is
very clear.

But in static analysis, only having a programming language level expression
concept is not enough. We want to do interpretation of programs and use some
mathematics to reason about programs. So we designed an interpretation model
of programs as follows.

Basically, we divide the things into 3 domains: language domain, memory
domain, mathematics domain. For an expression, we conceptly view it as a
3-tuple.

expression --- memory  ----- value

The first expression represent the concept in the programming language. It
is a textual representation of the expression. In clang, they correspond to
numerous *Expr classes.

Second, the expression (might) represents an object in memory. In
libAnalysis, we designed the region model for modeling memory.

Third, the memory object stores a value. In libAnalysis, we use RVal to
abstractly represent this value.

Note that, here the value is in the mathematical sense: it is a real value,
represented by bits in the memory.

In this interpretation model, I redefined the concept lvalue and rvalue
slightly. I made them a property of expressions, not as expressions
themselves. I define the lvalue of an expression as the memory address of
the object it refering to, and rvalue of an expression as the value it
evaluates to when placed at the RHS of an assignment (I don't have a better
definition currently).

All expressions have rvalue, because it should evaluate to some value. But
not all expressions have lvalue. Only expressions refering memory objects
have lvalue. For example, &x has no lvalue, but has rvalue, which is x's
lvalue.

Now let's talk about where the LVal/NonLVal comes from. Look at the above
3-tuple:

expression -- memory -- value

Remember we are doing static analysis, and we are going to use some
mathematics to analyze program. So what objects are we processing in
mathematics? They are the value in the memory, i.e. the value on the
rightmost of the tuple. Literally they are raw bits. But we made them more
abstract into RVal. Furthermore, we noticed that the most essential
distinction among these varous abstract values (RVals) is that some of them
are address value, some of them are not. So here comes LVal and NonLVal. We
use LVal to represent an abstract address value, and NonLVal to represent an
numerical value.

So, LVal and NonLVal are used to represent the abstract values in the
mathmatical sense, that has nothing to do with expressions in the
programming language sense. We need these mathematical values in our
symbolic analysis of the program. They are conceptly values stored in our
simulated computer memory. They are in a different world from C/C++
expressions.

An expression's lvalue evaluates (through our interpretation) into a LVal.
An expression's rvalue evaluates (through our interpretation) into a NonLVal
(e.g. 2*x+3), or a LVal (e.g. &x).
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-commits/attachments/20081011/31570fc3/attachment.html>


More information about the cfe-commits mailing list