[LLVMdev] Can a name in LLVM assembly language hold two types of value at the same time

Reid Spencer rspencer at reidspencer.com
Wed Sep 6 21:11:01 PDT 2006


Zhongxing Xu,


On Thu, 2006-09-07 at 10:58 +0800, Zhongxing Xu wrote:
> I am trying to symbolically execute LLVM assembly language. I found a
> possible semantic inconsistency in the LLVM assembly language, or maybe
> my understanding is wrong.
> 
> The C code is:
> 
> #include <stdlib.h>
> 
> 1 int f(void)
> 2 {
> 3         int a;
> 4         int *b = (int *) malloc(3*sizeof(int));
> 5         a = 3;
> 6         return 0;
> 7 }
> 
> I compile it with the llvm-gcc 4 front end. The generated LLVM assembly
> code is:
> 
> 1  target endian = little
> 2  target pointersize = 32
> 3  target triple = "i686-pc-linux-gnu"
> 
> 4  implementation   ; Functions:
> 
> 5  int %f() {
> 6  entry:
> 7         %retval = alloca int, align 4           ; <int*> [#uses=2]
> 8         %tmp = alloca int, align 4              ; <int*> [#uses=2]
> 9         %a = alloca int, align 4                ; <int*> [#uses=1]
> 10        %b = alloca int*, align 4               ; <int**> [#uses=1]
> 11        "alloca point" = cast int 0 to int              ; <int> [#uses=0]
> 12        %tmp = call sbyte* %malloc( uint 12 )           ; <sbyte*> [#uses=1]
> 13        %tmp1 = cast sbyte* %tmp to int*                ; <int*> [#uses=1]
> 14        store int* %tmp1, int** %b
> 15        store int 3, int* %a
> 16        store int 0, int* %tmp
> 17        %tmp = load int* %tmp           ; <int> [#uses=1]
> 18        store int %tmp, int* %retval
> 19        br label %return
> 
> 20 return:         ; preds = %entry
> 21        %retval = load int* %retval             ; <int> [#uses=1]
> 22        ret int %retval
> 23 }
> 
> declare sbyte* %malloc(uint)
> 
> 
> After line 8, %tmp holds a pointer to stack, whose type is int*
> After line 12, %tmp holds a pointer to heap, whose type is sbyte*

SSA register names, like %tmp in your example, are unique within their
type. They are *not* scoped variable names. Furthermore, due to SSA
requirements, each register can be assigned only once (one def, zero or
more uses). It would be illegal to assign another int* to %tmp because
that violates the SSA rules. However, you can have several %tmp
registers as long as they all have different types. Note that LLVM will
rename a register if it finds a duplicate, as was the case for %tmp1 on
line 13. This was necessary because the result of that cast and the %tmp
defined on line 8 both have the same type (int*).
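
As a rough illustration of the rule above, here is a minimal C sketch of a
register table that allows the same name in different types but picks a
fresh name when a (type, name) pair is already taken. The struct layout,
the function names (is_defined, define_unique) and the numeric-suffix
scheme are assumptions made for this example; this is not LLVM's actual
implementation.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_REGS 64
#define NAME_LEN 32

/* One defined register: a (type, name) pair stored as plain strings,
   for example "int*" and "tmp". Hypothetical layout for illustration. */
struct reg {
    char type[NAME_LEN];
    char name[NAME_LEN];
};

struct reg_table {
    struct reg regs[MAX_REGS];
    int count;
};

/* Returns 1 if a register with exactly this type and name is defined. */
int is_defined(const struct reg_table *t, const char *type, const char *name)
{
    assert(t != NULL && type != NULL && name != NULL);
    for (int i = 0; i < t->count; i++) {
        if (strcmp(t->regs[i].type, type) == 0 &&
            strcmp(t->regs[i].name, name) == 0)
            return 1;
    }
    return 0;
}

/* Defines a register of the given type. If the requested name is already
   used within that type, tries name1, name2, ... until one is free.
   The chosen name is written to out. Returns 0 on success, -1 if the
   table is full or a string does not fit. */
int define_unique(struct reg_table *t, const char *type, const char *name,
                  char *out, size_t out_size)
{
    assert(t != NULL && type != NULL && name != NULL && out != NULL);
    if (t->count >= MAX_REGS || strlen(type) >= NAME_LEN)
        return -1;

    int n = snprintf(out, out_size, "%s", name);
    for (int suffix = 1; ; suffix++) {
        if (n < 0 || (size_t)n >= out_size || n >= NAME_LEN)
            return -1;
        if (!is_defined(t, type, out))
            break;
        n = snprintf(out, out_size, "%s%d", name, suffix);
    }

    strcpy(t->regs[t->count].type, type);
    strcpy(t->regs[t->count].name, out);
    t->count++;
    return 0;
}
```

With this sketch, defining "tmp" as int* and then as sbyte* keeps the name
unchanged both times, while a second int* "tmp" comes back as "tmp1".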
 
> 
> At line 16, value 0 is to be stored to a memory location of type int
> pointed to by %tmp. But at this time %tmp is holding a pointer to
> heap of type sbyte. 

Actually, it's not. The two %tmp names are referring to things of
different types. One is int*, the other is sbyte*. These are what we
loosely call "type planes" in LLVM. That is, within a given type plane
all names are unique. But there is no requirement for names to be unique
across type planes. 

So, the %tmp on line 16 is referring to the %tmp defined on line 8.
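
For a symbolic executor, one way to model this is to key the register
environment on the (type, name) pair instead of the name alone. The C
sketch below is only an illustration: the struct, the table contents
(copied from the definitions in your listing) and the resolve_line
function are made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* A register definition: its type, its name, and the listing line
   that defines it. Hypothetical layout for illustration. */
struct def {
    const char *type;
    const char *name;
    int line;
};

/* Definitions of %tmp and %tmp1 taken from the listing above. */
static const struct def defs[] = {
    { "int*",   "tmp",  8  },  /* %tmp  = alloca int              */
    { "sbyte*", "tmp",  12 },  /* %tmp  = call sbyte* %malloc(..) */
    { "int*",   "tmp1", 13 },  /* %tmp1 = cast sbyte* %tmp to int* */
    { "int",    "tmp",  17 },  /* %tmp  = load int* %tmp          */
};

/* Resolves a typed operand such as "int* %tmp" to the line that defines
   it. Returns -1 if no register with that type and name exists. */
int resolve_line(const char *type, const char *name)
{
    assert(type != NULL && name != NULL);
    for (size_t i = 0; i < sizeof defs / sizeof defs[0]; i++) {
        if (strcmp(defs[i].type, type) == 0 &&
            strcmp(defs[i].name, name) == 0)
            return defs[i].line;
    }
    return -1;
}
```

Here the operand "int* %tmp" of the store on line 16 resolves to line 8,
while the operand "sbyte* %tmp" of the cast on line 13 resolves to line 12.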

> And the heap should not be written to. (There is
> no assignment to b[0] in the C code.)

The stack is written to through int* %tmp (def on line 8) but the heap
is not written.

> So I guess that %tmp also holds its original value, which is a pointer
> to stack of type int. 

Correct.

> And we can decide which location to store according to the type.

Right.

> 
> Could someone explain this for me? Thanks.

I think I just did. 

You might want to read the http://llvm.org/docs/LangRef.html document.

Reid.



