[LLVMdev] Can a name in LLVM assembly language hold two types of value at the same time
Reid Spencer
rspencer at reidspencer.com
Wed Sep 6 21:11:01 PDT 2006
Zhongxing Xu,
On Thu, 2006-09-07 at 10:58 +0800, Zhongxing Xu wrote:
> I am trying to symbolically execute LLVM assembly language. I found a
> possible semantic inconsistency of the LLVM assembly language, or maybe
> my understanding is wrong.
>
> The C code is:
>
> #include <stdlib.h>
>
> 1 int f(void)
> 2 {
> 3 int a;
> 4 int *b = (int *) malloc(3*sizeof(int));
> 5 a = 3;
> 6 return 0;
> 7 }
>
> I compile it with llvm-gcc 4 front end. The generated LLVM assembly code
> is:
>
> 1 target endian = little
> 2 target pointersize = 32
> 3 target triple = "i686-pc-linux-gnu"
>
> 4 implementation ; Functions:
>
> 5 int %f() {
> 6 entry:
> 7 %retval = alloca int, align 4 ; <int*> [#uses=2]
> 8 %tmp = alloca int, align 4 ; <int*> [#uses=2]
> 9 %a = alloca int, align 4 ; <int*> [#uses=1]
> 10 %b = alloca int*, align 4 ; <int**> [#uses=1]
> 11 "alloca point" = cast int 0 to int ; <int> [#uses=0]
> 12 %tmp = call sbyte* %malloc( uint 12 ) ; <sbyte*> [#uses=1]
> 13 %tmp1 = cast sbyte* %tmp to int* ; <int*> [#uses=1]
> 14 store int* %tmp1, int** %b
> 15 store int 3, int* %a
> 16 store int 0, int* %tmp
> 17 %tmp = load int* %tmp ; <int> [#uses=1]
> 18 store int %tmp, int* %retval
> 19 br label %return
>
> 20 return: ; preds = %entry
> 21 %retval = load int* %retval ; <int> [#uses=1]
> 22 ret int %retval
> 23 }
>
> declare sbyte* %malloc(uint)
>
>
> After line 8, %tmp holds a pointer to stack, whose type is int*
> After line 12, %tmp holds a pointer to heap, whose type is sbyte*
SSA register names, like %tmp in your example, are unique within their
type. They are *not* scoped variable names. Furthermore, due to SSA
requirements, each register name can only be assigned once (one def,
zero or more uses). It would be illegal to have another int* assigned
to the variable %tmp because that violates SSA rules. However, you can
have a multitude of %tmp registers as long as they are all of different
types. Note that LLVM will rename a register if it finds a duplicate, as
was the case for %tmp1 on line 13. This was necessary because the %tmp
defined on line 8 and %tmp1 both have the same type (int*).
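To make the renaming concrete, here is a minimal Python sketch of a
per-type name table. It is only an illustration: the class name
TypePlanes, the method define, and the numeric-suffix renaming scheme
are assumptions made for this example, not LLVM's actual implementation.
It also assumes the front end asked for the name "tmp" at each of the
three definitions shown.

```python
class TypePlanes:
    """Toy model: register names are unique only within one type."""

    def __init__(self):
        # Maps a type string (e.g. "int*") to the set of names defined in it.
        self.planes = {}

    def define(self, ty, name):
        """Record one definition and return the name actually used.

        If `name` is already taken in the plane for `ty`, append the
        smallest free numeric suffix (an assumed renaming scheme).
        """
        names = self.planes.setdefault(ty, set())
        candidate, suffix = name, 0
        while candidate in names:
            suffix += 1
            candidate = f"{name}{suffix}"
        names.add(candidate)
        return candidate


planes = TypePlanes()
first = planes.define("int*", "tmp")     # line 8: alloca int
second = planes.define("sbyte*", "tmp")  # line 12: result of the malloc call
third = planes.define("int*", "tmp")     # line 13: cast result, clashes with line 8
```

In this sketch the sbyte* definition keeps the name "tmp" because its
plane is empty, while the second int* definition has to be renamed.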
>
> At line 16, value 0 is to be stored to a memory location of type int
> pointed to by %tmp. But at this time %tmp is holding a pointer to
> heap of type sbyte.
Actually, it's not. The two %tmp names are referring to things of
different types. One is int*, the other is sbyte*. These are what we
loosely call "type planes" in LLVM. That is, within a given type plane
all names are unique. But there is no requirement for names to be unique
across type planes.
So, the %tmp on line 16 is referring to the %tmp defined on line 8.
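For a symbolic executor, this suggests keying the register environment
on the pair (type, name) rather than on the name alone. A minimal Python
sketch follows; the Env class and the strings standing in for addresses
are hypothetical and only serve to show how the lookup resolves.

```python
class Env:
    """Toy register environment keyed by (type, name)."""

    def __init__(self):
        self.regs = {}

    def define(self, ty, name, value):
        key = (ty, name)
        # SSA: each register has exactly one definition.
        if key in self.regs:
            raise ValueError(f"{ty} %{name} is already defined")
        self.regs[key] = value

    def lookup(self, ty, name):
        # The operand's type selects the plane, the name selects the register.
        return self.regs[(ty, name)]


env = Env()
env.define("int*", "tmp", "stack slot")    # line 8
env.define("sbyte*", "tmp", "heap block")  # line 12

# Line 16 is `store int 0, int* %tmp`: the operand is typed int*.
target = env.lookup("int*", "tmp")
```

Here target is the stack slot from line 8, and the sbyte* %tmp from
line 12 stays untouched.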
> And the heap should not be written to. (There is
> no assignment to b[0] in the C code.)
The stack is written to through int* %tmp (def on line 8) but the heap
is not written.
> So I guess that %tmp also holds its original value, which is a pointer
> to stack of type int.
Correct.
> And we can decide which location to store according to the type.
Right.
>
> Could someone explain this for me? Thanks.
I think I just did.
You might want to read the http://llvm.org/docs/LangRef.html document.
Reid.