[LLVMdev] Mapping bytecode to X86

Mon Jun 26 12:23:31 PDT 2006

Dear guys,

    I am in need of more of your help. I'm implementing a register
allocator, and I am having problems to make it produce correct code.
Consider this program here:

int main(int argc, char ** argv) {
    int i, j, sum;
    i = argv[0][0];
    j = argv[0][1];
    sum = (i + j) * j;
    printf("Sum = %d\n", sum);
}

that maps to this llvm bytecode:

entry (0xa785590, LLVM BB @0xa77ebf8):
        FNSTCW16m <fi#0>, 1, %NOREG, 0
        MOV8mi <fi#0>, 1, %NOREG, 1, 2
        FLDCW16m <fi#0>, 1, %NOREG, 0
        %reg1024 = MOV32rm <fi#-2>, 1, %NOREG, 0
        %reg1025 = MOV32rm %reg1024, 1, %NOREG, 0
        %reg1026 = MOVSX32rm8 %reg1025, 1, %NOREG, 0
        %reg1027 = MOVSX32rm8 %reg1025, 1, %NOREG, 1
        ADJCALLSTACKDOWN 8
        %reg1028 = ADD32rr %reg1026, %reg1027
        %reg1029 = IMUL32rr %reg1028, %reg1027
        MOV32mr %ESP, 1, %NOREG, 4, %reg1029
        MOV32mi %ESP, 1, %NOREG, 0, <ga:.str_1>
        CALLpcrel32 <ga:printf>
        ADJCALLSTACKUP 8, 0
        %reg1030 = MOV32rr %EAX
        %reg1031 = IMPLICIT_DEF_GR32
        %EAX = MOV32rr %reg1031
        RET

My allocator produces this mapping:

FNSTCW16m              :=
MOV8mi                 :=
FLDCW16m               :=
MOV32rm            EAX :=
MOV32rm            EAX :=  EAX
MOVSX32rm8         ECX :=  EAX
MOVSX32rm8         EAX :=  EAX
ADJCALLSTACKDOWN       :=
ADD32rr            ECX :=  ECX EAX
IMUL32rr           EAX :=  ECX EAX
MOV32mr                :=  ESP EAX
MOV32mi                :=  ESP
CALLpcrel32            :=
ADJCALLSTACKUP         :=
IMPLICIT_DEF_GR32  EAX :=
RET                    :=

This mapping looks correct, but the final code generated by llvm is not
correct. Check the assembly code below: the add instruction should be
addl %eax, %ecs

main:
        subl $12, %esp
        fnstcw 10(%esp)
        movb $2, 11(%esp)
        fldcw 10(%esp)
        movl 20(%esp), %eax
        movl (%eax), %eax
        movsbl (%eax), %ecx
        movsbl 1(%eax), %eax
        addl %ecx, %ecx
        imull %ecx, %eax
        movl %eax, 4(%esp)
        movl $.str_1, (%esp)
        call printf
        #IMPLICIT_DEF %eax
        addl $12, %esp
        ret
        .size main, .-main

On the other hand, if I use the TwoAddressInstructionPass, then llvm
produces this code here:

FNSTCW16m                :=
MOV8mi                   :=
FLDCW16m                 :=
MOV32rm              EAX :=
MOV32rm              EAX :=  EAX
MOVSX32rm8           ECX :=  EAX
MOVSX32rm8           EAX :=  EAX
ADJCALLSTACKDOWN         :=
ADD32rr              ECX :=  EAX
IMUL32rr             ECX :=  EAX
MOV32mr                  :=  ESP ECX
MOV32mi                  :=  ESP
CALLpcrel32              :=
ADJCALLSTACKUP           :=
IMPLICIT_DEF_GR32    EAX :=
RET                      :=

which is correctly mapped by llvm to assembly code. Check the assembly
below:

main:
        subl $12, %esp
        fnstcw 10(%esp)
        movb $2, 11(%esp)
        fldcw 10(%esp)
        movl 20(%esp), %eax
        movl (%eax), %eax
        movsbl (%eax), %ecx
        movsbl 1(%eax), %eax
        addl %eax, %ecx
        imull %eax, %ecx
        movl %ecx, 4(%esp)
        movl $.str_1, (%esp)
        call printf
        #IMPLICIT_DEF %eax
        addl $12, %esp
        ret
        .size main, .-main

The problem is that, after the TwoAddressInstructionPass is used, the
code is no longer in SSA form, and my register allocator rely on
some SSA properties. I am using the Spiller in VirtRegMap.* to generate
the code, but the incorrect mapping still happens when I invoke the
setReg() method directly on machine operands. Any of you guys has some
hints to help me to produce correct mapping without using the Two
address pass?

Thanks a lot,

Fernando