[llvm-dev] Branch folding optimisation on the AVR platform produces out of order code
carl-llvm-dev@petosoft.com via llvm-dev
llvm-dev at lists.llvm.org
Sat Dec 29 09:46:23 PST 2018
Hi everyone,
I'm getting a mis-compilation from LLVM IR on the AVR platform. Studying the assembly language, it looks like the basic blocks get out of order or something like that during the branch folding optimisation phase.
This is the source LLVM IR that I am working from:
define hidden void @_TF4main9i2cUpdateFT8registerVs5UInt85valueS0__T_(i8, i8) local_unnamed_addr #1 {
entry:
switch i8 %0, label %9 [
i8 6, label %2
i8 7, label %8
]
; <label>:2: ; preds = %entry
%3 = icmp ugt i8 %1, 90
%4 = icmp ult i8 %1, 5
%. = select i1 %4, i8 5, i8 %1
%5 = select i1 %3, i8 90, i8 %.
store i8 %5, i8* getelementptr inbounds (%Vs5UInt8, %Vs5UInt8* @_Tv4main11delayFactorVs5UInt8, i64 0, i32 0), align 1
%6 = zext i8 %5 to i32
%7 = mul nuw nsw i32 %6, 100
store i32 %7, i32* getelementptr inbounds (%Vs6UInt32, %Vs6UInt32* @_Tv4main7delayUsVs6UInt32, i64 0, i32 0), align 4
tail call void @_TF3AVR11writeEEPROMFT7addressVs6UInt165valueVs5UInt8_T_(i16 34, i8 %5)
br label %9
; <label>:8: ; preds = %entry
%not. = icmp ne i8 %1, 0
%.2 = zext i1 %not. to i8
store i1 %not., i1* getelementptr inbounds (%Sb, %Sb* @_Tv4main7enabledSb, i64 0, i32 0), align 1
tail call void @_TF3AVR11writeEEPROMFT7addressVs6UInt165valueVs5UInt8_T_(i16 35, i8 %.2)
br label %9
; <label>:9: ; preds = %8, %2, %entry
ret void
}
When this is compiled for the AVR architecture using llc, with -O3 it produces this assembly language:
00000420 <_TF4main9i2cUpdateFT8registerVs5UInt85valueS0__T_>:
420: 1f 93 push r17
422: 87 30 cpi r24, 0x07 ; 7
424: 09 f4 brne .+2 ; 0x428 <LBB4_1>
426: 26 c0 rjmp .+76 ; 0x474 <LBB4_8>
00000428 <LBB4_1>:
428: 86 30 cpi r24, 0x06 ; 6
42a: 09 f0 breq .+2 ; 0x42e <LBB4_2>
42c: 21 c0 rjmp .+66 ; 0x470 <LBB4_7>
0000042e <LBB4_2>:
42e: 85 e0 ldi r24, 0x05 ; 5
430: 65 30 cpi r22, 0x05 ; 5
432: 08 f0 brcs .+2 ; 0x436 <LBB4_4>
434: 86 2f mov r24, r22
00000436 <LBB4_4>:
436: 1a e5 ldi r17, 0x5A ; 90
438: 6b 35 cpi r22, 0x5B ; 91
43a: 08 f4 brcc .+2 ; 0x43e <LBB4_6>
43c: 18 2f mov r17, r24
0000043e <LBB4_6>:
43e: 10 93 b8 01 sts 0x01B8, r17
442: 61 2f mov r22, r17
444: 77 27 eor r23, r23
446: 24 e6 ldi r18, 0x64 ; 100
448: 30 e0 ldi r19, 0x00 ; 0
44a: 80 e0 ldi r24, 0x00 ; 0
44c: 90 e0 ldi r25, 0x00 ; 0
44e: 48 2f mov r20, r24
450: 59 2f mov r21, r25
452: 0e 94 33 16 call 0x2c66 ; 0x2c66 <__mulsi3>
456: 90 93 bf 01 sts 0x01BF, r25
45a: 80 93 be 01 sts 0x01BE, r24
45e: 70 93 bd 01 sts 0x01BD, r23
462: 60 93 bc 01 sts 0x01BC, r22
466: 82 e2 ldi r24, 0x22 ; 34
468: 90 e0 ldi r25, 0x00 ; 0
46a: 61 2f mov r22, r17
46c: 0e 94 fd 02 call 0x5fa ; 0x5fa <_TF3AVR11writeEEPROMFT7addressVs6UInt165valueVs5UInt8_T_>
00000470 <LBB4_7>:
470: 1f 91 pop r17
472: 08 95 ret
00000474 <LBB4_8>:
474: 21 e0 ldi r18, 0x01 ; 1
476: 60 30 cpi r22, 0x00 ; 0
478: 09 f4 brne .+2 ; 0x47c <LBB4_10>
47a: 20 e0 ldi r18, 0x00 ; 0
0000047c <LBB4_10>:
47c: 82 2f mov r24, r18
47e: 81 70 andi r24, 0x01 ; 1
480: 80 93 c0 01 sts 0x01C0, r24
484: 83 e2 ldi r24, 0x23 ; 35
486: 90 e0 ldi r25, 0x00 ; 0
488: 62 2f mov r22, r18
48a: 0e 94 fd 02 call 0x5fa ; 0x5fa <_TF3AVR11writeEEPROMFT7addressVs6UInt165valueVs5UInt8_T_>
; *** falls through to the next method ***
0000048e <_TF4main11updateDelayFVs5UInt8T_>:
48e: 1f 93 push r17
490: 95 e0 ...
It looks to me like either block LBB4_7 should be last or there should be a rjmp at the end of LBB4_10.
When compiled with -O0 the code stays in the correct order, although obviously it's much more verbose and inefficient, with loads of "register spill" stuff, whatever that means!
Using -print-after-all I managed to work out that the mis-optimisation occurred in the optimisation pass "Control Flow Optimizer".
The thing I'm finding confusing is that this pass seems to be shared code, not target specific code. Is there a way to understand how this pass works and in particular if there are any hooks or configuration coming in from the target specific AVR code in my branch that could be causing this behaviour?
Thanks for any help or advice you guys can give.
Carl
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20181229/5fa786e0/attachment.html>
More information about the llvm-dev
mailing list