[PATCH] D78439: [AVR] Fix miscompilation of zext + add
Dylan McKay via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Sat Apr 18 23:25:36 PDT 2020
dylanmckay added a comment.
I'm not sure that this is a miscompile. Note that AVR has no "add with immediate" instruction, instead subsumed by the "subtract with immediate" instruction, which of course with immediate values being immediately known to the compiler, can be negated directly at the callsite to effectively form a "add with immediate" instruction.
In this case, with the two snippets
(trunk)
foo:
mov r18, r20
clr r19
add r22, r18
adc r23, r19
sbci r24, 0
sbci r25, 0
ret
(with patch)
foo:
mov r18, r20
clr r19
ldi r20, 0
ldi r21, 0
add r22, r18
adc r23, r19
adc r24, r20
adc r25, r21
ret
The first two instructions and the final `ret` are identical in both, let's remove.
(trunk)
add r22, r18
adc r23, r19
sbci r24, 0
sbci r25, 0
(with patch)
ldi r20, 0
ldi r21, 0
add r22, r18
adc r23, r19
adc r24, r20
adc r25, r21
With the patch, the LDIs and `r20` and `r21` aren't used until the last two ADCs, let's reorder
(with patch)
add r22, r18
adc r23, r19
ldi r20, 0
ldi r21, 0
adc r24, r20
adc r25, r21
This is equivalent assembly as before, and now the two snippets both have the exact same initial two `add+adc` instructions. Let's remove from both snippets, leaving
(trunk)
sbci r24, 0
sbci r25, 0
(with patch)
ldi r20, 0
ldi r21, 0
adc r24, r20
adc r25, r21
So the only possible difference between the two variants, is in the addition of last two bytes from the 32-bit zero extended `i1` value.
The remaining (trunk) asm just subtracts zero from the upper half of the non-extended 32-bit integer `%a` in registers `r24:r25`. This is the compiler inserting an addition of the immediate `0` (as `-0 = 0`). This makes sense because the corresponding bits from the `i1` value have to be zero because it was zero-extended to 32-bits. A subtraction of zero is always redundant, therefore the two SBCIs from the first patch are completely redundant, and could be removed by the compiler in a future optimization.
The remaining (with patch) asm just adds zero to the upper half of the non-zero extended 32-bit integer `%a` in registers `r24:25` but with extra steps to load the zero value into a register.
The only two remaining patches are completely identical in terms of execution as far as I can tell.
In summary: I think this is not a miscompile, let me know if you still think otherwise.
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D78439/new/
https://reviews.llvm.org/D78439
More information about the llvm-commits
mailing list