<table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Issue</th>
<td>
<a href=https://github.com/llvm/llvm-project/issues/62012>62012</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>
AVR asm doesn't validate `ldd` operands
</td>
</tr>
<tr>
<th>Labels</th>
<td>
new issue
</td>
</tr>
<tr>
<th>Assignees</th>
<td>
</td>
</tr>
<tr>
<th>Reporter</th>
<td>
workingjubilee
</td>
</tr>
</table>
<pre>
Consider the following LLVM IR program, derived from a Rust program featured in rust-lang/rust#109360, that misuses the `ldd` instruction on AVR:
```llvm
; ModuleID = 'avr_exampl.3c3314d5-cgu.0'
source_filename = "avr_exampl.3c3314d5-cgu.0"
target datalayout = "e-P1-p:16:8-i8:8-i16:8-i32:8-i64:8-f32:8-f64:8-n8-a:8"
target triple = "avr-unknown-unknown"
@alloc3 = private unnamed_addr constant <{ [2 x i8] }> <{ [2 x i8] c"\BB\AA" }>, align 1
; Function Attrs: nounwind
define dso_local void @main() unnamed_addr addrspace(1) #0 {
start:
%0 = tail call addrspace(0) i8
asm sideeffect alignstack
"\0A ldd ${0}, X+0 \0A ldd ${0}, X+1 \0A",
"=&r,{r27r26},~{sreg},~{memory}"
(ptr nonnull @alloc3) #2,
!srcloc !0
ret void
}
attributes #0 = { nounwind "target-cpu"="atmega328" }
attributes #1 = { nofree norecurse noreturn nosync nounwind readnone "target-cpu"="atmega328" }
attributes #2 = { nounwind }
!0 = !{i32 144, i32 161, i32 194, i32 223}
```
As I understand it, `ldd` only takes the Y or Z registers as its second operand (plus an offset). But this assembly is accepted without complaint, resulting in ill-formed assembly and peculiar binaries. This is what llc thinks the assembly is:
```asm
main: ; @main
ldi r26, lo8(alloc3)
ldi r27, hi8(alloc3)
ldd r24, X+0
ldd r24, X+1
ret
alloc3:
.ascii "\273\252"
```
But this is what comes out the other side of objdump:
```asm
000000a6 <main>:
a6: a0 e0 ldi r26, 0
a8: b1 e0 ldi r27, 1
aa: 88 81 ldd r24, Y+0
ac: 89 81 ldd r24, Y+1
ae: 08 95 ret
```
The address is loaded into the r26 and r27 registers (which together form the 16-bit "X" register), but the reads are then done from **Y** (a valid base for `ldd`, at least). Y does not contain the address that was loaded, so the programmer's intent is lost.
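To double-check the objdump reading, here is a minimal Python sketch that decodes the two emitted opcode words by hand. It assumes the `ldd` encoding as I understand it from the AVR instruction set manual (`10q0 qq0d dddd bqqq`, where bit 3 selects Y when set and Z when clear); `decode_ldd` is a hypothetical helper, not part of LLVM or binutils.
```python
def decode_ldd(word):
    """Decode a 16-bit AVR LDD opcode word into (rd, base, displacement).

    Assumed encoding: 10q0 qq0d dddd bqqq, b = 1 for Y, b = 0 for Z.
    """
    # Bits 15, 14, 12 and 9 are fixed at 1, 0, 0, 0 for LDD.
    if word & 0xD200 != 0x8000:
        raise ValueError(f"not an LDD opcode: {word:#06x}")
    rd = (word >> 4) & 0x1F                  # destination register number
    base = "Y" if word & 0x0008 else "Z"     # a single bit selects the base
    # The 6-bit displacement is scattered over bits 13, 11..10 and 2..0.
    q = ((word >> 8) & 0x20) | ((word >> 7) & 0x18) | (word & 0x07)
    return rd, base, q


# objdump prints the bytes in memory order; AVR opcodes are little-endian.
for raw in ("8881", "8981"):
    word = int.from_bytes(bytes.fromhex(raw), "little")
    rd, base, q = decode_ldd(word)
    print(f"{raw}: ldd r{rd}, {base}+{q}")
```
If that encoding is right, there is only one bit to choose the base register, so X cannot be represented in an `ldd` at all; whatever the assembler emits for `X+0` has to decode as either Y or Z.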
Strictly speaking, there is no problem here. The assembly is incorrect, so feeding it into an assembler is arguably UB. However, observing that it is incorrect is also trivial, thus an assembler may wish to reject it. I don't know LLVM's policy here, but I notice `llvm-project/llvm/test/CodeGen/AVR/inline-asm/inline-asm-invalid.ll` does exist, so there's that.
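For illustration only, the kind of check I have in mind could be sketched in Python as below. This is not how LLVM's AVR assembly parser is structured; `check_ldd` and its regular expression are hypothetical, and the 0..63 displacement range is my assumption from the 6-bit field in the encoding.
```python
import re

# Hypothetical, simplified operand syntax: "ldd rN, B+q".
_LDD_RE = re.compile(
    r"^\s*ldd\s+r(\d+)\s*,\s*([A-Za-z])\s*\+\s*(\d+)\s*$", re.IGNORECASE
)


def check_ldd(line):
    """Return (rd, base, displacement) or raise ValueError for a bad ldd."""
    m = _LDD_RE.match(line)
    if m is None:
        raise ValueError(f"malformed ldd: {line!r}")
    rd, base, disp = int(m.group(1)), m.group(2).upper(), int(m.group(3))
    if not 0 <= rd <= 31:
        raise ValueError(f"no such register r{rd}")
    if base not in ("Y", "Z"):
        # This is the diagnostic the issue is asking for.
        raise ValueError(f"ldd base must be Y or Z, got {base}")
    if not 0 <= disp <= 63:
        raise ValueError(f"ldd displacement {disp} out of range 0..63")
    return rd, base, disp


for text in ("ldd r24, Y+1", "ldd r24, X+0"):
    try:
        print(text, "->", check_ldd(text))
    except ValueError as err:
        print(text, "-> error:", err)
```
The point is only that the base-register check is a single comparison once the operand has been parsed, which is why rejecting this input seems cheap.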
</pre>