<table border="1" cellspacing="0" cellpadding="8">
    <tr>
        <th>Issue</th>
        <td>
            <a href="https://github.com/llvm/llvm-project/issues/70577">70577</a>
        </td>
    </tr>

    <tr>
        <th>Summary</th>
        <td>
            Incorrect code on AArch64 for call in a function with 'ghccc' calling convention
        </td>
    </tr>

    <tr>
      <th>Labels</th>
      <td>
            new issue
      </td>
    </tr>

    <tr>
      <th>Assignees</th>
      <td>
      </td>
    </tr>

    <tr>
      <th>Reporter</th>
      <td>
          waterlens
      </td>
    </tr>
</table>

<pre>
    The following program contains a call to `printf` in a function `f1` that uses the special calling convention `ghccc`, which has no callee-saved registers.
```llvm
target triple = "aarch64-unknown-linux-gnu"
; target triple = "x86_64-unknown-linux-gnu"

@.str = private unnamed_addr constant [6 x i8] c"test\0A\00", align 1

; Function Attrs: noinline nounwind uwtable
define dso_local ghccc void @f1() #0 {
entry:
  %call = call i32 (ptr, ...) @printf(ptr noundef nonnull dereferenceable(1) @.str)
  ret void
}

; Function Attrs: nofree nounwind
declare noundef i32 @printf(ptr nocapture noundef readonly, ...) local_unnamed_addr #2

; Function Attrs: nounwind uwtable
define dso_local i32 @main() local_unnamed_addr #3 {
entry:
  call ghccc void @f1()
  %call = call i32 (ptr, ...) @printf(ptr noundef nonnull dereferenceable(1) @.str)
  ret i32 0
}
```

Given these semantics, the expected output is:
```
test
test
```
In practice, with the latest version of LLVM, the program compiled for AArch64 prints `test\n` once and then enters an infinite loop.

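This happens because LLVM produces the following problematic assembly. The `x30` register is not saved before the inner call to `printf`, so `bl printf` overwrites it with the address of the `ret` in `f1`, and that `ret` then jumps to itself forever.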
```
f1:                                     // @f1
        adrp    x0, .L.str
        add     x0, x0, :lo12:.L.str
        bl      printf
        ret
main:                                   // @main
        stp     d15, d14, [sp, #-160]!          // 16-byte Folded Spill
        stp     d13, d12, [sp, #16]             // 16-byte Folded Spill
        stp     d11, d10, [sp, #32]             // 16-byte Folded Spill
        stp     d9, d8, [sp, #48]               // 16-byte Folded Spill
        stp     x29, x30, [sp, #64]             // 16-byte Folded Spill
        stp     x28, x27, [sp, #80]             // 16-byte Folded Spill
        stp     x26, x25, [sp, #96]             // 16-byte Folded Spill
        stp     x24, x23, [sp, #112]            // 16-byte Folded Spill
        stp     x22, x21, [sp, #128]            // 16-byte Folded Spill
        stp     x20, x19, [sp, #144]            // 16-byte Folded Spill
        bl      f1
        adrp    x0, .L.str
        add     x0, x0, :lo12:.L.str
        bl      printf
        ldp     x20, x19, [sp, #144]            // 16-byte Folded Reload
        mov     w0, wzr
        ldp     x22, x21, [sp, #128]            // 16-byte Folded Reload
        ldp     x24, x23, [sp, #112]            // 16-byte Folded Reload
        ldp     x26, x25, [sp, #96]             // 16-byte Folded Reload
        ldp     x28, x27, [sp, #80]             // 16-byte Folded Reload
        ldp     x29, x30, [sp, #64]             // 16-byte Folded Reload
        ldp     d9, d8, [sp, #48]               // 16-byte Folded Reload
        ldp     d11, d10, [sp, #32]             // 16-byte Folded Reload
        ldp     d13, d12, [sp, #16]             // 16-byte Folded Reload
        ldp     d15, d14, [sp], #160            // 16-byte Folded Reload
        ret
.L.str:
        .asciz "test\n"
```
Though AArch64 may not have a concept of caller-saved registers as such, I think it is necessary to emit code that saves the return address register (`x30`) in this case. A plausible use case: a special calling convention without callee-saved registers can speed up hot paths, but sometimes a hot path needs to call into a cold path that uses the usual calling convention.
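
For illustration, here is a hand-written sketch (not actual compiler output) of what `f1` could look like if `x30` were preserved around the call. The 16-byte stack slot and the choice of `str`/`ldr` with pre/post-indexing are my assumptions; LLVM might well pick a different frame layout.
```
f1:                                     // @f1 (hypothetical, hand-written)
        str     x30, [sp, #-16]!                // save the link register, keeping sp 16-byte aligned
        adrp    x0, .L.str
        add     x0, x0, :lo12:.L.str
        bl      printf                          // clobbers x30
        ldr     x30, [sp], #16                  // restore the link register
        ret                                     // returns to the caller in main
```
With the link register restored, the final `ret` returns to `main` instead of to itself.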

By the way, for the x86-64 target, the generated program behaves correctly.
</pre>