<table border="1" cellspacing="0" cellpadding="8">
    <tr>
        <th>Issue</th>
        <td>
            <a href="https://github.com/llvm/llvm-project/issues/55833">55833</a>
        </td>
    </tr>

    <tr>
        <th>Summary</th>
        <td>
            shift/zext-related miscompile by aarch64 backend
        </td>
    </tr>

    <tr>
      <th>Labels</th>
      <td>
            backend:AArch64,
            llvm:codegen,
            miscompilation
      </td>
    </tr>

    <tr>
      <th>Assignees</th>
      <td>
      </td>
    </tr>

    <tr>
      <th>Reporter</th>
      <td>
          regehr
      </td>
    </tr>
</table>

<pre>
    here's a function:
```llvm
define i64 @f(i64) {
  %2 = lshr i64 %0, 32
  %3 = shl i64 %2, 16
  %4 = trunc i64 %3 to i32
  %5 = ashr exact i32 %4, 16
  %6 = zext i32 %5 to i64
  ret i64 %6
}
```
alive2 and the x86-64 backend agree that `f(0x0000800000000000) -> 0x00000000ffff8000`

using this reasoning:
```
i64 %2 = #x0000000000008000 (32768)
i64 %3 = #x0000000080000000 (2147483648)
i32 %4 = #x80000000 (2147483648, -2147483648)
i32 %5 = #xffff8000 (4294934528, -32768)
i64 %6 = #x00000000ffff8000 (4294934528)
```
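
as a cross-check, the same steps can be sketched in portable C. this is only an illustrative model, not something from the original toolchain run: the name `f_model` is made up, and the arithmetic shift is spelled out on unsigned values so it doesn't depend on how a C compiler treats right shifts of negative numbers.

```c
#include <stdint.h>

/* hypothetical reference model of @f; the name f_model is an assumption */
uint64_t f_model(uint64_t x) {
    uint64_t v2 = x >> 32;        /* %2 = lshr i64 %0, 32 */
    uint64_t v3 = v2 << 16;       /* %3 = shl i64 %2, 16 */
    uint32_t v4 = (uint32_t)v3;   /* %4 = trunc i64 %3 to i32 */
    /* %5 = ashr exact i32 %4, 16
       copy the sign bit into the top 16 bits by hand. the low 16 bits of
       v4 are always zero (they come from the shl by 16), so no nonzero
       bits are shifted out and "exact" cannot produce poison here. */
    uint32_t v5 = (v4 >> 16) | ((v4 & 0x80000000u) ? 0xffff0000u : 0u);
    return (uint64_t)v5;          /* %6 = zext i32 %5 to i64 */
}
```

calling `f_model(0x0000800000000000ULL)` walks through the same intermediate values as the trace above, and the final cast to `uint64_t` is what forces the high 32 bits to zero.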

but separately, it's easy to see that since the last instruction is a zext from i32 to i64, the high 32 bits of the result must be zero (unless there's been a poison-related problem, but it doesn't look like there has been).

here's what we get from the aarch64 backend in llvm 14 and also top-of-tree:
```
Johns-MacBook-Pro:~ regehr$ ~/llvm-project/for-alive/bin/llc foo.ll -o -
        .section        __TEXT,__text,regular,pure_instructions
        .build_version macos, 12, 0
        .globl  _f                              ; -- Begin function f
        .p2align        2
_f:                                     ; @f
        .cfi_startproc
; %bb.0:
        sbfx    x0, x0, #32, #16
        ret
        .cfi_endproc
                                        ; -- End function
.subsections_via_symbols
Johns-MacBook-Pro:~ regehr$ 
```
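
for comparison, here is a minimal C sketch of what that single `sbfx x0, x0, #32, #16` computes, assuming the usual reading of the instruction: take the 16-bit field starting at bit 32 and sign-extend it to the full 64-bit register. the function name `sbfx_32_16` is made up for illustration.

```c
#include <stdint.h>

/* hypothetical model of `sbfx x0, x0, #32, #16` (name is an assumption) */
uint64_t sbfx_32_16(uint64_t x) {
    /* extract the 16-bit field at bits [47:32] */
    uint64_t field = (x >> 32) & 0xffffu;
    /* sign-extend from bit 15 of the field to all 64 bits */
    return (field & 0x8000u) ? (field | 0xffffffffffff0000ULL) : field;
}
```

the sign extension here covers all 64 bits, whereas the IR only sign-extends within an i32 and then zero-extends, so the two disagree whenever bit 47 of the input is set.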

calling the compiled function gives a result with all of the high 32 bits set, instead of the expected `ffff8000`:
```
Johns-MacBook-Pro:~ regehr$ cat foo.c
#include <stdio.h>

unsigned long f(unsigned long);

int main(void) {
  printf("%lx\n", f(0x0000800000000000ULL));
}
Johns-MacBook-Pro:~ regehr$ ~/llvm-project/for-alive/bin/llc foo.ll && clang foo.c foo.s && ./a.out
ffffffffffff8000
Johns-MacBook-Pro:~ regehr$ 
```

cc @ornata @nunoplopes @ryan-berger @nbushehri @zhengyang92 @aqjune @Hatsunespica
</pre>