[llvm] [ARM][Codegen] Fix vector data miscompilation in arm32be (PR #105519)

Mon Sep 2 08:08:39 PDT 2024

Zhenhang1213 wrote:

> Hi - I think all the changes you have so far are good, but some of the tests are still wrong and it would be good to fix PerformBITCASTCombine too.
> 
> If you consider the xor_int64_ff0000ff0000ffff case:
> 
> ```
> define arm_aapcs_vfpcc <2 x i64> @xor_int64_ff0000ff0000ffff(<2 x i64> %a) {
> entry:
>   %b = xor <2 x i64> %a, <i64 -72056498821201921, i64 -72056498821201921>
>   ret <2 x i64> %b
> }
> ```
> 
> A series of transforms, which I believe are all OK, lead to:
> 
> ```
>     t3: v2i64 = bitcast t2
>         t16: v2i64 = ARMISD::VMOVIMM TargetConstant:i32<7737>
>       t17: v4i32 = ARMISD::VECTOR_REG_CAST t16
>     t14: v2i64 = bitcast t17
>   t6: v2i64 = xor t3, t14
> t7: v2f64 = bitcast t6
> ```
> 
> The `bitcast ( VECTOR_REG_CAST ( VMOVIMM ))` is converted to `VECTOR_REG_CAST ( VMOVIMM )` as it is believed that the bitcast does not matter as each lane of the input will be identical. It only looks at the bitcast dst type and the vmovimm type, ignoring that the VECTOR_REG_CAST makes the bitcast important.
> 
> It ends up with this, that has the top/bottom half of each i64 in the wrong order.
> 
> ```
>    t16: v2i64 = ARMISD::VMOVIMM TargetConstant:i32<7737>
>  t6: v2i64 = xor t3, t16
> 
>   vmov.i64        q8, #0xffffff0000ff
>   veor    q0, q0, q8
> ```

I have modified the tests. Is that OK?
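
As a sanity check on the quoted analysis, here is a minimal Python sketch of the arithmetic only. It does not model LLVM or the DAG combine itself; the helper names (`to_u64`, `swap_i32_halves`) are hypothetical and exist just for this illustration. It shows that the immediate in the quoted `vmov.i64` is the intended xor constant with the two 32-bit halves of each i64 exchanged.

```python
MASK64 = (1 << 64) - 1
MASK32 = (1 << 32) - 1


def to_u64(value):
    """Reinterpret a (possibly negative) Python int as an unsigned 64-bit value."""
    return value & MASK64


def swap_i32_halves(value):
    """Exchange the top and bottom 32-bit halves of a 64-bit value."""
    value = to_u64(value)
    high = value >> 32
    low = value & MASK32
    return (low << 32) | high


# Constant from the IR test case (i64 -72056498821201921).
INTENDED = to_u64(-72056498821201921)

# Immediate from the quoted `vmov.i64 q8, #0xffffff0000ff`.
EMITTED = 0xFFFFFF0000FF

if __name__ == "__main__":
    print(hex(INTENDED))                          # 0xff0000ff0000ffff
    print(hex(swap_i32_halves(INTENDED)))         # 0xffffff0000ff
    print(swap_i32_halves(INTENDED) == EMITTED)   # True
```

The half-swap is what one would expect if a bitcast between v4i32 and v2i64 is dropped on a big-endian target, which matches the description in the quoted comment.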

https://github.com/llvm/llvm-project/pull/105519