[llvm] [ARM][Codegen] Fix vector data miscompilation in arm32be (PR #105519)

Thu Sep 5 06:52:32 PDT 2024

Zhenhang1213 wrote:

> Sorry I haven't had a chance to look into this in detail yet. I wasn't really sure when I last looked what the best fix for that part was - I was thinking it might be worth just removing the `while (Src.getOpcode() == ARMISD::VECTOR_REG_CAST) Src = Src.getOperand(0);` if there isn't a better idea. As far as I understand, the part for bitcasts is always correct (as the vmov element size is smaller, swapping around the lanes in the larger size doesn't alter the result).
> 
> If we want to keep that optimization, then IMO it might be good to have a combine that works a little differently - one that looks through `bitcast(vector_reg_cast(vmovimm))`, reconstructs the constant as it should be and generates a new vmovimm with the correct value. It should handle more cases. In the meantime, so long as everyone is happy enough with the code being a bit slower, we can just drop the while loop.

Hi, yesterday I tried making the condition more restrictive by changing

```
SrcVT.getScalarSizeInBits() <= DstVT.getScalarSizeInBits()
```

to 
```
SrcVT.getScalarSizeInBits() < DstVT.getScalarSizeInBits()
```
With that change, the output for the `xor_int64_ff0000ff0000ffff` case looks right, doesn't it?

```
; CHECKBE-LABEL: xor_int64_ff0000ff0000ffff:
; CHECKBE:       @ %bb.0: @ %entry
; CHECKBE-NEXT:    vmov.i64 q1, #0xffffff0000ff
; CHECKBE-NEXT:    vrev64.32 q2, q1
; CHECKBE-NEXT:    veor q0, q0, q2
; CHECKBE-NEXT:    bx lr
```
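As a quick sanity check of the constant in the CHECKBE lines above, here is a minimal sketch in Python. It assumes a simplified, value-level model in which `vrev64.32` just swaps the two 32-bit elements inside each 64-bit lane; it does not model how big-endian lanes are laid out in the register. The helper name `vrev64_32` is hypothetical and only used for this illustration.

```python
MASK32 = 0xFFFFFFFF


def vrev64_32(lane64: int) -> int:
    """Swap the two 32-bit elements of one 64-bit lane (value-level model only)."""
    lo = lane64 & MASK32
    hi = (lane64 >> 32) & MASK32
    return (lo << 32) | hi


# Immediate taken from the `vmov.i64 q1, #0xffffff0000ff` CHECKBE line.
imm = 0x0000FFFFFF0000FF

# Each byte of a vmov.i64 immediate has to be either 0x00 or 0xff.
assert all(((imm >> (8 * i)) & 0xFF) in (0x00, 0xFF) for i in range(8))

swapped = vrev64_32(imm)
print(hex(swapped))  # 0xff0000ff0000ffff
```

Under that assumption, the value fed into `veor` is `0xff0000ff0000ffff` in each 64-bit lane, which is the constant named by the test.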

https://github.com/llvm/llvm-project/pull/105519