[PATCH] D159267: [AArch64] Remove copy instruction between uaddlv and dup
Dave Green via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Thu Sep 7 23:00:49 PDT 2023
dmgreen added a comment.
In D159267#4641051 <https://reviews.llvm.org/D159267#4641051>, @efriedma wrote:
> re: the big-endian stuff I mentioned on the other ticket... it looks like it isn't a regression, but my concern is the code generated for ctpop_i32 for a big-endian target. uaddlv v16i8 produces a result in h0 (element 0 of an 8 x i16), but we then access it as s0 (element 0 of a 4 x i32) without a bitcast. So I think the bits end up in the wrong place?
I think it's the other way around (hopefully I have this right; big-endian can be confusing). A bitcast would swap the lane indices, because it acts as a store followed by a load. Without one, lane 0 is the lowest lane in both LLVM IR and the NEON registers, so accessing h0 as s0 without a bitcast should leave the bits in the right place.
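As a rough illustration of the store-then-load point, here is a small Python model. It is only a sketch, not LLVM code: the function names are made up for this example, the 128-bit register is modelled as a plain integer with lane 0 in the lowest bits, and the uaddlv result is assumed to be the value 5 in h0 with all other lanes zero.

```python
import struct


def bitcast_v8i16_to_v4i32(lanes, big_endian):
    """Model a vector bitcast as a store of 8 x i16 followed by a load of 4 x i32.

    Hypothetical helper: element 0 goes to the lowest address, and each
    element is written in the target's byte order.
    """
    order = ">" if big_endian else "<"
    memory = struct.pack(order + "8H", *lanes)
    return list(struct.unpack(order + "4I", memory))


def reinterpret_register(lanes):
    """Model reading the same register with wider lanes and no bitcast.

    Hypothetical helper: lane 0 is the lowest 16 bits of the register,
    independent of endianness.
    """
    value = 0
    for index, lane in enumerate(lanes):
        value |= lane << (16 * index)
    return [(value >> (32 * index)) & 0xFFFFFFFF for index in range(4)]


# Assumed uaddlv result: 5 in h0, every other lane zero.
h_lanes = [5, 0, 0, 0, 0, 0, 0, 0]

print(reinterpret_register(h_lanes)[0])                 # 5
print(bitcast_v8i16_to_v4i32(h_lanes, False)[0])        # 5 (little-endian)
print(hex(bitcast_v8i16_to_v4i32(h_lanes, True)[0]))    # 0x50000 (big-endian)
```

In this model the big-endian bitcast moves h0 into the upper half of s0, which is the lane swap described above, while the direct reinterpretation keeps the value in the low bits.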
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D159267/new/
https://reviews.llvm.org/D159267