[Diffusion] rG96d3c82645cf: Revert "[SROA] `isVectorPromotionViable()`: memory intrinsics operate on…
Roman Lebedev via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Mon Dec 19 16:08:42 PST 2022
lebedev.ri added a subscriber: llvm-commits.
lebedev.ri added a comment.
Thank you for taking a look!
In rG96d3c82645cf41a38543c5128cc15cda5761a76a#1157229 <https://reviews.llvm.org/rG96d3c82645cf41a38543c5128cc15cda5761a76a#1157229>, @nemanjai wrote:
> This seems to be a bug in the SDAG legalizer that is specific to big endian systems.
> We type-legalize the following SDAG:
>
> t5: i32,ch = load<(load (s32) from %ir.13, !tbaa !17)> t0, t4, undef:i64
> t6: i16 = truncate t5
> t8: i16 = add t6, Constant:i16<1>
> t9: v2i8 = bitcast t8
>
> As follows:
>
> Promote integer result: t8: i16 = add t6, Constant:i16<1>
> Creating new node: t55: i32 = add t5, Constant:i32<1>
> Widen node result 0: t9: v2i8 = bitcast t8
> Creating new node: t56: v4i32 = scalar_to_vector t55
>
> To produce:
>
> t5: i32,ch = load<(load (s32) from %ir.13, !tbaa !17)> t0, t4, undef:i64
> t55: i32 = add t5, Constant:i32<1>
> t56: v4i32 = scalar_to_vector t55
> t57: v16i8 = bitcast t56
>
> On a little-endian system, this isn't a problem: the two lowest-order bytes of the load end up in elements 0 and 1 of the `v16i8`. On a big-endian system, however, the two lowest-order bytes end up in elements 2 and 3 of the `v16i8`, while the subsequent uses still read elements 0 and 1. So the legalizer needs to do one of the following:
>
> - Shift the `i32` left by 16 bits
> - Produce a `v8i16` vector from the `i32`
> - Add a shuffle after the `scalar_to_vector` to put the bytes into the right place
>
> I think the lowest-impact solution is to produce a `v8i16` vector, so I'll see if I can implement that.
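
The byte placement described in the quoted analysis can be sketched with a small Python model. This is only an illustration, not LLVM code: it assumes a vector bitcast behaves like a reinterpretation of the bytes in memory order, it zero-fills the lanes that `scalar_to_vector` actually leaves undefined, and the helper names (`bitcast_to_bytes`, `low16_offset`) and the sample value are made up for the example.

```python
# Illustrative model only; not LLVM code. All names here are hypothetical.

def bitcast_to_bytes(lanes, lane_bytes, byteorder, total_bytes=16):
    """Model a bitcast to v16i8 as a memory-order reinterpretation.

    `lanes` holds the defined leading lanes. The remaining lanes are really
    undefined after scalar_to_vector; they are zero-filled here for display.
    """
    data = b"".join(lane.to_bytes(lane_bytes, byteorder) for lane in lanes)
    return data + bytes(total_bytes - len(data))


def low16_offset(vec, value, byteorder):
    """Return the v16i8 element index where the low 16 bits of `value` start."""
    return vec.index((value & 0xFFFF).to_bytes(2, byteorder))


# Stand-in for t55: the promoted i32 whose low 16 bits are the original i16.
x = 0x11223344

# Current legalizer output: v4i32 scalar_to_vector followed by a bitcast.
le = bitcast_to_bytes([x], 4, "little")
be = bitcast_to_bytes([x], 4, "big")

# Option 1: shift the i32 left by 16 bits before scalar_to_vector.
be_shifted = bitcast_to_bytes([(x << 16) & 0xFFFFFFFF], 4, "big")

# Option 2: produce a v8i16 (lane 0 holds the 16-bit value) instead of a v4i32.
be_v8i16 = bitcast_to_bytes([x & 0xFFFF], 2, "big")

# Option 3: shuffle after scalar_to_vector so bytes 2, 3 move to elements 0, 1.
mask = [2, 3, 0, 1] + list(range(4, 16))
be_shuffled = bytes(be[i] for i in mask)

cases = [
    ("little-endian, v4i32", le, "little"),
    ("big-endian, v4i32", be, "big"),
    ("big-endian, shifted", be_shifted, "big"),
    ("big-endian, v8i16", be_v8i16, "big"),
    ("big-endian, shuffled", be_shuffled, "big"),
]
for name, vec, order in cases:
    print(f"{name}: low 16 bits start at element {low16_offset(vec, x, order)}")
```

Under these assumptions, the low 16 bits start at element 0 in the little-endian case and at element 2 in the big-endian `v4i32` case, and each of the three proposed alternatives moves them back to element 0.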
BRANCHES
main
Users:
lebedev.ri (Author)
https://reviews.llvm.org/rG96d3c82645cf