[PATCH] D138874: [InstCombine] canonicalize trunc + insert as bitcast + shuffle, part 3
Sanjay Patel via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Wed Dec 7 09:01:58 PST 2022
spatel added a comment.
In D138874#3970326 <https://reviews.llvm.org/D138874#3970326>, @RKSimon wrote:
> Also, which option will make it easier to address the remaining missing GVN handling?
Having it in InstCombine will definitely be easier than VectorCombine with respect to phase ordering/dependencies on other passes.
In the motivating example, we don't get the folding opportunity until late because it requires inlining to see the pattern. That means we wouldn't do this until near the very end of optimization (and so no subsequent GVN).
I wasn't aware of the SDAG shuffle problems that @dmgreen noted for Thumb/MVE, so I was looking at that a bit closer. Even without this patch, we've already uncovered some awful codegen with the earlier folds like:
define <8 x i16> @low_index_longer_length_poison_basevec_i64(i64 %x) {
%t = trunc i64 %x to i16
%r = insertelement <8 x i16> poison, i16 %t, i64 0
ret <8 x i16> %r
}
$ llc -o - -mtriple=thumbv8.1-m.main -mattr=+mve.fp -float-abi=hard
vmov.16 q0[0], r0
-->
define <8 x i16> @low_index_longer_length_poison_basevec_i64(i64 %x) {
%vec.x = bitcast i64 %x to <4 x i16>
%r = shufflevector <4 x i16> %vec.x, <4 x i16> poison, <8 x i32> <i32 0, i32 undef, i32 undef, i32 undef, i32 undef, i32 undef, i32 undef, i32 undef>
ret <8 x i16> %r
}
sub sp, #8
strd r0, r1, [sp]
mov r0, sp
vldrh.u32 q0, [r0]
vmov r2, r3, d0
vmov r0, r1, d1
vmov.16 q0[0], r2
vmov.16 q0[1], r3
vmov.16 q0[2], r0
vmov.16 q0[3], r1
add sp, #8
The reason for that is what seems like a bug in SelectionDAGBuilder. It creates these nodes for the bitcast + shuffle sequence:
Creating new node: t5: i64 = build_pair t2, t4
Creating new node: t6: v4i16 = bitcast t5
Creating new node: t7: v4i16 = undef
Creating new node: t8: v8i16 = concat_vectors t6, undef:v4i16
But that's discarding information - the upper 48-bits of the build_pair are zapped to undef by the shuffle in IR, but that's gone with the translation to concat_vectors.
I'll try to fix that.
The good news is that potential regressions like above have been in main for almost a week now, and I haven't seen any bug reports/complaints yet.
So maybe this kind of IR pattern doesn't happen much in real code where it would be noticed.
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D138874/new/
https://reviews.llvm.org/D138874
More information about the llvm-commits
mailing list