[PATCH] D138874: [InstCombine] canonicalize trunc + insert as bitcast + shuffle, part 3
Dave Green via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Thu Dec 1 09:07:36 PST 2022
dmgreen added a comment.
> I tried pushing a couple of tests through AArch64 codegen, and see diffs like this:
>
> lsr x8, x0, #48
> mov v0.h[3], w8
> ->
> fmov d1, x0
> mov v0.h[3], v1.h[3]
>
> Does that seem neutral? If not, we could try harder to fold back to an insertelt in codegen or convert to a target-dependent transform in VectorCombine instead of a generic fold here.
That would come down to the difference between shift (cheap) and lane mov (should be cheapish too). I don't think there's a lot in it.
https://godbolt.org/z/haP87afo9 has some other cases from the tests here. bitcast can be awkward if it secretly includes an extend, which is more difficult than it should be for MVE, where most vectors are assumed to be 128-bit. We've had problems in the past with instcombine transforming shuffles where it isn't helpful, and I think we still have some. Like I said, I don't want to block anything, but this doesn't seem very general, and it might be better handled in the backend or cost modelled. (I'm not sure we have sensible costs for bitcasts, though. They don't often come up from the vectorizers.)
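For reference, the two forms compared in the quoted codegen diff compute the same value. The following is a minimal sketch in Python, not the patch's implementation: it assumes a little-endian lane layout and a <4 x i16> destination vector, and the helper names (insert_via_shift, insert_via_bitcast_shuffle) are hypothetical, chosen only for illustration.

```python
import struct

def insert_via_shift(vec, x):
    """Model of: insertelement vec, trunc(lshr x, 48) to i16, lane 3."""
    out = list(vec)
    out[3] = (x >> 48) & 0xFFFF  # shift, then truncate to 16 bits
    return out

def insert_via_bitcast_shuffle(vec, x):
    """Model of: shufflevector vec, bitcast(x to <4 x i16>), <0, 1, 2, 7>.

    Assumption: little-endian layout, so lane 3 of the bitcast holds the
    top 16 bits of x.
    """
    lanes = struct.unpack("<4H", struct.pack("<Q", x))
    both = list(vec) + list(lanes)   # indices 0-3: vec, 4-7: bitcast lanes
    mask = [0, 1, 2, 7]              # keep vec lanes 0-2, take bitcast lane 3
    return [both[i] for i in mask]

vec = [1, 2, 3, 4]
x = 0x1122334455667788
assert insert_via_shift(vec, x) == insert_via_bitcast_shuffle(vec, x)
```

Whether the shift or the lane move is cheaper on a given target is a separate question that this sketch does not address.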
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D138874/new/
https://reviews.llvm.org/D138874