[PATCH] D111960: [X86][AVX] Prefer VINSERTF128 over VPERM2F128 for 128->256 subvector concatenations

Pengfei Wang via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Sun Oct 17 18:10:35 PDT 2021


pengfei added inline comments.


================
Comment at: llvm/test/CodeGen/X86/avx512-shuffles/partial_permute.ll:4419-4422
+; CHECK-FAST-NEXT:    vmovapd 32(%rdi), %ymm0
+; CHECK-FAST-NEXT:    vinsertf128 $1, (%rdi), %ymm0, %ymm1
+; CHECK-FAST-NEXT:    vmovapd {{.*#+}} ymm0 = [0,6,3,4]
+; CHECK-FAST-NEXT:    vpermi2pd (%rdi), %ymm1, %ymm0
----------------
How about this one? It seems we now have one more `vinsertf128` than before.
https://simd.godbolt.org/z/4MMs99E7K


================
Comment at: llvm/test/CodeGen/X86/pr50823.ll:11-13
+; CHECK-NEXT:    vmovups (%rsi), %ymm0
+; CHECK-NEXT:    vinsertf128 $1, 32(%rsi), %ymm0, %ymm0
+; CHECK-NEXT:    vhaddps %ymm0, %ymm0, %ymm0
----------------
lebedev.ri wrote:
> RKSimon wrote:
> > lebedev.ri wrote:
> > > RKSimon wrote:
> > > > pengfei wrote:
> > > > > Is this a regression?
> > > > I don't believe so: https://simd.godbolt.org/z/rhrqsss5a - as I said in the summary, vinsertX128 tends to be cheaper than more general cross-lane shuffles.
> > > We were loading 128 bits, and then fold-loading 128 more bits,
> > > and now we load 256 bits, and then fold-load the high 128 bits we just loaded, no?
> > > The `vinsertf128` should be dropped because it is a no-op.
> > Sorry - brain fog - the offset is +32 bytes, not +16 bytes - so isn't that loading the lower half of the next <8 x float>?
> Err, right. But still, the first load is wider now.
> Is that intentional to break the false dep,
> or is this a demandedelts failure?
> IIRC demandedelts generally doesn't want to narrow the loads,
> so perhaps the fix should lie in not forming this YMM load in the first place.
I understand it now, thank you.
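
For anyone else tripped up by the offset, here is a minimal Python sketch of the two CHECK lines above, modelled on lists of element indices. The helpers `load` and `vinsertf128` are illustrative models written for this note (not LLVM or intrinsic APIs), and memory is assumed to hold 32-bit floats, so 32 bytes is 8 elements.

```python
# Lane-level model of the pr50823.ll sequence. Registers are lists of
# 32-bit float lanes; memory holds element indices so the result shows
# where each lane came from. All helper names here are hypothetical.
FLOAT_BYTES = 4
LANES_128 = 4  # floats per 128-bit half


def load(mem, byte_offset, nlanes):
    """Load nlanes consecutive floats starting at byte_offset."""
    start = byte_offset // FLOAT_BYTES
    return mem[start:start + nlanes]


def vinsertf128(imm, src128, dst256):
    """Return dst256 with its low (imm=0) or high (imm=1) half replaced."""
    out = list(dst256)
    lo = (imm & 1) * LANES_128
    out[lo:lo + LANES_128] = src128
    return out


mem = list(range(16))  # two consecutive <8 x float> values

# vmovups (%rsi), %ymm0
ymm0 = load(mem, 0, 8)
# vinsertf128 $1, 32(%rsi), %ymm0, %ymm0
ymm0 = vinsertf128(1, load(mem, 32, 4), ymm0)
print(ymm0)  # [0, 1, 2, 3, 8, 9, 10, 11]

# Hypothetical +16 byte variant: the insert would reload lanes 4..7,
# which is the no-op case discussed in the quoted thread.
noop = vinsertf128(1, load(mem, 16, 4), load(mem, 0, 8))
print(noop)  # [0, 1, 2, 3, 4, 5, 6, 7]
```

With the +32 offset, the upper half comes from the lower half of the next <8 x float>, so the insert is not redundant. Lanes 4..7 of the initial 256-bit load are still overwritten, which is the "first load is wider now" point.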


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D111960/new/

https://reviews.llvm.org/D111960


