[PATCH] D28537: [X86][AVX512] Add support for ASHR v2i64/v4i64 support without VLX

Simon Pilgrim via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Jan 12 07:29:33 PST 2017


RKSimon added inline comments.


================
Comment at: test/CodeGen/X86/avx512-cvt.ll:943
 ; KNL-NEXT:    vcmpltps %xmm0, %xmm1, %xmm0
-; KNL-NEXT:    vinsertps {{.*#+}} xmm0 = xmm0[0,1],zero,xmm0[1]
+; KNL-NEXT:    vpmovzxdq {{.*#+}} xmm0 = xmm0[0],zero,xmm0[1],zero
+; KNL-NEXT:    vpsllq $32, %xmm0, %xmm0
----------------
craig.topper wrote:
> craig.topper wrote:
> > delena wrote:
> > > I think that this issue should be resolved, or at least understood prior to commit.
> > FYI, this test case does something really terrible when avx512vl is enabled without avx512dq. It gets completely scalarized due to lack of vpmovm2d.
> So what's happening here is that the setcc from the cmp gets its result type converted from v2i1 to v2i32 then to v2i64. Then the operands get legalized from v2i32 to v4i32 and we end up with (v2i64 sign_extend (v2i32 extract_subvector (v4i32 setcc))). The sign_extend becomes a sign_extend_inreg that is now implemented with a v2i64 vshli(32) and vsrai(32). Previously because v2i64 vsrai wasn't legal we lowered it to a v4i32 vsrai(31) and a shuffle. There was another shuffle later that wanted elements from this shuffle that didn't come from the VSRAI so it got removed. And I think the vshli(32) was able to get combined with other shuffles to produce the INSERTPS.
> 
> So the main issue here is that the setcc legalizing for this produces a v2i64 type that we don't need and aren't able to recover from very well.
Thanks for the analysis, Craig. It seems to be a general problem, and we're just lucky that with current codegen it only costs us an insertps and nothing more extravagant. I'm sure there are plenty of other examples with other non-legal types that we just haven't looked at.

This might turn out to be a massive yak shaving issue, so in the meantime I've created D28604 so that we can at least support variable shifts.
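
For anyone following along, the two sign_extend_inreg lowerings described in the quoted analysis can be modelled roughly as below. This is only a scalar sketch of the arithmetic, not LLVM code: the names sext_inreg_shl_sra, ashr64_by_32_via_v4i32 and V4i32 are made up for illustration, and the shuffle mask is my assumption of what the old lowering amounts to on little-endian dword lanes.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical model of one v2i64 lane: sign-extend the low 32 bits in place.
// Corresponds to (vsrai (vshli x, 32), 32) on a 64-bit element.
// Note: right-shifting a negative signed value is arithmetic on all mainstream
// compilers and is guaranteed to be arithmetic from C++20 onwards.
inline int64_t sext_inreg_shl_sra(uint64_t lane) {
    uint64_t shifted = lane << 32;                // vshli(32): low dword moves to the high dword
    return static_cast<int64_t>(shifted) >> 32;   // vsrai(32): arithmetic shift back down
}

// A v2i64 register viewed as four 32-bit dwords (little-endian lane order).
using V4i32 = std::array<int32_t, 4>;

// Hypothetical model of the older lowering: a 64-bit arithmetic shift right by 32
// built from a v4i32 vsrai(31) plus a shuffle, for targets without a legal v2i64 vsrai.
inline V4i32 ashr64_by_32_via_v4i32(const V4i32 &in) {
    V4i32 sign{};
    for (std::size_t i = 0; i < in.size(); ++i)
        sign[i] = in[i] >> 31;                    // v4i32 vsrai(31): splat each dword's sign bit
    // Shuffle: the low dword of each result lane is the old high dword,
    // the high dword of each result lane is that dword's sign splat.
    return {in[1], sign[1], in[3], sign[3]};
}
```

In the shuffle form, only the odd dwords of the vsrai(31) result are ever used, which is consistent with the later shuffle being able to drop it when it only wants the other elements.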


Repository:
  rL LLVM

https://reviews.llvm.org/D28537




