[PATCH] D136046: [X86] `DAGTypeLegalizer::ModifyToType()`: when widening w/ zeros, insert into undef and `and`-mask the padding away

Mon Oct 17 07:27:14 PDT 2022

pengfei added inline comments.

================
Comment at: llvm/test/CodeGen/X86/masked_store.ll:6214-6224
 ; AVX512F-NEXT:    vmovdqa64 (%rsi), %zmm0
 ; AVX512F-NEXT:    vmovdqa64 64(%rsi), %zmm1
 ; AVX512F-NEXT:    vpxor %xmm2, %xmm2, %xmm2
-; AVX512F-NEXT:    vpcmpgtd 64(%rdi), %zmm2, %k0
-; AVX512F-NEXT:    movw $85, %ax
-; AVX512F-NEXT:    kmovw %eax, %k1
-; AVX512F-NEXT:    kandw %k1, %k0, %k0
-; AVX512F-NEXT:    kshiftlw $8, %k0, %k0
-; AVX512F-NEXT:    kshiftrw $8, %k0, %k1
 ; AVX512F-NEXT:    movw $21845, %ax ## imm = 0x5555
+; AVX512F-NEXT:    kmovw %eax, %k1
+; AVX512F-NEXT:    vpcmpgtd (%rdi), %zmm2, %k1 {%k1}
+; AVX512F-NEXT:    movw $85, %ax
----------------
lebedev.ri wrote:
> pengfei wrote:
> > The transform looks correct to me.
> > Just a tangential question: I don't see any alignment info on the pointers, so why are we using `vmovdqa64` for `%trigger.ptr` but `vmovdqu32` for `dst`?
> > I ask because we cannot load the garbage data into `zmm1` if the pointer is not aligned, since that may cause a memory fault.
> Because the plain loads ended up being `align 128` (by default),
> while I explicitly specified an alignment of `i32 immarg 1` for `@llvm.masked.store`.
Thanks @lebedev.ri for the explanation. I can see the `@llvm.masked.store` now, but I am still not sure why `align 128` generates `vmovdqa64`. Should it require `align 512`?
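
For reference, the alignment arithmetic behind this question can be sketched as follows. This is only an illustration, not code from the patch: it assumes that LLVM IR `align N` is expressed in bytes and that an aligned 512-bit move such as `vmovdqa64` needs an address that is a multiple of the vector width in bytes. The helper names are hypothetical.

```python
# Hypothetical helpers; not part of LLVM or of this patch.
# Assumption: IR `align N` is in bytes, while ZMM width is usually quoted in bits.

ZMM_BITS = 512  # width of a ZMM register in bits


def required_alignment_bytes(vector_bits: int) -> int:
    """Natural alignment, in bytes, for a vector of the given bit width."""
    return vector_bits // 8


def aligned_move_is_legal(ir_align_bytes: int, vector_bits: int = ZMM_BITS) -> bool:
    """True if the IR alignment (bytes) guarantees a legal aligned vector move."""
    need = required_alignment_bytes(vector_bits)
    # The address must be a multiple of the required alignment.
    return ir_align_bytes % need == 0


if __name__ == "__main__":
    print(required_alignment_bytes(ZMM_BITS))  # bytes needed for a ZMM aligned move
    print(aligned_move_is_legal(128))          # the plain loads (`align 128`)
    print(aligned_move_is_legal(1))            # the masked store (`i32 immarg 1`)
```

Under those assumptions, `align 128` would already cover the 64 bytes a ZMM aligned move needs, while an alignment of 1 would force the unaligned `vmovdqu32` form.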

Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D136046