[PATCH] D140069: [DAGCombiner] Scalarize vectorized loads that are splatted

Luke Lau via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Dec 15 07:35:55 PST 2022


luke added inline comments.


================
Comment at: llvm/test/CodeGen/AMDGPU/vector_shuffle.packed.ll:138-142
+; GFX9-NEXT:    global_load_ushort v0, v[0:1], off offset:6
+; GFX9-NEXT:    s_mov_b32 s4, 0x5040100
 ; GFX9-NEXT:    s_waitcnt vmcnt(0)
+; GFX9-NEXT:    v_perm_b32 v0, v0, v0, s4
 ; GFX9-NEXT:    s_setpc_b64 s[30:31]
----------------
I presume this is a regression: even though it loads a smaller size, it has to do more bit twiddling afterwards.
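
To make the extra twiddling concrete, here is a rough Python model of what `v_perm_b32 v0, v0, v0, s4` does with the selector `0x5040100`. It is only a sketch, assuming the usual byte-select semantics (selector values 0-3 pick bytes of the second source, 4-7 pick bytes of the first). The helper name `v_perm_b32` below is a hypothetical Python function, not an LLVM API, and selector values above 7 are deliberately not modeled.

```python
def v_perm_b32(src0, src1, selector):
    """Sketch of a byte permute over the 64-bit value {src0, src1}.

    Assumption: selector bytes 0-3 choose bytes of src1 and 4-7 choose
    bytes of src0. Other selector values are not modeled here.
    """
    combined = ((src0 & 0xFFFFFFFF) << 32) | (src1 & 0xFFFFFFFF)
    result = 0
    for i in range(4):
        sel = (selector >> (8 * i)) & 0xFF
        if sel > 7:
            raise NotImplementedError("selector values above 7 are not modeled")
        byte = (combined >> (8 * sel)) & 0xFF
        result |= byte << (8 * i)
    return result


# The 16-bit value loaded by global_load_ushort sits in the low half of v0.
loaded = 0x0000BEEF
splat = v_perm_b32(loaded, loaded, 0x05040100)
```

Under these assumptions the selector copies the low 16 bits of the register into both halves of the result, which is the splat that the shuffle in the test asks for.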


================
Comment at: llvm/test/CodeGen/PowerPC/canonical-merge-shuffles.ll:1149-1162
+; P8-AIX-32-LABEL: testSplati64_1:
+; P8-AIX-32:       # %bb.0: # %entry
+; P8-AIX-32-NEXT:    lwz r4, L..C4(r2) # %const.0
+; P8-AIX-32-NEXT:    lwz r5, 12(r3)
+; P8-AIX-32-NEXT:    lwz r3, 8(r3)
+; P8-AIX-32-NEXT:    stw r5, -16(r1)
+; P8-AIX-32-NEXT:    stw r3, -32(r1)
----------------
These extra lines replace the old `P8-AIX`-prefixed checks that must have been left behind.


================
Comment at: llvm/test/CodeGen/X86/half.ll:1342-1344
+; BWON-F16C-NEXT:    vpinsrw $0, 8(%rdi), %xmm0, %xmm0
+; BWON-F16C-NEXT:    vpshuflw {{.*#+}} xmm0 = xmm0[0,0,0,0,4,5,6,7]
+; BWON-F16C-NEXT:    vpshufd {{.*#+}} xmm0 = xmm0[0,0,0,0]
----------------
@pengfei This looks like a regression: the scalarized load t18 gets selected as `VPINSRWrm`.

```
  t0: ch,glue = EntryToken
                    t2: i64,ch = CopyFromReg t0, Register:i64 %0
                  t17: i64 = add t2, Constant:i64<8>
                t18: f16,ch = load<(load (s16) from %ir.p + 8, align 8)> t0, t17, undef:i64
              t21: v8f16 = scalar_to_vector t18
            t23: v8i16 = bitcast t21
          t28: v8i16 = X86ISD::PSHUFLW t23, TargetConstant:i8<0>
        t29: v4i32 = bitcast t28
      t30: v4i32 = X86ISD::PSHUFD t29, TargetConstant:i8<0>
    t36: v8f16 = bitcast t30
  t10: ch,glue = CopyToReg t0, Register:v8f16 $xmm0, t36
  t11: ch = X86ISD::RET_FLAG t10, TargetConstant:i32<0>, Register:v8f16 $xmm0, t10:1
```
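
For reference, the three instructions in the new check lines can be modeled on a vector of eight 16-bit lanes. This is a minimal Python sketch, assuming the standard 2-bit-per-lane immediate encoding for `PSHUFLW` and `PSHUFD`; the function names are hypothetical helpers for illustration only.

```python
def pinsrw(vec, word, index):
    """Insert a 16-bit word into lane `index` of an 8-lane vector."""
    out = list(vec)
    out[index] = word
    return out


def pshuflw(vec, imm):
    """Shuffle the low four words by a 2-bit-per-lane immediate; keep the high four."""
    low = [vec[(imm >> (2 * i)) & 3] for i in range(4)]
    return low + list(vec[4:])


def pshufd(vec, imm):
    """Shuffle the four 32-bit lanes (pairs of words) by a 2-bit-per-lane immediate."""
    dwords = [vec[2 * i:2 * i + 2] for i in range(4)]
    out = []
    for i in range(4):
        out += dwords[(imm >> (2 * i)) & 3]
    return out


def splat_via_shuffles(old_xmm, loaded_word):
    """Mirror the sequence vpinsrw $0 -> vpshuflw [0,0,0,0,4,5,6,7] -> vpshufd [0,0,0,0]."""
    v = pinsrw(old_xmm, loaded_word, 0)
    v = pshuflw(v, 0)
    return pshufd(v, 0)


result = splat_via_shuffles([9, 1, 2, 3, 4, 5, 6, 7], 0x3C00)
```

In this model every lane of `result` ends up holding the inserted word, so the sequence is a splat of the scalar load built from an insert plus two shuffles.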


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D140069/new/

https://reviews.llvm.org/D140069


