[PATCH] D77827: [AMDGCN] Run LoadStoreVectorizer before CodeGenPrepare

Thu Apr 9 15:30:18 PDT 2020

rampitec added a comment.

In fact we do it at the DAG level: look at t14 and its zero_extend t15 being folded into the extending load t18:

  Initial selection DAG: %bb.0 'test:'
  SelectionDAG has 18 nodes:
    t0: ch = EntryToken
    t3: i64 = Constant<0>
      t2: i64,ch = CopyFromReg t0, Register:i64 %3
    t5: i64,ch = load<(dereferenceable invariant load 8, align 16, addrspace 4)> t0, t2, undef:i64
        t6: i64,ch = merge_values t5, t5:1
              t11: i64 = llvm.amdgcn.dispatch.ptr TargetConstant:i64<1116>
            t13: i64 = add nuw t11, Constant:i64<4>
          t14: i16,ch = load<(load 2 from %ir.4, align 4, !tbaa !10, addrspace 4)> t0, t13, undef:i64
        t15: i32 = zero_extend t14
          t8: i64 = llvm.amdgcn.kernarg.segment.ptr TargetConstant:i64<1636>
        t9: i64,ch = load<(dereferenceable invariant load 8 from %ir..kernarg.offset.cast, align 16, addrspace 4)> t0, t8, undef:i64
      t16: ch = store<(store 4 into %ir..load, !tbaa !19, addrspace 1)> t6:1, t15, t9, undef:i64
    t17: ch = ENDPGM t16

  Optimized lowered selection DAG: %bb.0 'test:'
  SelectionDAG has 12 nodes:
    t0: ch = EntryToken
            t11: i64 = llvm.amdgcn.dispatch.ptr TargetConstant:i64<1116>
          t13: i64 = add nuw t11, Constant:i64<4>
        t18: i32,ch = load<(load 2 from %ir.4, align 4, !tbaa !10, addrspace 4), zext from i16> t0, t13, undef:i64
          t8: i64 = llvm.amdgcn.kernarg.segment.ptr TargetConstant:i64<1636>
        t9: i64,ch = load<(dereferenceable invariant load 8 from %ir..kernarg.offset.cast, align 16, addrspace 4)> t0, t8, undef:i64
      t19: ch = store<(store 4 into %ir..load, !tbaa !19, addrspace 1)> t0, t18, t9, undef:i64
    t17: ch = ENDPGM t19
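
For readers not used to DAG dumps, the fold visible in the two dumps above (t14 plus t15 becoming t18) can be sketched as a toy model. This is only an illustration, not LLVM's DAGCombiner: the `Node` class and `fold_zext_of_load` are hypothetical names, and the sketch omits the legality and use-count checks a real combiner performs.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Node:
    """Toy DAG node (hypothetical, not an LLVM class)."""
    op: str                            # opcode, e.g. "load" or "zero_extend"
    ty: str                            # result type, e.g. "i16" or "i32"
    operands: Tuple["Node", ...] = ()
    mem_ty: Optional[str] = None       # type read from memory, loads only
    ext: Optional[str] = None          # "zext" marks an extending load


def fold_zext_of_load(node: Node) -> Node:
    """Fold zero_extend(load) into one zero-extending load.

    Returns the node unchanged when the pattern does not match.
    """
    if node.op != "zero_extend" or len(node.operands) != 1:
        return node
    src = node.operands[0]
    # Only a plain (non-extending) load is folded in this sketch.
    if src.op != "load" or src.ext is not None:
        return node
    # Same address operands and memory type, wider result type.
    return Node("load", node.ty, src.operands, mem_ty=src.mem_ty, ext="zext")


# Mirror the dump: t13 is the address, t14 the i16 load, t15 the zero_extend.
t13 = Node("add", "i64")
t14 = Node("load", "i16", (t13,), mem_ty="i16")
t15 = Node("zero_extend", "i32", (t14,))
t18 = fold_zext_of_load(t15)
print(t18.op, t18.ty, t18.mem_ty, t18.ext)
```

The result is still a 2-byte memory access, but it produces an i32 value directly, which corresponds to the "load 2 ... zext from i16" annotation on t18.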

The IR widening was added by your commit https://reviews.llvm.org/rG90083d3088ae, where you mention "the very weak load merging the DAG does".
However, when I turned off the option -amdgpu-codegenprepare-widen-constant-loads, the only test that failed was widen_extending_scalar_loads.ll, for an obvious reason.

Why don't I just turn this option off? It no longer seems to do anything useful.

https://reviews.llvm.org/D77827