[PATCH] D87174: [GlobalISel] Add `X,Y<dead> = G_UNMERGE Z` -> X = G_TRUNC Z

Fri Sep 4 18:30:32 PDT 2020

qcolombet added inline comments.

================
Comment at: llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-llvm.amdgcn.image.store.2d.d16.ll:164
+  ; PACKED:   [[CONCAT_VECTORS1:%[0-9]+]]:_(<6 x s16>) = G_CONCAT_VECTORS [[BITCAST1]](<2 x s16>), [[BITCAST2]](<2 x s16>), [[DEF]](<2 x s16>)
+  ; PACKED:   [[EXTRACT:%[0-9]+]]:_(<3 x s16>) = G_EXTRACT [[CONCAT_VECTORS1]](<6 x s16>), 0
   ; PACKED:   [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s32>) = G_BUILD_VECTOR [[COPY8]](s32), [[COPY9]](s32)
----------------
@arsenm At first glance, all the changes in AMDGPU seem fine except this one.

Looking at when the transformation kicks in, the input is:
```
  %16:_(<6 x s16>) = G_CONCAT_VECTORS %13:_(<2 x s16>), %14:_(<2 x s16>), %15:_(<2 x s16>)
  %3:_(<3 x s16>), %17:_(<3 x s16>) = G_UNMERGE_VALUES %16:_(<6 x s16>)
  G_INTRINSIC_W_SIDE_EFFECTS intrinsic(@llvm.amdgcn.image.store.2d), %3:_(<3 x s16>), 7, %1:_(s32), %2:_(s32), %0:_(<8 x s32>), 0, 0 :: (dereferenceable store 6 into custom "TargetCustom8", align 8)
  S_ENDPGM 0
```
And the output is:
```
  %16:_(<6 x s16>) = G_CONCAT_VECTORS %13:_(<2 x s16>), %14:_(<2 x s16>), %15:_(<2 x s16>)
  %19:_(s96) = G_BITCAST %16:_(<6 x s16>)
  %20:_(s48) = G_TRUNC %19:_(s96)
  %3:_(<3 x s16>) = G_BITCAST %20:_(s48)
  G_INTRINSIC_W_SIDE_EFFECTS intrinsic(@llvm.amdgcn.image.store.2d), %3:_(<3 x s16>), 7, %1:_(s32), %2:_(s32), %0:_(<8 x s32>), 0, 0 :: (dereferenceable store 6 into custom "TargetCustom8", align 8)
  S_ENDPGM 0
```

So far so good.
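
For reference, here is a rough bit-level model of why the rewrite above is sound. It is only a sketch, not the combiner code: the helper names (`pack_lanes`, `unmerge`, `trunc`) are made up for illustration, and it assumes lane 0 of a vector lands in the lowest bits when bitcast to a scalar.
```python
# Sketch only: models registers as Python ints.
# Assumption: lane 0 occupies the least significant bits after a bitcast.

def pack_lanes(lanes, lane_bits):
    """Model G_BITCAST of a vector to a scalar: lane i goes to bits [i*lane_bits, ...)."""
    result = 0
    for i, lane in enumerate(lanes):
        result |= (lane & ((1 << lane_bits) - 1)) << (i * lane_bits)
    return result

def unmerge(value, total_bits, num_pieces):
    """Model G_UNMERGE_VALUES: split into equal pieces, lowest bits first."""
    piece_bits = total_bits // num_pieces
    mask = (1 << piece_bits) - 1
    return [(value >> (i * piece_bits)) & mask for i in range(num_pieces)]

def trunc(value, bits):
    """Model G_TRUNC: keep only the low `bits` bits."""
    return value & ((1 << bits) - 1)

# A <6 x s16> value, viewed as an s96 scalar.
lanes = [0x1111, 0x2222, 0x3333, 0x4444, 0x5555, 0x6666]
z = pack_lanes(lanes, 16)

# X, Y<dead> = G_UNMERGE Z: only the first (low) half is used.
x, y_dead = unmerge(z, 96, 2)

# X = G_TRUNC Z yields the same bits, and they are the first three lanes.
assert x == trunc(z, 48)
assert unmerge(x, 48, 3) == lanes[:3]
```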

The craziness shows up after the legalizer:
```
  %16:_(<6 x s16>) = G_CONCAT_VECTORS %13:_(<2 x s16>), %14:_(<2 x s16>), %15:_(<2 x s16>)
  %19:_(s96) = G_BITCAST %16:_(<6 x s16>)
  %28:_(s32), %29:_(s32), %30:_(s32) = G_UNMERGE_VALUES %19:_(s96)
  %35:_(s32) = G_CONSTANT i32 16
  %36:_(s32) = G_LSHR %28:_, %35:_(s32)
  %37:_(s32) = G_LSHR %29:_, %35:_(s32)
  %46:_(s32) = G_CONSTANT i32 65535
  %49:_(s32) = COPY %28:_(s32)
  %40:_(s32) = G_AND %49:_, %46:_
  %48:_(s32) = COPY %36:_(s32)
  %41:_(s32) = G_AND %48:_, %46:_
  %42:_(s32) = G_SHL %41:_, %35:_(s32)
  %38:_(s32) = G_OR %40:_, %42:_
  %32:_(<2 x s16>) = G_BITCAST %38:_(s32)
  %47:_(s32) = COPY %29:_(s32)
  %43:_(s32) = G_AND %47:_, %46:_
  %44:_(s32) = G_CONSTANT i32 0
  %45:_(s32) = G_SHL %44:_, %35:_(s32)
  %39:_(s32) = G_OR %43:_, %45:_
  %33:_(<2 x s16>) = G_BITCAST %39:_(s32)
  %34:_(<6 x s16>) = G_CONCAT_VECTORS %32:_(<2 x s16>), %33:_(<2 x s16>), %15:_(<2 x s16>)
  %3:_(<3 x s16>) = G_EXTRACT %34:_(<6 x s16>), 0
  %21:_(<2 x s32>) = G_BUILD_VECTOR %1:_(s32), %2:_(s32)
  G_AMDGPU_INTRIN_IMAGE_STORE intrinsic(@llvm.amdgcn.image.store.2d), %3:_(<3 x s16>), 7, %21:_(<2 x s32>), $noreg, %0:_(<8 x s32>), 0, 0, 0 :: (dereferenceable store 6 into custom "TargetCustom8", align 8)
  S_ENDPGM 0
```
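
To spell out what that sequence computes, here is a rough model of the dump above, with the `%N` registers as variable names. It is a sketch under assumptions: registers are modeled as Python ints, lane 0 is taken to sit in the lowest bits, and `G_EXTRACT ..., 0` is taken to keep the low 48 bits. The function names are made up for illustration.
```python
# Sketch only: replays the legalized MIR on plain integers.
# Assumptions: lane 0 is in the lowest bits; G_EXTRACT at offset 0 keeps the low 48 bits.

MASK32 = 0xFFFFFFFF

def legalized(r28, r29, r15):
    """r28, r29: low two s32 pieces of the s96 value; r15: third <2 x s16> as 32 bits."""
    r35 = 16
    r46 = 65535
    r36 = r28 >> r35                 # G_LSHR
    r40 = r28 & r46                  # low half of the first word
    r41 = r36 & r46                  # high half of the first word
    r42 = (r41 << r35) & MASK32      # G_SHL
    r38 = r40 | r42                  # rebuilds the first word, bit for bit
    r43 = r29 & r46                  # low half of the second word
    r45 = (0 << r35) & MASK32        # shifted constant zero
    r39 = r43 | r45
    r34 = r38 | (r39 << 32) | (r15 << 64)   # G_CONCAT_VECTORS as a 96-bit value
    return r34 & ((1 << 48) - 1)            # G_EXTRACT ..., 0

def pre_legalization(r28, r29, r30):
    """G_BITCAST to s96, G_TRUNC to s48, as in the combiner output."""
    s96 = r28 | (r29 << 32) | (r30 << 64)
    return s96 & ((1 << 48) - 1)

w0, w1, w2 = 0x22221111, 0x44443333, 0x66665555
assert legalized(w0, w1, w2) == pre_legalization(w0, w1, w2)
```

Under those assumptions the two forms agree on the three lanes that are actually stored; the legalized version just takes many more instructions to get there.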

Do you think the AMDGPU target is missing something, or should I disable the combine for vector types, at least for now?

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D87174/new/

https://reviews.llvm.org/D87174