[PATCH] D87174: [GlobalISel] Add `X,Y<dead> = G_UNMERGE Z` -> X = G_TRUNC Z
Quentin Colombet via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Fri Sep 4 18:30:32 PDT 2020
qcolombet added inline comments.
================
Comment at: llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-llvm.amdgcn.image.store.2d.d16.ll:164
+ ; PACKED: [[CONCAT_VECTORS1:%[0-9]+]]:_(<6 x s16>) = G_CONCAT_VECTORS [[BITCAST1]](<2 x s16>), [[BITCAST2]](<2 x s16>), [[DEF]](<2 x s16>)
+ ; PACKED: [[EXTRACT:%[0-9]+]]:_(<3 x s16>) = G_EXTRACT [[CONCAT_VECTORS1]](<6 x s16>), 0
; PACKED: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s32>) = G_BUILD_VECTOR [[COPY8]](s32), [[COPY9]](s32)
----------------
@arsenm At first glance, all the changes in AMDGPU seem fine except this one.
Looking at when the transformation kicks in, the input is:
```
%16:_(<6 x s16>) = G_CONCAT_VECTORS %13:_(<2 x s16>), %14:_(<2 x s16>), %15:_(<2 x s16>)
%3:_(<3 x s16>), %17:_(<3 x s16>) = G_UNMERGE_VALUES %16:_(<6 x s16>)
G_INTRINSIC_W_SIDE_EFFECTS intrinsic(@llvm.amdgcn.image.store.2d), %3:_(<3 x s16>), 7, %1:_(s32), %2:_(s32), %0:_(<8 x s32>), 0, 0 :: (dereferenceable store 6 into custom "TargetCustom8", align 8)
S_ENDPGM 0
```
And the output is:
```
%16:_(<6 x s16>) = G_CONCAT_VECTORS %13:_(<2 x s16>), %14:_(<2 x s16>), %15:_(<2 x s16>)
%19:_(s96) = G_BITCAST %16:_(<6 x s16>)
%20:_(s48) = G_TRUNC %19:_(s96)
%3:_(<3 x s16>) = G_BITCAST %20:_(s48)
G_INTRINSIC_W_SIDE_EFFECTS intrinsic(@llvm.amdgcn.image.store.2d), %3:_(<3 x s16>), 7, %1:_(s32), %2:_(s32), %0:_(<8 x s32>), 0, 0 :: (dereferenceable store 6 into custom "TargetCustom8", align 8)
S_ENDPGM 0
```
So far so good.
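As a sanity check on the rewrite, here is a minimal Python sketch of why keeping only the first result of the unmerge should match the bitcast/trunc/bitcast sequence. It assumes lane 0 sits in the least significant bits when a vector is bitcast to a scalar; the helper names are made up for illustration and are not LLVM APIs.

```python
# Model a <6 x s16> vector as a list of six 16-bit lanes.
# Assumption: lane 0 occupies the least significant bits of the scalar.

def bitcast_to_scalar(lanes, lane_bits=16):
    """Pack lanes into one integer, lane 0 lowest (hypothetical helper)."""
    mask = (1 << lane_bits) - 1
    value = 0
    for i, lane in enumerate(lanes):
        value |= (lane & mask) << (i * lane_bits)
    return value

def bitcast_to_vector(value, num_lanes, lane_bits=16):
    """Split an integer back into lanes, lane 0 lowest (hypothetical helper)."""
    mask = (1 << lane_bits) - 1
    return [(value >> (i * lane_bits)) & mask for i in range(num_lanes)]

def trunc(value, bits):
    """Keep only the low `bits` bits, like G_TRUNC."""
    return value & ((1 << bits) - 1)

def unmerge(lanes, num_parts):
    """Split a vector into equal sub-vectors, lowest lanes first."""
    size = len(lanes) // num_parts
    return [lanes[i * size:(i + 1) * size] for i in range(num_parts)]

def combined_low_part(lanes):
    s96 = bitcast_to_scalar(lanes)      # %19 = G_BITCAST %16
    s48 = trunc(s96, 48)                # %20 = G_TRUNC %19
    return bitcast_to_vector(s48, 3)    # %3  = G_BITCAST %20

v = [0x1111, 0x2222, 0x3333, 0x4444, 0x5555, 0x6666]
low, dead_high = unmerge(v, 2)          # %3, %17 = G_UNMERGE_VALUES %16
assert combined_low_part(v) == low
```

Under that lane-ordering assumption, both forms yield the three low lanes, which is the equivalence the combine depends on.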
Then, after the legalizer, is where the craziness shows up:
```
%16:_(<6 x s16>) = G_CONCAT_VECTORS %13:_(<2 x s16>), %14:_(<2 x s16>), %15:_(<2 x s16>)
%19:_(s96) = G_BITCAST %16:_(<6 x s16>)
%28:_(s32), %29:_(s32), %30:_(s32) = G_UNMERGE_VALUES %19:_(s96)
%35:_(s32) = G_CONSTANT i32 16
%36:_(s32) = G_LSHR %28:_, %35:_(s32)
%37:_(s32) = G_LSHR %29:_, %35:_(s32)
%46:_(s32) = G_CONSTANT i32 65535
%49:_(s32) = COPY %28:_(s32)
%40:_(s32) = G_AND %49:_, %46:_
%48:_(s32) = COPY %36:_(s32)
%41:_(s32) = G_AND %48:_, %46:_
%42:_(s32) = G_SHL %41:_, %35:_(s32)
%38:_(s32) = G_OR %40:_, %42:_
%32:_(<2 x s16>) = G_BITCAST %38:_(s32)
%47:_(s32) = COPY %29:_(s32)
%43:_(s32) = G_AND %47:_, %46:_
%44:_(s32) = G_CONSTANT i32 0
%45:_(s32) = G_SHL %44:_, %35:_(s32)
%39:_(s32) = G_OR %43:_, %45:_
%33:_(<2 x s16>) = G_BITCAST %39:_(s32)
%34:_(<6 x s16>) = G_CONCAT_VECTORS %32:_(<2 x s16>), %33:_(<2 x s16>), %15:_(<2 x s16>)
%3:_(<3 x s16>) = G_EXTRACT %34:_(<6 x s16>), 0
%21:_(<2 x s32>) = G_BUILD_VECTOR %1:_(s32), %2:_(s32)
G_AMDGPU_INTRIN_IMAGE_STORE intrinsic(@llvm.amdgcn.image.store.2d), %3:_(<3 x s16>), 7, %21:_(<2 x s32>), $noreg, %0:_(<8 x s32>), 0, 0, 0 :: (dereferenceable store 6 into custom "TargetCustom8", align 8)
S_ENDPGM 0
```
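To make the shift/mask chain easier to read, here is a rough Python model of how `%38` and `%39` are computed from the two dwords `%28` and `%29`. It is only a sketch of the arithmetic in the dump above, with the function name invented for illustration.

```python
MASK16 = 0xFFFF
MASK32 = 0xFFFFFFFF

def legalized_dwords(d0, d1):
    """Mirror %38 and %39 from the legalized MIR (hypothetical helper).

    d0 and d1 stand for the 32-bit values %28 and %29.
    """
    shift = 16                               # %35 = G_CONSTANT i32 16
    hi0 = (d0 >> shift) & MASK32             # %36 = G_LSHR %28, %35
    lane0 = d0 & MASK16                      # %40 = G_AND %49, %46
    lane1 = hi0 & MASK16                     # %41 = G_AND %48, %46
    r38 = lane0 | ((lane1 << shift) & MASK32)  # %42 = G_SHL, %38 = G_OR
    lane2 = d1 & MASK16                      # %43 = G_AND %47, %46
    zero_hi = (0 << shift) & MASK32          # %45 = G_SHL %44, %35
    r39 = lane2 | zero_hi                    # %39 = G_OR %43, %45
    return r38, r39

r38, r39 = legalized_dwords(0xAABBCCDD, 0x11223344)
assert r38 == 0xAABBCCDD   # first dword is rebuilt unchanged
assert r39 == 0x00003344   # second dword keeps only its low 16 bits
```

If this model is right, the whole sequence amounts to passing the first dword through and zeroing the upper half of the second, and `%37` has no users in the dump.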
Do you think the AMDGPU target is missing something, or should I disable the combine for vector types, at least for now?
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D87174/new/
https://reviews.llvm.org/D87174