[PATCH] D119654: [SDAG] enable binop identity constant folds for add/sub
Simon Pilgrim via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Tue Feb 22 08:29:01 PST 2022
RKSimon added inline comments.
================
Comment at: llvm/test/CodeGen/X86/avx512-intrinsics-upgrade.ll:4241
+; X86-NEXT: vpaddq %zmm0, %zmm1, %zmm1 {%k1} ## encoding: [0x62,0xf1,0xf5,0x49,0xd4,0xc8]
+; X86-NEXT: vmovdqa64 %zmm1, %zmm0 ## encoding: [0x62,0xf1,0xfd,0x48,0x6f,0xc1]
; X86-NEXT: retl ## encoding: [0xc3]
----------------
RKSimon wrote:
> LuoYuanke wrote:
> > LuoYuanke wrote:
> > > This vmovdqa64 is emitted because the function needs to return its value in zmm0. Not sure if it is a regression.
> > It seems folding the select into its previous operands (psrl) is better, because the add operands are commutative, so there is more chance of meeting the register allocator's hint (the return register).
> These adds were just used for simplicity to make the result dependent on all 3 intrinsics.
>
> We'd avoid all of the intrinsics-upgrade changes if we just changed these add ops to something else, preferably something that we're not going to add to foldSelectWithIdentityConstant in the future.
>
> Alternatively, we could split these tests into the 3 normal / {k} / {k}{z} variants.
@LuoYuanke Something that might work is to return a { <8 x i64>, <8 x i64>, <8 x i64> } structure: https://gcc.godbolt.org/z/39ahrqM7E
```
define { <8 x i64>, <8 x i64>, <8 x i64> } @test_int_x86_avx512_mask_psrl_qi_512(<8 x i64> %x0, i32 %x1, <8 x i64> %x2, i8 %x3) {
  %res = call <8 x i64> @llvm.x86.avx512.mask.psrl.qi.512(<8 x i64> %x0, i32 4, <8 x i64> %x2, i8 %x3)
  %res1 = call <8 x i64> @llvm.x86.avx512.mask.psrl.qi.512(<8 x i64> %x0, i32 5, <8 x i64> %x2, i8 -1)
  %res2 = call <8 x i64> @llvm.x86.avx512.mask.psrl.qi.512(<8 x i64> %x0, i32 6, <8 x i64> zeroinitializer, i8 %x3)
  %r0 = insertvalue { <8 x i64>, <8 x i64>, <8 x i64> } poison, <8 x i64> %res, 0
  %r1 = insertvalue { <8 x i64>, <8 x i64>, <8 x i64> } %r0, <8 x i64> %res1, 1
  %r2 = insertvalue { <8 x i64>, <8 x i64>, <8 x i64> } %r1, <8 x i64> %res2, 2
  ret { <8 x i64>, <8 x i64>, <8 x i64> } %r2
}
declare <8 x i64> @llvm.x86.avx512.mask.psrl.qi.512(<8 x i64>, i32, <8 x i64>, i8)

test_int_x86_avx512_mask_psrl_qi_512: # @test_int_x86_avx512_mask_psrl_qi_512
        vmovdqa64 %zmm1, %zmm3 # encoding: [0x62,0xf1,0xfd,0x48,0x6f,0xd9]
        kmovw %esi, %k1 # encoding: [0xc5,0xf8,0x92,0xce]
        vpsrlq $4, %zmm0, %zmm3 {%k1} # encoding: [0x62,0xf1,0xe5,0x49,0x73,0xd0,0x04]
        vpsrlq $5, %zmm0, %zmm1 # encoding: [0x62,0xf1,0xf5,0x48,0x73,0xd0,0x05]
        vpsrlq $6, %zmm0, %zmm2 {%k1} {z} # encoding: [0x62,0xf1,0xed,0xc9,0x73,0xd0,0x06]
        vmovdqa64 %zmm3, %zmm0 # encoding: [0x62,0xf1,0xfd,0x48,0x6f,0xc3]
        retq # encoding: [0xc3]
```
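For comparison, the other option mentioned in the quoted comment (splitting each test into normal / {k} / {k}{z} variants) might look roughly like the sketch below. The three function names are placeholders chosen for illustration and are not taken from the patch or the existing test file; only the intrinsic and its signature come from the example above.
```
; Hypothetical split of the combined test; function names are illustrative only.

; Unmasked variant: all-ones mask (i8 -1), passthru %x2 is unused by the result.
define <8 x i64> @test_psrl_qi_512_normal(<8 x i64> %x0, <8 x i64> %x2) {
  %res = call <8 x i64> @llvm.x86.avx512.mask.psrl.qi.512(<8 x i64> %x0, i32 4, <8 x i64> %x2, i8 -1)
  ret <8 x i64> %res
}

; Merge-masked {k} variant: masked-off lanes take their value from %x2.
define <8 x i64> @test_psrl_qi_512_mask(<8 x i64> %x0, <8 x i64> %x2, i8 %x3) {
  %res = call <8 x i64> @llvm.x86.avx512.mask.psrl.qi.512(<8 x i64> %x0, i32 4, <8 x i64> %x2, i8 %x3)
  ret <8 x i64> %res
}

; Zero-masked {k}{z} variant: masked-off lanes are zeroed.
define <8 x i64> @test_psrl_qi_512_maskz(<8 x i64> %x0, i8 %x3) {
  %res = call <8 x i64> @llvm.x86.avx512.mask.psrl.qi.512(<8 x i64> %x0, i32 4, <8 x i64> zeroinitializer, i8 %x3)
  ret <8 x i64> %res
}

declare <8 x i64> @llvm.x86.avx512.mask.psrl.qi.512(<8 x i64>, i32, <8 x i64>, i8)
```
Each function returns a single intrinsic result, so no add is needed to combine them. The cost is three test functions (and three sets of CHECK lines) per intrinsic instead of one.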
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D119654/new/
https://reviews.llvm.org/D119654