[PATCH] D132329: [X86][RFC] Using `__bf16` for AVX512_BF16 intrinsics
Phoebe Wang via Phabricator via cfe-commits
cfe-commits at lists.llvm.org
Wed Sep 21 01:48:55 PDT 2022
pengfei added inline comments.
================
Comment at: llvm/test/CodeGen/X86/avx512bf16-intrinsics-upgrade.ll:30
; X64-NEXT: kmovd %edi, %k1 # encoding: [0xc5,0xfb,0x92,0xcf]
-; X64-NEXT: vcvtne2ps2bf16 %zmm1, %zmm0, %zmm0 {%k1} {z} # encoding: [0x62,0xf2,0x7f,0xc9,0x72,0xc1]
+; X64-NEXT: vmovdqu16 %zmm0, %zmm0 {%k1} {z} # encoding: [0x62,0xf1,0xff,0xc9,0x6f,0xc0]
; X64-NEXT: retq # encoding: [0xc3]
----------------
RKSimon wrote:
> any chance we can recover the predicated instruction?
It's possible, e.g., by iterating over all users of the intrinsic and bitcasting the select operands as well, by adding patterns for i16, or by making vselect peek through the bitcast.
But I think avoiding this small performance regression is less critical than keeping backward compatibility for the old intrinsics. It may not be worth the code complexity.
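To illustrate the first option, here is a rough IR sketch. It is hypothetical and not taken from the patch: the function names are made up, and it assumes the upgraded intrinsic returns `<32 x bfloat>` while the old callers still select on `<32 x i16>`.

```llvm
; Assumed post-patch declaration (returns bfloat lanes instead of i16).
declare <32 x bfloat> @llvm.x86.avx512bf16.cvtne2ps2bf16.512(<16 x float>, <16 x float>)

; Roughly what an auto-upgraded old-style caller looks like: the bitcast
; sits between the intrinsic and the select, so the mask is applied by a
; separate masked move rather than folded into the conversion.
define <32 x i16> @maskz_after_upgrade(<16 x float> %a, <16 x float> %b, i32 %u) {
  %r = call <32 x bfloat> @llvm.x86.avx512bf16.cvtne2ps2bf16.512(<16 x float> %a, <16 x float> %b)
  %c = bitcast <32 x bfloat> %r to <32 x i16>
  %m = bitcast i32 %u to <32 x i1>
  %s = select <32 x i1> %m, <32 x i16> %c, <32 x i16> zeroinitializer
  ret <32 x i16> %s
}

; If the upgrade also rewrote the select users, the select would operate
; on bfloat directly and the bitcast would move after it, which is the
; shape a masked-instruction pattern on bfloat vectors could match.
define <32 x i16> @maskz_select_on_bf16(<16 x float> %a, <16 x float> %b, i32 %u) {
  %r = call <32 x bfloat> @llvm.x86.avx512bf16.cvtne2ps2bf16.512(<16 x float> %a, <16 x float> %b)
  %m = bitcast i32 %u to <32 x i1>
  %s = select <32 x i1> %m, <32 x bfloat> %r, <32 x bfloat> zeroinitializer
  %c = bitcast <32 x bfloat> %s to <32 x i16>
  ret <32 x i16> %c
}
```

The two functions compute the same value, since an all-zero bfloat lane has the same bits as an i16 zero. A merge-masked variant would also need its pass-through operand bitcast to bfloat, which is part of the extra complexity.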
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D132329/new/
https://reviews.llvm.org/D132329