[PATCH] D14761: [X86][SSE] Detect AVG pattern during instruction combine for SSE2/AVX2/AVX512BW.

Simon Pilgrim via llvm-commits llvm-commits at lists.llvm.org
Sat Nov 21 07:21:07 PST 2015


RKSimon added a comment.

Out of curiosity - how well does this work if InstCombiner::visitCallInst is used to convert _mm_avg_epu16 (etc.) calls to generic IR? That should constant fold where possible - but would the lowering still work if only one input is constant?
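For reference, here is a minimal scalar sketch of the per-lane operation that _mm_avg_epu16 (PAVGW) performs: a rounded unsigned average, evaluated in a wider type so the intermediate sum cannot overflow. The function names avg_u16 and avg_lanes_u16 are hypothetical and are not part of the patch. This only illustrates the arithmetic that a generic-IR expansion would need to express.

```c
#include <stddef.h>
#include <stdint.h>

/* Rounded unsigned average of one 16-bit lane: (a + b + 1) >> 1,
   evaluated in 32 bits so the intermediate sum cannot overflow. */
static uint16_t avg_u16(uint16_t a, uint16_t b) {
    return (uint16_t)(((uint32_t)a + (uint32_t)b + 1u) >> 1);
}

/* Hypothetical helper: apply the lane operation across n lanes,
   mirroring what the vector instruction does for all lanes at once. */
static void avg_lanes_u16(const uint16_t *a, const uint16_t *b,
                          uint16_t *out, size_t n) {
    for (size_t i = 0; i < n; ++i)
        out[i] = avg_u16(a[i], b[i]);
}
```

With both inputs constant, the whole expression can be evaluated at compile time. With only one constant input, the "+ 1" may get merged into that constant, which is the case the question above is asking about.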


================
Comment at: test/CodeGen/X86/avg.ll:4
@@ +3,3 @@
+; RUN: llc < %s -mtriple=x86_64-unknown-unknown -mcpu=x86-64 -mattr=+avx512bw | FileCheck %s --check-prefix=AVX512BW
+
+define void @avg_v4i8(<4 x i8> %a, <4 x i8> %b) {
----------------
AVX2 and AVX512BW can share an additional common AVX check prefix to reduce test duplication:

FileCheck %s --check-prefix=AVX --check-prefix=AVX2
FileCheck %s --check-prefix=AVX --check-prefix=AVX512BW


================
Comment at: test/CodeGen/X86/avg.ll:5
@@ +4,3 @@
+
+define void @avg_v4i8(<4 x i8> %a, <4 x i8> %b) {
+; SSE2-LABEL: avg_v4i8
----------------
What does the generated code look like if we load the args from memory instead of passing them in registers? Non-legal types in these cases often make the test cases less clear - in this case with all the pand/packuswb instructions.


http://reviews.llvm.org/D14761




