[PATCH] D14761: [X86][SSE] Detect AVG pattern during instruction combine for SSE2/AVX2/AVX512BW.

Sat Nov 21 07:21:07 PST 2015

RKSimon added a comment.

Out of curiosity - how well does this work if InstCombiner::visitCallInst is used to convert _mm_avg_epu16 (etc.) calls to general IR? It should constant fold where possible - but would the lowering still work if only one input is constant?
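
For reference, here is a minimal scalar sketch in C of the rounding-average arithmetic that PAVGB performs per lane: widen, add, add 1, shift right by 1, truncate. It illustrates the one-constant-input question only and is not taken from the patch. All function names are hypothetical, and the constant 200 is an arbitrary example value. Once one operand is constant, the "+ 1" can fold into it in the wide type, so the IR may no longer have the obvious a + b + 1 shape. Whether the matcher still recognises that folded shape is the open question; the sketch only shows the arithmetic is unchanged.

```c
#include <assert.h>
#include <stdint.h>

/* Scalar model of one unsigned 8-bit averaging lane:
 * widen both inputs, add them, add 1 for rounding, shift right by 1,
 * then truncate back to 8 bits. Hypothetical helper name. */
static uint8_t avg_u8(uint8_t a, uint8_t b) {
    uint32_t wide = (uint32_t)a + (uint32_t)b + 1u;
    return (uint8_t)(wide >> 1);
}

/* The same computation applied element by element to 16 lanes,
 * standing in for a <16 x i8> vector. Hypothetical helper name. */
static void avg_v16u8(const uint8_t *a, const uint8_t *b, uint8_t *out) {
    for (int i = 0; i < 16; ++i)
        out[i] = avg_u8(a[i], b[i]);
}

/* One input constant (200 is an arbitrary example): the rounding "+ 1"
 * has been folded into the constant, giving a + 201 in the wide type.
 * Assumption: this mimics what constant folding could produce. */
static uint8_t avg_with_const_200(uint8_t a) {
    uint32_t wide = (uint32_t)a + 201u;
    return (uint8_t)(wide >> 1);
}

/* Exhaustively check that the folded form matches the general form. */
static void check_const_case(void) {
    for (unsigned a = 0; a < 256; ++a)
        assert(avg_with_const_200((uint8_t)a) == avg_u8((uint8_t)a, 200));
}
```

Calling check_const_case() walks all 256 values of the non-constant input, so any test added for this case could compare against the same scalar model.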

================
Comment at: test/CodeGen/X86/avg.ll:4
@@ +3,3 @@
+; RUN: llc < %s -mtriple=x86_64-unknown-unknown -mcpu=x86-64 -mattr=+avx512bw | FileCheck %s --check-prefix=AVX512BW
+
+define void @avg_v4i8(<4 x i8> %a, <4 x i8> %b) {
----------------
AVX2 and AVX512BW can share an additional common AVX prefix to reduce test duplication:

FileCheck %s --check-prefix=AVX --check-prefix=AVX2
FileCheck %s --check-prefix=AVX --check-prefix=AVX512BW

================
Comment at: test/CodeGen/X86/avg.ll:5
@@ +4,3 @@
+
+define void @avg_v4i8(<4 x i8> %a, <4 x i8> %b) {
+; SSE2-LABEL: avg_v4i8
----------------
What does the code look like if we load the args from memory instead of passing them in registers? Non-legal types often make test cases like these less clear - here, because of all the pand/packuswb instructions.

http://reviews.llvm.org/D14761