[PATCH] D14761: [X86][SSE] Detect AVG pattern during instruction combine for SSE2/AVX2/AVX512BW.

Cong Hou via llvm-commits llvm-commits at lists.llvm.org
Mon Nov 23 11:28:35 PST 2015


congh added inline comments.

================
Comment at: lib/Target/X86/X86ISelLowering.cpp:25286
@@ +25285,3 @@
+         VT.getVectorElementType() == MVT::i16) &&
+        InVT.getVectorElementType() == MVT::i32 && isPowerOf2_32(NumElems)))
+    return SDValue();
----------------
RKSimon wrote:
> Do the extended vector element types have to be i32? I understood it as it could be anything that was greater in width than the source.
> 
> AMD APM v4 description for PAVGB:
> 
> > An average is computed by adding pairs of operands, adding 1 to a 9-bit temporary sum, and rightshifting the temporary sum by one bit position.
You are right. I have updated this check to accept any extended element type wider than i8/i16, in case we later do type demotion on intermediate types.
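
To illustrate the point about the intermediate width, here is a minimal scalar sketch of the per-lane rounded average described in the quoted APM text. It is not code from the patch: the name avgRound and its template parameters are hypothetical, and it only assumes that the temporary type is strictly wider than the source element type.

```cpp
#include <cstdint>

// Hypothetical reference model of one PAVGB/PAVGW lane (not from the patch).
// Narrow is the source element type (uint8_t or uint16_t); Wide is any
// unsigned type strictly wider than Narrow, so a + b + 1 cannot wrap.
template <typename Narrow, typename Wide>
inline Narrow avgRound(Narrow a, Narrow b) {
  static_assert(sizeof(Wide) > sizeof(Narrow),
                "temporary must be wider than the source element");
  // Add the operands and the rounding bias in the wide type.
  Wide sum = static_cast<Wide>(static_cast<Wide>(a) + static_cast<Wide>(b) + 1);
  // Shift right by one bit position and narrow back to the source type.
  return static_cast<Narrow>(sum >> 1);
}
```

Under these assumptions the result is the same for every Wide type that satisfies the static_assert, for example avgRound<uint8_t, uint16_t> and avgRound<uint8_t, uint32_t>, which is why the check does not need to insist on i32.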

================
Comment at: lib/Target/X86/X86ISelLowering.cpp:25309
@@ +25308,3 @@
+  //
+  // In AVX512, the last instruction can also be a trunc store.
+
----------------
RKSimon wrote:
> If this is supposed to be FIXME comment please mark it as such.
This is already fixed in this patch.

================
Comment at: lib/Target/X86/X86ISelLowering.cpp:25347
@@ +25346,3 @@
+  // element is in the range [1, 256].
+  if (IsConstVectorInRange(Operands[1], 1, 256) &&
+      Operands[0].getOpcode() == ISD::ZERO_EXTEND &&
----------------
RKSimon wrote:
> Shouldn't the upper limit be 65536 for pavgw?
Good catch! Corrected. Thanks!
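
A rough sketch of how the upper bound could depend on the source element width follows. It assumes the constant operand is the other source value with the rounding +1 already folded in, which would give [1, 256] for i8 and [1, 65536] for i16. The helpers foldedConstantUpperBound and allInRange are hypothetical stand-ins, not the patch's IsConstVectorInRange.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical helper (not from the patch): largest value a folded
// "constant + 1" element can take for a source element of srcBits bits.
// An unsigned srcBits-bit value is at most 2^srcBits - 1, so adding the
// rounding 1 gives 2^srcBits. Assumes srcBits < 64.
inline uint64_t foldedConstantUpperBound(unsigned srcBits) {
  return uint64_t(1) << srcBits;
}

// Hypothetical stand-in for a constant-vector range check: true only if
// every element lies in the closed interval [lo, hi].
inline bool allInRange(const std::vector<uint64_t> &elems, uint64_t lo,
                       uint64_t hi) {
  for (uint64_t e : elems)
    if (e < lo || e > hi)
      return false;
  return true;
}
```

With these helpers, a pavgb-style check would call allInRange(elems, 1, foldedConstantUpperBound(8)) and a pavgw-style check would pass 16 instead of 8.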


http://reviews.llvm.org/D14761




