[PATCH] D14761: [X86][SSE] Detect AVG pattern during instruction combine for SSE2/AVX2/AVX512BW.
Cong Hou via llvm-commits
llvm-commits at lists.llvm.org
Mon Nov 23 11:28:35 PST 2015
congh added inline comments.
================
Comment at: lib/Target/X86/X86ISelLowering.cpp:25286
@@ +25285,3 @@
+ VT.getVectorElementType() == MVT::i16) &&
+ InVT.getVectorElementType() == MVT::i32 && isPowerOf2_32(NumElems)))
+ return SDValue();
----------------
RKSimon wrote:
> Do the extended vector element types have to be i32? My understanding is that they could be any type wider than the source.
>
> AMD APM v4 description for PAVGB:
>
> > An average is computed by adding pairs of operands, adding 1 to a 9-bit temporary sum, and rightshifting the temporary sum by one bit position.
>
You are right. I have updated this check so that the extended element type only needs to be wider than i8/i16, in case we do type demotion on intermediate types in the future.
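
For reference, here is a minimal scalar sketch of the averaging described in the AMD quote above. It is not code from the patch, and the function names are made up for illustration. It only shows that any intermediate type with at least 9 bits gives the same result for 8-bit lanes, which is why the extended type need not be i32.

```cpp
#include <cstdint>

// Hypothetical scalar reference for one unsigned 8-bit average lane:
// (a + b + 1) >> 1, with the sum formed in a 32-bit temporary so that
// a + b + 1 (at most 511) cannot wrap.
constexpr std::uint8_t avg_u8(std::uint8_t a, std::uint8_t b) {
  return static_cast<std::uint8_t>(
      (static_cast<std::uint32_t>(a) + b + 1) >> 1);
}

// Same computation, but the temporary sum is narrowed to 16 bits before
// the shift. 16 bits is still wider than the 9 bits the sum requires.
constexpr std::uint8_t avg_u8_via_u16(std::uint8_t a, std::uint8_t b) {
  return static_cast<std::uint8_t>(
      static_cast<std::uint16_t>(static_cast<std::uint16_t>(a) + b + 1) >> 1);
}

// Compile-time checks: both intermediate widths agree, including at the
// largest possible sum.
static_assert(avg_u8(255, 255) == 255, "max inputs must not wrap");
static_assert(avg_u8(0, 1) == 1, "average rounds up");
static_assert(avg_u8(1, 2) == avg_u8_via_u16(1, 2), "widths agree");
static_assert(avg_u8(255, 255) == avg_u8_via_u16(255, 255), "widths agree");
```

The choice of i16 versus i32 for the temporary does not change the result, as long as the temporary can hold the 9-bit sum.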
================
Comment at: lib/Target/X86/X86ISelLowering.cpp:25309
@@ +25308,3 @@
+ //
+ // In AVX512, the last instruction can also be a trunc store.
+
----------------
RKSimon wrote:
> If this is supposed to be FIXME comment please mark it as such.
This is already fixed in this patch.
================
Comment at: lib/Target/X86/X86ISelLowering.cpp:25347
@@ +25346,3 @@
+ // element is in the range [1, 256].
+ if (IsConstVectorInRange(Operands[1], 1, 256) &&
+ Operands[0].getOpcode() == ISD::ZERO_EXTEND &&
----------------
RKSimon wrote:
> Shouldn't the upper limit be 65536 for pavgw?
Good catch! Corrected. Thanks!
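
To illustrate why the upper limit depends on the element width, here is a small scalar sketch. It assumes the constant operand stands for "other operand plus the rounding 1", so that a constant c in [1, 2^N] corresponds to averaging with c - 1, which must fit in N bits. That is my reading of the quoted snippet, not something the thread states, and the helper names are hypothetical.

```cpp
#include <cstdint>

// Hypothetical scalar reference for one unsigned 16-bit average lane.
constexpr std::uint16_t avg_u16(std::uint16_t a, std::uint16_t b) {
  return static_cast<std::uint16_t>(
      (static_cast<std::uint32_t>(a) + b + 1) >> 1);
}

// Hypothetical range check: a constant c can be folded into an N-bit
// average only if c - 1 fits in N bits, i.e. c is in [1, 2^N].
// Assumes bits < 32.
constexpr bool const_in_avg_range(std::uint32_t c, unsigned bits) {
  return c >= 1 && c <= (std::uint32_t{1} << bits);
}

// The widened form of the pattern: (zext(a) + c) >> 1 in a 32-bit lane.
constexpr std::uint32_t wide_add_shift(std::uint16_t a, std::uint32_t c) {
  return (static_cast<std::uint32_t>(a) + c) >> 1;
}

// 256 is the limit for 8-bit lanes, 65536 for 16-bit lanes.
static_assert(const_in_avg_range(256, 8), "256 is valid for 8-bit lanes");
static_assert(!const_in_avg_range(257, 8), "257 is too large for 8-bit lanes");
static_assert(const_in_avg_range(65536, 16), "65536 is valid for 16-bit lanes");

// With c in range, the widened pattern equals the average with c - 1.
static_assert(wide_add_shift(10, 300) == avg_u16(10, 299), "fold matches");
static_assert(wide_add_shift(65535, 65536) == avg_u16(65535, 65535),
              "fold matches at the upper limit");
```

Under that assumption, a limit of 256 would reject valid 16-bit constants such as 300, which matches the reviewer's point about pavgw.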
http://reviews.llvm.org/D14761