[PATCH] D44269: [X86] Remove sse41 specific code from lowering v16i8 multiply

Craig Topper via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Mar 8 17:47:09 PST 2018


craig.topper added inline comments.


================
Comment at: test/CodeGen/X86/vector-mul.ll:968
+; X86-NEXT:    movdqa {{.*#+}} xmm2 = [0,1,3,7,15,31,63,127,0,1,3,7,15,31,63,127]
+; X86-NEXT:    punpckhbw {{.*#+}} xmm2 = xmm2[8,8,9,9,10,10,11,11,12,12,13,13,14,14,15,15]
+; X86-NEXT:    pmovzxbw {{.*#+}} xmm1 = xmm0[0],zero,xmm0[1],zero,xmm0[2],zero,xmm0[3],zero,xmm0[4],zero,xmm0[5],zero,xmm0[6],zero,xmm0[7],zero
----------------
RKSimon wrote:
> Why wasn't this constant folded?
At one point that constant pool entry was used by two vector shuffles, and I guess we refused to fold it because of the multiple uses. Later, one shuffle became UNPCKH and the other became zero_extend_vector_in_reg. The DAG combine for zero_extend_vector_in_reg was perfectly happy to overlook the multiple uses and constant fold it; the result is the LCPI on the pmullw. That dropped the use count on the original constant pool entry, but by then it was too late to trigger the fold.

Should we stop the zero_extend_vector_in_reg combine from constant folding when the constant has multiple uses?
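
For reference, here is a rough sketch of what the two folded constants would look like for the v16i8 constant in the test above. This is a toy model on plain Python lists, not LLVM code; the helper names (unpckh_bytes, zext_low_bytes_to_words, as_words) are made up for illustration, and it only covers the two operations visible in the quoted CHECK lines.

```python
# Toy model, not LLVM code: vectors are plain Python lists of lane values.
# The v16i8 constant loaded by the movdqa in the quoted test.
MASK = [0, 1, 3, 7, 15, 31, 63, 127] * 2


def unpckh_bytes(a, b):
    """Model PUNPCKHBW: interleave the high 8 bytes of a and b."""
    out = []
    for x, y in zip(a[8:], b[8:]):
        out += [x, y]
    return out


def zext_low_bytes_to_words(a):
    """Model PMOVZXBW (zero_extend_vector_in_reg): widen the low 8 bytes
    to 16-bit lanes with zero fill."""
    return [x & 0xFF for x in a[:8]]


def as_words(byte_lanes):
    """Reinterpret 16 little-endian byte lanes as 8 16-bit lanes."""
    return [byte_lanes[i] | (byte_lanes[i + 1] << 8) for i in range(0, 16, 2)]


# What the zero-extend fold produces (the constant that was folded).
lo_words = zext_low_bytes_to_words(MASK)

# What folding the remaining punpckhbw would have produced (the missed fold).
hi_bytes = unpckh_bytes(MASK, MASK)
hi_words = as_words(hi_bytes)

print("zext low half :", lo_words)
print("unpckh bytes  :", hi_bytes)
print("unpckh as i16 :", [hex(w) for w in hi_words])
```

Both results depend only on the constant, so in principle neither needs to be computed at run time; the open question is only which combine is allowed to fold when the constant has more than one use.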


https://reviews.llvm.org/D44269




