[PATCH] D26790: [X86] Add a hasOneUse check to selectScalarSSELoad to keep the same load from being folded multiple times

Thu Nov 17 04:00:10 PST 2016

andreadb added inline comments.

================
Comment at: test/CodeGen/X86/vec_ss_load_fold.ll:384
 ; X32-NEXT:    movl {{[0-9]+}}(%esp), %eax
-; X32-NEXT:    movaps %xmm0, %xmm1
-; X32-NEXT:    minss (%eax), %xmm1
-; X32-NEXT:    maxss (%eax), %xmm0
-; X32-NEXT:    addps %xmm1, %xmm0
+; X32-NEXT:    movss {{.*#+}} xmm1 = mem[0],zero,zero,zero
+; X32-NEXT:    movaps %xmm0, %xmm2
----------------
zvi wrote:
> As the change in generated code shows, the tradeoff is increased register pressure. Do we have reason to believe that this change puts us on the preferred side? 
I agree that this change has the potential of increasing register pressure.
However, regalloc is smart enough to avoid inserting a spill slot if the spill candidate can be folded as a load into the user instructions. In this case, the load value would be feeding into a minss and a maxss. Those instructions both appear in the memory folding tables (see X86InstrInfo.cpp); the InlineSpiller would take advantage of this knowledge and fold the load instead of reserving a stack slot (and therefore inserting a reload).
Basically, my opinion is that this may not be a very problematic case because regalloc should be smart enough to "fall-back" to the other codegen if we run out of registers.

Note also that this is not the only place where we check for 'hasOneUse()' in this file (see for example `X86DAGToDAGISel::IsProfitableToFold`).

I hope this makes sense.

https://reviews.llvm.org/D26790