[PATCH][x86] Teach how to fold target specific dags for packed shifts by immedate count into build_vector.

Fri Dec 27 10:50:28 PST 2013

The patch looks really good. Please commit. Thank you Andrea!

Sent from my iPhone

> On Dec 27, 2013, at 9:52, Andrea Di Biagio <andrea.dibiagio at gmail.com> wrote:
> 
> Hi Nadav,
> 
>> On Thu, Dec 26, 2013 at 6:57 AM, Nadav Rotem <nrotem at apple.com> wrote:
>> Overall the patch LGTM.  Do we have something like that for the non-target-specific code?  If not then we should move it there and figure out a way to handle the target-specific x86 nodes.
> 
> Thanks for the feedback!
> I created a new patch that fixes the original poor code-gen bug in the
> target independent part.
> I managed to add a new rule that teaches the DAGCombiner how to fold a
> sext_inreg of a build_vector of ConstantSDNodes (or UNDEFs) into a
> build_vector.
> 
> The differences with respect to my previous patch are:
> 1) this new patch does not handle the target specific packed vector
> shift nodes;
> 2) I had to touch test test/CodeGen/X86/blend-msb.ll (see below).
> 
> About point 1.
> For simplicity I plan to send a separate patch that adds target
> specific combine rules to fold packed vector shifts by immediate count
> (VSHLI/VSRLI/VSRAI) into a build_vector in the case where the vector
> to shift is a vector of constants (or undefs).
> 
> About point 2.
> With this fix, we now generate a slightly different (but still valid)
> mask for the 'blendvps' instruction expected by test
> test/CodeGen/X86/blend-msb.ll.
> 
> -;CHECK: movl $-2147483648
> +;CHECK: movl $-1
> 
> The test originally expected a Mask of $-2147483648 in input to the
> 'blendvps'; with my patch however however we produce $-1.
> The two masks are both ok; the difference is that before we were able
> to simplify the constant value for the mask knowing that 'blendvps'
> only cares about specific bits in the sequence.
> 
> The target specific VSELECT combine uses method SimplifyDemandedBits
> to 'simplify' the value of the mask in input to the blend.
> It looks like method 'TargetLowering::SimplifyDemandedBits' knows how
> to simplify the Constant Mask in input to a vector SELECT node when it
> is generated by a SIGN_EXTEND_INREG; it doesn't seem to know how to
> handle the case where the Mask in input is a BUILD_VECTOR.
> I decided for now to fix the test modifying the expected number in
> input to the movl.
> 
>> +// Return true if this is a BUILD_VECTOR of ConstantSDNodes.
>> +// This is a helper function used by `getTargetVShiftByConstNode`.
>> +static bool isBuildVectorOfConstantSDNodes(SDValue N) {
>> +  if (N->getOpcode() != ISD::BUILD_VECTOR)
>> +    return false;
>> +
>> +  for (unsigned i = 0, e = N->getNumOperands(); i != e; ++i)
>> +    if (!isa<ConstantSDNode>(N->getOperand(i)))
>> +      return false;
>> +  return true;
>> +}
>> +
>> 
>> You can also handle UNDEFs here.
> 
> I moved this function in the ISD namespace into
> lib/CodeGen/SelectionDAG/SelectionDAG.cpp and now it also accepts
> UNDEFs.
> 
> Please let me know what you think about this new patch and if it is ok
> to submit.
> 
> Thanks!
> Andrea
> <patchv2.diff>