[PATCH, PowerPC] Fix and improve vector comparisons

Bill Schmidt wschmidt at linux.vnet.ibm.com
Fri Aug 1 09:19:00 PDT 2014


Hi,

I reviewed the patch prior to submission, and it LGTM.  Hal, if you
agree, do you feel this is ok for 3.5 backport as well?

Thanks,
Bill

On Fri, 2014-08-01 at 18:06 +0200, Ulrich Weigand wrote:
> 
> Hello,
> 
> I've run into a code-gen bug related to vector comparisons in the JIT code
> generated by mesa/llvmpipe on PowerPC.   The problem happens when expanding
> a "fcmp uge <4 x float>" instruction (which you don't usually get from C/C+
> + source code).  This is currently passed to common code to expand, which
> tests for unordered operands and then falls back to creating a "fcmp ge <4
> x float>".  This is processed by the back-end, but incorrectly.
> 
> In getVCmpInst, we run into this case:
> 
>     case ISD::SETGE:
>       if (VecVT == MVT::v16i8)
>         return PPC::VCMPGTSB;
>       else if (VecVT == MVT::v8i16)
>         return PPC::VCMPGTSH;
>       else if (VecVT == MVT::v4i32)
>         return PPC::VCMPGTSW;
>       else if (VecVT == MVT::v4f32)
>         return HasVSX ? PPC::XVCMPGTSP : PPC::VCMPGTFP;
> 
> so we return VCMPGTFP.  But then in SelectSETCC, we have:
> 
>       case ISD::SETGE:
>       case ISD::SETOGE:
>       case ISD::SETUGE: {
>         // Small optimization: Altivec provides a 'Vector Compare Greater
> Than
>         // or Equal To' instruction (vcmpgefp), so in this case there is no
>         // need for extra logic for the equal compare.
>         if (VecVT.getSimpleVT().isFloatingPoint()) {
>           return CurDAG->SelectNodeTo(N, VCmpInst, VecVT, LHS, RHS);
> 
> So even though the comment says we want to emit a Compare Greater Than or
> Equal To, we simply emit a Compare Greate Than.
> 
> Note that for the SETOGE case, getVCmpInst does indeed return PPC::VCMPGTFP
> -- but not for SETGE.
> 
> Also, falling back to common code to expand SETUGE is inefficient anyway:
> we could simply implement SETUGE by negating the result of a SETOLT, which
> in turn is simply a SETOGT with swapped inputs.  This results in just two
> instructions instead of 6 or 7 the common code expander produces.  (Similar
> optimizations apply to SETULE, SETUGT, SETULT of course.)
> 
> When looking at this code, I noticed a number of other cases where code
> quality could be improved as well: SETOLE can be implemented as SETOGE with
> swapped inputs, and for all the integer comparisons, we can do SET(U)GE by
> negating the result of a SET(U)LT instead of performing two separate
> comparisons for EQ and GT.
> 
> The current structure of SelectCC / getVCmpInst makes it somewhat hard to
> fix this bug and implement those enhancements, however.  The two issues
> that make the code more complex than it ought to be in my opinion are:
> getVCmpInst tries to handle floating-point and integer comparisons using
> the same logic, even though they are quite different; and, both getVCmpInst
> and its caller make some decisions independently, while implicitly relying
> on the other side to make the same decision.
> 
> The attached patch fixes all this by moving all decision logic to
> getVCmpInst, which gets two extra boolean outputs indicating to its caller
> whether it needs to swap the input operands and/or negate the result of the
> comparion.  In no case do we need to create two separate comparisons any
> more.
> 
> (See attached file: diff-llvm-vcmp)
> 
> Does this look OK?
> 
> 
> Mit freundlichen Gruessen / Best Regards
> 
> Ulrich Weigand
> 
> --
>   Dr. Ulrich Weigand | Phone: +49-7031/16-3727
>   STSM, GNU/Linux compilers and toolchain
>   IBM Deutschland Research & Development GmbH
>   Vorsitzende des Aufsichtsrats: Martina Koederitz | Geschäftsführung: Dirk
> Wittkopp
>   Sitz der Gesellschaft: Böblingen | Registergericht: Amtsgericht
> Stuttgart, HRB 243294
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits





More information about the llvm-commits mailing list