[llvm] r174660 - Constrain PowerPC autovectorization to fix bug 15041.

Thu Feb 7 13:23:51 PST 2013

----- Original Message -----
> From: "Bill Schmidt" <wschmidt at linux.vnet.ibm.com>
> To: llvm-commits at cs.uiuc.edu
> Sent: Thursday, February 7, 2013 2:33:57 PM
> Subject: [llvm] r174660 - Constrain PowerPC autovectorization to fix bug 15041.
> 
> Author: wschmidt
> Date: Thu Feb  7 14:33:57 2013
> New Revision: 174660
> 
> URL: http://llvm.org/viewvc/llvm-project?rev=174660&view=rev
> Log:
> Constrain PowerPC autovectorization to fix bug 15041.
> 
> Certain vector operations don't vectorize well with the current
> PowerPC implementation.  Element insert/extract performs poorly
> without VSX support because Altivec requires going through memory.
> SREM, UREM, and VSELECT all produce bad scalar code.
> 
> There's a lot of work to do for the cost model before
> autovectorization will be tuned well, and this is not an attempt to
> address the larger problem.
> 
> Modified:
>     llvm/trunk/lib/Target/PowerPC/PPCTargetTransformInfo.cpp
> 
> Modified: llvm/trunk/lib/Target/PowerPC/PPCTargetTransformInfo.cpp
> URL:
> http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/PowerPC/PPCTargetTransformInfo.cpp?rev=174660&r1=174659&r2=174660&view=diff
> ==============================================================================
> --- llvm/trunk/lib/Target/PowerPC/PPCTargetTransformInfo.cpp (original)
> +++ llvm/trunk/lib/Target/PowerPC/PPCTargetTransformInfo.cpp Thu Feb  7 14:33:57 2013
> @@ -194,6 +194,25 @@ unsigned PPCTTI::getVectorInstrCost(unsi
>                                      unsigned Index) const {
>    assert(Val->isVectorTy() && "This must be a vector type");
>  
> +  const unsigned Awful = 1000;

Does it need to be this high? I would think that:

For extract: Cost == 1 (vector store) + 1 (scalar load)
For insert: Cost == 1 (vector store) + 1 (scalar store) + 1 (vector load)
For srem/urem: Cost == 1 (vector store) + N (scalar loads) + N*O (operation costs) + N (scalar stores) + 1 (vector load)
For vselect: Cost == 1 (vector store) + N (scalar loads) + N*O (selects) + N (scalar stores) + 1 (vector load)

would be pretty accurate. Is that not enough? Do we need additional costs on the loads to account for splitting the operations among different dispatch groups?
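
To make the arithmetic concrete, here is a rough sketch of the formulas above as standalone C++. This is only an illustration and not code from the patch or from LLVM: the UnitCosts struct, the kUnit constant, and the three helper functions are hypothetical names, every memory operation is assumed to cost 1, N is taken to be the number of vector elements, and O the cost of one scalar operation.

```cpp
// Hypothetical per-operation costs. None of these names exist in LLVM;
// they only illustrate the formulas in this message.
struct UnitCosts {
  unsigned VecLoad;     // one whole-vector load
  unsigned VecStore;    // one whole-vector store
  unsigned ScalarLoad;  // one scalar load
  unsigned ScalarStore; // one scalar store
};

// Assumption: every memory operation costs 1, as in the formulas above.
constexpr UnitCosts kUnit = {1, 1, 1, 1};

// extract: store the vector to memory, then load the one element back.
constexpr unsigned extractCost(UnitCosts C) {
  return C.VecStore + C.ScalarLoad;
}

// insert: store the vector, overwrite one element, reload the vector.
constexpr unsigned insertCost(UnitCosts C) {
  return C.VecStore + C.ScalarStore + C.VecLoad;
}

// srem/urem/vselect when scalarized through memory:
// 1 vector store + N scalar loads + N*O operations
// + N scalar stores + 1 vector load.
constexpr unsigned scalarizedCost(UnitCosts C, unsigned N, unsigned O) {
  return C.VecStore + N * C.ScalarLoad + N * O + N * C.ScalarStore +
         C.VecLoad;
}
```

Under these unit costs, a 4-element vector with O == 1 comes to 14, far below 1000; whether that is high enough to steer the vectorizer is the open question.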

Thanks again,
Hal

> +
> +  // Vector element insert/extract with Altivec is very expensive.
> +  // Until VSX is available, avoid vectorizing loops that require
> +  // these operations.
> +  if (Opcode == ISD::EXTRACT_VECTOR_ELT ||
> +      Opcode == ISD::INSERT_VECTOR_ELT)
> +    return Awful;
> +
> +  // We don't vectorize SREM/UREM so well.  Constrain the vectorizer
> +  // for those as well.
> +  if (Opcode == ISD::SREM || Opcode == ISD::UREM)
> +    return Awful;
> +
> +  // VSELECT is not yet implemented, leading to use of insert/extract
> +  // and ISEL, hence not a good idea.
> +  if (Opcode == ISD::VSELECT)
> +    return Awful;
> +
>    return TargetTransformInfo::getVectorInstrCost(Opcode, Val, Index);
>  }
>  