[cfe-dev] Patch to add __builtin_shufflevector

Thu Feb 28 21:18:36 PST 2008

On Thu, Feb 28, 2008 at 8:30 PM, Chris Lattner <clattner at apple.com> wrote:
>  This is very nice!  I have some minor stuff below, mostly little
>  details.
>
>  +  ShuffleVectorExpr(Expr **args, unsigned nexpr,
>  +                    QualType Type, SourceLocation BLoc,
>  +                    SourceLocation RP) :
>  +    Expr(ShuffleVectorExprClass, Type), BuiltinLoc(BLoc),
>  +    RParenLoc(RP), SubExprs(args), NumExprs(nexpr) {}
>
>  This needs a comment that says that "args" must be new[]'d, and that
>  ownership is transfered to the copy ctor.  Better yet, it would be
>  good if the ctor actually did the new[] + memcpy itself, instead of
>  having this behavior.

Okay.

>  +  ~ShuffleVectorExpr() {
>  +    delete [] SubExprs;
>  +  }
>
>  This should delete the children ASTs as well as the array.

Okay.

>  +  /// getExpr - Return the Expr at the specified index.
>  +  Expr *getExpr(unsigned Index) {
>  +    assert((Index < NumExprs) && "Arg access out of range!");
>  +    return SubExprs[Index];
>  +  }
>
>  This should probably also come in const form.  It would be nice to
>  have an "int getShuffleMaskIdx(unsigned N)" method that converts the
>  shuffle mask AST to a constant int and returns it, as a helper function.

Fine.  Although it's probably worth asking, how wide are vectors allowed to be?

>
>  --- Sema/SemaChecking.cpp       (revision 47702)
>  +++ Sema/SemaChecking.cpp       (working copy)
>  @@ -40,23 +40,36 @@
>  +    if (!CheckBuiltinCFStringArgument(TheCall->getArg(0))) {
>  +      delete TheCall;
>  +      return true;
>  +    }
>
>  SemaChecking should never delete the call on an error because
>  ActOnCallExpr has an OwningPtr that will delete it if sema is not
>  successful.  I think just removing the delete is ok from the error
>  paths.  Actually, with this code in SemaExpr:
>
>         if (FunctionDecl *FDecl = dyn_cast<FunctionDecl>(DRExpr-
>   >getDecl()))
>  -        if (CheckFunctionCall(FDecl, TheCall.get()))
>  -          return true;
>  +        return CheckFunctionCall(FDecl, TheCall.take());
>
>  the call will always be deleted.  I think this code should be:
>
>         if (FunctionDecl *FDecl = dyn_cast<FunctionDecl>(DRExpr-
>   >getDecl())) {
>  +        ExprResult R = CheckFunctionCall(FDecl, TheCall.take());
>           if (R.error) return;
>           TheCall = R.Val;
>         }

Hmm? I'm not quite following this bit... the current model is that if
we end up in CheckFunctionCall, it gets ownership of the call pointer
(because we call take), and then it either returns it or deletes it.
What exactly is the model you're suggesting?

>  +/// SemaBuiltinUnorderedCompare - Handle functions like
>  __builtin_isgreater and
>  +/// friends.  This is declared to take (...), so we have to check
>  everything.
>  +Action::ExprResult Sema::SemaBuiltinShuffleVector(CallExpr *TheCall) {
>
>  plz update the comment.

Okay.

>  +  if (!Context.typesAreCompatible(
>  +          TheCall->getArg(0)->getType().getUnqualifiedType(),
>  +          TheCall->getArg(1)->getType().getUnqualifiedType())) {
>  +    return Diag(TheCall->getLocEnd(),
>  diag::err_typecheck_call_too_few_args);
>  +  }
>
>  I think the (canonical) types should be required to be exactly
>  identical, not just compatible, what do you think?

The two aren't different in any substantial way, as far as I can tell, but okay.

>  +    if (!TheCall->getArg(i)->isIntegerConstantExpr(Result, Context))
>  +      return Diag(TheCall->getLocEnd(),
>  diag::err_typecheck_call_too_few_args);
>  +    if (Result >= llvm::APSInt(llvm::APInt(32, numElements*2), false))
>  +      return Diag(TheCall->getLocEnd(),
>  diag::err_typecheck_call_too_few_args);
>
>  These diags are good, but should be more descriptive.  Do you want to
>  support -1 as undef?

Ah, forgot the -1 == undef bit.  Okay.

>  I think that "Result >= llvm::APSInt(llvm::APInt(32, numElements*2)"
>  will segfault if the index is something like 1LL.  Try something like:

Okay.

>    Result >= llvm::APSInt(llvm::APInt(Result.getBitWidth(),
>  numElements*2))
>
>  oh except you also have to make the APSInt have the same sign as
>  Result (try 1U as an index), and if it's signed, you want to reject
>  things like -2.

Ah, right.

>  +  Expr** exprs = new Expr*[TheCall->getNumArgs()];
>  +
>  +  for (unsigned i = 0; i < TheCall->getNumArgs(); i++) {
>  +    exprs[i] = TheCall->getArg(i);
>  +    TheCall->setArg(i, 0);
>  +  }
>  +
>  +  ShuffleVectorExpr* E = new ShuffleVectorExpr(exprs, numElements+2,
>  FAType,
>  +                                               TheCall->getCallee()-
>   >getLocStart(),
>  +                                               TheCall-
>   >getRParenLoc());
>
>  Instead of doing the new in Sema here, it would be nice if sema used a
>  SmallvEctor<32>, and then had the ctor do the new/memcpy just to keep
>  the ownership simple and clear.  I think the getLocStart() line is too
>  long.

Okay.

>  +  for (unsigned i = 2; i < E->getNumSubExprs(); i++) {
>  +    indices.push_back(cast<llvm::Constant>(CGF.EmitScalarExpr(E-
>   >getExpr(i))));
>  +  }
>
>  It would be nice to use a helper on ShuffleVectorExpr to get the index
>  as an unsigned (evaluating as an i-c-e), instead of relying on
>  EmitScalarExpr to do it.  Also, it would be nice to handle the undef
>  case.

Fine.

>  As a random thought, it would be relatively easy to make
>  __builtin_shufflevector also support: __builtin_shufflevector(Vec,
>  idx1, idx2, idx3, idx4) - a shuffle with a single vector (implicitly
>  making the second vector be undef).  I think this would be handy, but
>  is jsut syntactic sugar and may not be worth it.  What do you think?

I don't think it's worth bothering with... you can always just do
__builtin_shufflevector(Vec, Vec, ...), and adding that would
significantly complicate the typechecking.  If it turns out to have
significant uses, we can consider adding it.

>  Thanks for working on this!

No problem.

-Eli