[PATCH] #pragma vectorize

Mon May 12 07:07:23 PDT 2014

More comments in the review, as well as some replies to previous email.

On Fri, May 9, 2014 at 3:59 PM, Tyler Nowicki <tnowicki at apple.com> wrote:
> Hi Aaron, Richard, Ben,
>
> Thanks for all the feedback. I’ve tried to integrate all of your suggested.
> I may have missed some, please let me know if I did.
>
> Please review the attached patch.
>
> Tyler
>
>
>
>
> +  // Add vectorize hints to the metadata on the conditional branch.
> +  // Iterate in reverse so hints are put in the same order they appear.
> +  SmallVector<llvm::Value*, 2> Metadata(1);
> +  ArrayRef<const Attr*>::reverse_iterator I;
> +  for (I = Attrs.rbegin(); I != Attrs.rend(); ++I) {
>
>
> Check against rend() should be hoisted (or use a range-based loop with
> an adaptor).
>
>
> Sorry, I don’t understand what you are suggesting. Do we not compare against
> rend() in the loop condition?

We do, but it is more idiomatic in the codebase to hoist the call so
it's not made on every loop iteration. Eg)

for (auto I = Attrs.rbegin(), E = Attrs.rend(); I != E; ++I)

> I’m pretty sure a spelling is required unless you are providing the base
> class outside the td file. I’m looking at modifying the way pretty printing
> works now to generate a printer that prints the pragma. But it seems bad to
> specialize the code that generates the pretty printer for printing loop
> pragmas
>
>
> Why?
>
>
> That part of tablegen probably shouldn’t be specialized for each type of
> pragma should it? But from your comment below it sounds like thats what you
> are thinking.

I'd have to think about it more, but it seems like tablegen shouldn't
have to specialize for each pragma, just all pragmas. Eg) the
difference between printing pragmas and printing attributes is minor
enough that it could be handled entirely by tablegen without the
pragma authors having to write special code.

> Perhaps we should provide a separate method for printing pragmas rather than
> using printPretty. Right now printPretty returns the attribute's text and
> StmtPrinter::VisitAttributeStmt puts that between ‘[[‘ and ‘]]’. We need
> some way of putting the pragmas prior to the attribute. We could first loop
> over all attributes calling printPragma, and then again using printPretty to
> generate the attributes. Where printPragma would generate directives like
> #pragma loop ...

Yes, we would have to separate things out a bit more.

>
> and so it will still be necessary to make some kind of a call into
> the LoopHintAttr to print the pragma.
>
>
> So what I was envisioning was that there would be a Pragma spelling
> added to Attr.td, which ClangAttrEmitter.cpp could then key off of for
> everything which is spelling-specific (you can search for CXX11 or
> Declspec in the file and that should show you the places that may need
> modification). However, that's orthogonal to your functionality, and
> not really something you should have to implement yourself just to get
> this patch in. I just don't have the time to tackle it myself right
> now.
>
>
> Sounds good. I started looking at this I have a basic patch for Pragma
> spelling. After #pragma loop I’ll look at that again.
>
>
> So I guess what I am saying is: once there's a new patch that
> addresses most of the concerns (aside from stuff like modifying the
> way the attr tablegen works), we can see if there's anything else
> blocking your patch. Don't worry about modifying tablegen for this,
> that's more a FIXME for the future.

> Index: include/clang/AST/Attr.h
> ===================================================================
> --- include/clang/AST/Attr.h (revision 208442)
> +++ include/clang/AST/Attr.h (working copy)
> @@ -16,6 +16,7 @@
>
>  #include "clang/AST/AttrIterator.h"
>  #include "clang/AST/Decl.h"
> +#include "clang/AST/Expr.h"
>  #include "clang/AST/Type.h"
>  #include "clang/Basic/AttrKinds.h"
>  #include "clang/Basic/LLVM.h"
> Index: include/clang/Basic/Attr.td
> ===================================================================
> --- include/clang/Basic/Attr.td (revision 208442)
> +++ include/clang/Basic/Attr.td (working copy)
> @@ -1750,3 +1750,65 @@
>  def Unaligned : IgnoredAttr {
>    let Spellings = [Keyword<"__unaligned">];
>  }
> +
> +def LoopHint : Attr {
> +  // LoopHint Vectorize:
> +  //  enable - use vector instructions.
> +  //  disable - do not use vector instructions.
> +  //  positive value - use vector instructions of the specified width.
> +
> +  // LoopHint Interleave:
> +  //  enable - interleave multiple loop iterations.
> +  //  disable - do not interleave multiple loop interactions.
> +  //  positive value - interleave the specified number of loop interations.
> +
> +  // FIXME: Add Pragma spelling to tablegen and
> +  // use it here.
> +  let Spellings = [Keyword<"loop">];
> +
> +  // State of the loop optimization specified by the spelling.
> +  let Args = [EnumArgument<"Option", "OptionType",
> +                          ["vectorize", "interleave"],
> +                          ["Vectorize", "Interleave"]>,
> +              EnumArgument<"Kind","KindType",

Missing a space after the comma.

> +                          ["disable", "enable", "value"],
> +                          ["Disable", "Enable", "Value"]>,

This is actually an optional argument as well, but is not marked as
such. It should get a , 1. Also, this suggests we need a new argument
type that represents a union of arguments, since that's really what
you want (one of these two arguments must be used, but you don't care
which). A FIXME would probably be appropriate (though you don't have
to implement the functionality for this patch).

> +              ExprArgument<"Value", 1>];

Judging by the tests, this should be a DefaultIntArgument<"Value", 1>.
Either that, or there are tests missing where expressions are used
(and honestly, it would strike me as slightly strange to allow general
expressions here).

> +
> +  let AdditionalMembers = [{
> +  // Kinds are compatible if they are not exclusive.
> +  static bool isCompatible(int Kind1, int Kind2) {
> +    return (Kind1==Disable) == (Kind2==Disable);

Missing spaces around the inner ==.

> +  }
> +
> +  static StringRef getOptionName(int Option) {
> +    switch (Option) {
> +    case Vectorize: return "vectorize";
> +    case Interleave: return "interleave";
> +    }
> +    return "unknown";

Seems like this should be an llvm_unreachable instead of returning unknown.

> +  }
> +
> +  static StringRef getKindName(int Kind) {
> +    switch (Kind) {
> +    case Disable: return "disable";
> +    case Enable: return "enable";
> +    case Value: return "value";
> +    }
> +    return "unknown";

Same here.

> +  }
> +
> +  // FIXME: Modify pretty printer to print this pragma
> +  void print(raw_ostream &OS, const PrintingPolicy &Policy) const {
> +    OS << getOptionName(option) << "(";
> +    if (getKind() == Value) {
> +      getValue()->printPretty(OS, 0, Policy);
> +    } else {
> +      OS << getKindName(getKind());
> +    }

No curly braces on single-line if/else statements.

> +    OS << ")\n";
> +  }
> +  }];
> +
> +  let Documentation = [Undocumented];
> +}
> Index: include/clang/Basic/DiagnosticParseKinds.td
> ===================================================================
> --- include/clang/Basic/DiagnosticParseKinds.td (revision 208442)
> +++ include/clang/Basic/DiagnosticParseKinds.td (working copy)
> @@ -884,6 +884,12 @@
>    "unexpected OpenMP clause '%0' in directive '#pragma omp %1'">;
>  def err_omp_more_one_clause : Error<
>    "directive '#pragma omp %0' cannot contain more than one '%1' clause">;
> +
> +// Pragma loop support.
> +def err_pragma_loop_invalid_option : Error<
> +  "%select{invalid|missing}0 option%select{ '%1'|}0 in directive '#pragma loop'; expected either vectorize or interleave">;

80-col limit.

> +def err_pragma_loop_invalid_type : Error<
> +  "invalid value '%0' in directive '#pragma loop %1'; expected either 'enable', 'disable', or a positive integer">;

Same here.

>  } // end of Parse Issue category.
>
>  let CategoryName = "Modules Issue" in {
> Index: include/clang/Basic/DiagnosticSemaKinds.td
> ===================================================================
> --- include/clang/Basic/DiagnosticSemaKinds.td (revision 208442)
> +++ include/clang/Basic/DiagnosticSemaKinds.td (working copy)
> @@ -537,6 +537,12 @@
>    "#pragma visibility pop with no matching #pragma visibility push">;
>  def note_surrounding_namespace_starts_here : Note<
>    "surrounding namespace with visibility attribute starts here">;
> +def err_pragma_loop_invalid_value : Error<
> +  "expected a positive integer in directive '#pragma loop %0'">;
> +def err_pragma_loop_incompatible : Error<
> +  "'%0' and '%1' directive option types are incompatible in '#pragma loop %2'">;
> +def err_pragma_loop_precedes_nonloop : Error<
> +  "expected a for, while, or do-while loop to follow the '#pragma loop' directive">;

80-col limit.

>
>  /// Objective-C parser diagnostics
>  def err_duplicate_class_def : Error<
> Index: include/clang/Basic/TokenKinds.def
> ===================================================================
> --- include/clang/Basic/TokenKinds.def (revision 208442)
> +++ include/clang/Basic/TokenKinds.def (working copy)
> @@ -701,6 +701,11 @@
>  ANNOTATION(pragma_openmp)
>  ANNOTATION(pragma_openmp_end)
>
> +// Annotations for loop pragma directives #pragma loop ...
> +// The lexer produces these so that they only take effect when the parser
> +// handles #pragma loop ... directives.
> +ANNOTATION(pragma_loop_hint)
> +
>  // Annotations for module import translated from #include etc.
>  ANNOTATION(module_include)
>  ANNOTATION(module_begin)
> Index: include/clang/Parse/Parser.h
> ===================================================================
> --- include/clang/Parse/Parser.h (revision 208442)
> +++ include/clang/Parse/Parser.h (working copy)
> @@ -20,6 +20,7 @@
>  #include "clang/Lex/CodeCompletionHandler.h"
>  #include "clang/Lex/Preprocessor.h"
>  #include "clang/Sema/DeclSpec.h"
> +#include "clang/Sema/LoopHint.h"
>  #include "clang/Sema/Sema.h"
>  #include "llvm/ADT/SmallVector.h"
>  #include "llvm/Support/Compiler.h"
> @@ -160,6 +161,7 @@
>    std::unique_ptr<PragmaHandler> MSConstSeg;
>    std::unique_ptr<PragmaHandler> MSCodeSeg;
>    std::unique_ptr<PragmaHandler> MSSection;
> +  std::unique_ptr<PragmaHandler> LoopHintHandler;
>
>    std::unique_ptr<CommentHandler> CommentSemaHandler;
>
> @@ -518,6 +520,10 @@
>    /// #pragma clang __debug captured
>    StmtResult HandlePragmaCaptured();
>
> +  /// \brief Handle the annotation token produced for
> +  /// #pragma vectorize...
> +  LoopHint HandlePragmaLoopHint();
> +
>    /// GetLookAheadToken - This peeks ahead N tokens and returns that token
>    /// without consuming any tokens.  LookAhead(0) returns 'Tok', LookAhead(1)
>    /// returns the token after Tok, etc.
> @@ -1600,6 +1606,10 @@
>    StmtResult ParseReturnStatement();
>    StmtResult ParseAsmStatement(bool &msAsm);
>    StmtResult ParseMicrosoftAsmStatement(SourceLocation AsmLoc);
> +  StmtResult ParsePragmaLoopHint(StmtVector &Stmts,
> +                                 bool OnlyStatement,
> +                                 SourceLocation *TrailingElseLoc,
> +                                 ParsedAttributesWithRange &Attrs);
>
>    /// \brief Describes the behavior that should be taken for an __if_exists
>    /// block.
> Index: include/clang/Sema/LoopHint.h
> ===================================================================
> --- include/clang/Sema/LoopHint.h (revision 0)
> +++ include/clang/Sema/LoopHint.h (working copy)
> @@ -0,0 +1,32 @@
> +//===--- LoopHint.h - Types for LoopHint ------------------------*- C++ -*-===//
> +//
> +//                     The LLVM Compiler Infrastructure
> +//
> +// This file is distributed under the University of Illinois Open Source
> +// License. See LICENSE.TXT for details.
> +//
> +//===----------------------------------------------------------------------===//
> +
> +
> +#ifndef LLVM_CLANG_SEMA_LOOPHINT_H
> +#define LLVM_CLANG_SEMA_LOOPHINT_H
> +
> +#include "clang/Basic/IdentifierTable.h"
> +#include "clang/Basic/SourceLocation.h"
> +#include "clang/Sema/AttributeList.h"
> +#include "clang/Sema/Ownership.h"
> +
> +namespace clang {
> +
> +/// \brief Loop hint specified by a pragma loop directive.
> +struct LoopHint {
> +  SourceRange Range;
> +  Expr* ValueExpr;

Asterisk on the right, instead of the left.

> +  IdentifierLoc *LoopLoc;
> +  IdentifierLoc *ValueLoc;
> +  IdentifierLoc *OptionLoc;
> +};
> +
> +} // end namespace clang
> +
> +#endif // LLVM_CLANG_SEMA_LOOPHINT_H
> Index: lib/AST/StmtPrinter.cpp
> ===================================================================
> --- lib/AST/StmtPrinter.cpp (revision 208442)
> +++ lib/AST/StmtPrinter.cpp (working copy)
> @@ -168,19 +168,34 @@
>  }
>
>  void StmtPrinter::VisitAttributedStmt(AttributedStmt *Node) {
> -  OS << "[[";
>    bool first = true;
> -  for (ArrayRef<const Attr*>::iterator it = Node->getAttrs().begin(),
> -                                       end = Node->getAttrs().end();
> -                                       it != end; ++it) {
> -    if (!first) {
> -      OS << ", ";
> -      first = false;
> +  std::string raw_attr_os;
> +  llvm::raw_string_ostream AttrOS(raw_attr_os);
> +  for (ArrayRef<const Attr*>::reverse_iterator it = Node->getAttrs().rbegin(),
> +                                               end = Node->getAttrs().rend();
> +                                               it != end; ++it) {

Good time to update this to use "I" instead of "it" (similar for
"end"). Also, (I apologize if this was covered elsewhere) why did this
code switch from using begin/end to rbegin/rend?

> +    // FIXME: This hack will be removed when printPretty
> +    // has been modified to print pretty pragmas

Missing a period.

> +    if (dyn_cast<LoopHintAttr>(*it)) {
> +      const LoopHintAttr *LHA = cast<LoopHintAttr>(*it);
> +      OS << "#pragma loop ";
> +      LHA->print(OS, Policy);
> +    } else {
> +      if (!first) {
> +        AttrOS << ", ";
> +        first = false;
> +      }
> +      // TODO: check this
> +      (*it)->printPretty(AttrOS, Policy);
>      }
> -    // TODO: check this
> -    (*it)->printPretty(OS, Policy);
>    }
> -  OS << "]] ";
> +
> +  // Check to see if any attributes were printed
> +  StringRef AttrStr = AttrOS.str();
> +  if (!AttrStr.empty()) {
> +    OS << "[[" << AttrStr << "]] ";
> +  }

Should elide the curly braces.

> +
>    PrintStmt(Node->getSubStmt(), 0);
>  }
>
> Index: lib/CodeGen/CGStmt.cpp
> ===================================================================
> --- lib/CodeGen/CGStmt.cpp (revision 208442)
> +++ lib/CodeGen/CGStmt.cpp (working copy)
> @@ -18,6 +18,7 @@
>  #include "clang/AST/StmtVisitor.h"
>  #include "clang/Basic/PrettyStackTrace.h"
>  #include "clang/Basic/TargetInfo.h"
> +#include "clang/Sema/LoopHint.h"
>  #include "clang/Sema/SemaDiagnostic.h"
>  #include "llvm/ADT/StringExtras.h"
>  #include "llvm/IR/CallSite.h"
> @@ -396,7 +397,23 @@
>  }
>
>  void CodeGenFunction::EmitAttributedStmt(const AttributedStmt &S) {
> -  EmitStmt(S.getSubStmt());
> +  const Stmt *SubStmt = S.getSubStmt();
> +  switch (SubStmt->getStmtClass()) {
> +  case Stmt::DoStmtClass:
> +    EmitDoStmt(cast<DoStmt>(*SubStmt), S.getAttrs());
> +    break;
> +  case Stmt::ForStmtClass:
> +    EmitForStmt(cast<ForStmt>(*SubStmt), S.getAttrs());
> +    break;
> +  case Stmt::WhileStmtClass:
> +    EmitWhileStmt(cast<WhileStmt>(*SubStmt), S.getAttrs());
> +    break;
> +  case Stmt::CXXForRangeStmtClass:
> +    EmitCXXForRangeStmt(cast<CXXForRangeStmt>(*SubStmt), S.getAttrs());
> +    break;
> +  default:
> +    EmitStmt(SubStmt);
> +  }
>  }
>
>  void CodeGenFunction::EmitGotoStmt(const GotoStmt &S) {
> @@ -502,7 +519,86 @@
>    EmitBlock(ContBlock, true);
>  }
>
> -void CodeGenFunction::EmitWhileStmt(const WhileStmt &S) {
> +void CodeGenFunction::EmitCondBrHints(llvm::LLVMContext &Context,
> +                                      llvm::BranchInst *CondBr,
> +                                      ArrayRef<const Attr*> &Attrs) {
> +  // Do not continue if there are not hints.
> +  if (Attrs.empty())
> +    return;
> +
> +  // Add vectorize hints to the metadata on the conditional branch.
> +  // Iterate in reverse so hints are put in the same order they appear.
> +  SmallVector<llvm::Value*, 2> Metadata(1);
> +  ArrayRef<const Attr*>::reverse_iterator I;

No need for this to live outside of the for loop. Should be scoped to
the loop, and you can use auto if length is worrisome.

> +  for (I = Attrs.rbegin(); I != Attrs.rend(); ++I) {
> +    const LoopHintAttr *LH = dyn_cast<LoopHintAttr>(*I);
> +
> +    // Skip non loop hint attributes
> +    if (!LH)
> +      continue;
> +
> +    LoopHintAttr::OptionType Option = LH->getOption();
> +    Expr *ValueExpr = LH->getValue();
> +    int Kind = LH->getKind();
> +
> +    int ValueInt = 1;
> +    if (Kind == LoopHintAttr::Value) {
> +      llvm::APSInt ValueAPS =
> +        ValueExpr->EvaluateKnownConstInt(CGM.getContext());
> +      ValueInt = ValueAPS.getSExtValue();
> +    }
> +
> +    llvm::Value *Value;
> +    llvm::MDString *Name;
> +    if (Option == LoopHintAttr::Vectorize) {
> +      switch(Kind) {
> +      case LoopHintAttr::Disable:
> +        Name = llvm::MDString::get(Context, "llvm.vectorizer.width");
> +        Value = llvm::ConstantInt::get(Int32Ty, 1);
> +        break;
> +      case LoopHintAttr::Enable:
> +        Name = llvm::MDString::get(Context, "llvm.vectorizer.enable");
> +        Value = Builder.getTrue();
> +        break;
> +      case LoopHintAttr::Value:
> +        Name = llvm::MDString::get(Context, "llvm.vectorizer.width");
> +        Value = llvm::ConstantInt::get(Int32Ty, ValueInt);
> +        break;
> +      }
> +    } else if (Option == LoopHintAttr::Interleave) {
> +      switch(Kind) {
> +      case LoopHintAttr::Disable:
> +        Name = llvm::MDString::get(Context, "llvm.vectorizer.unroll");
> +        Value = llvm::ConstantInt::get(Int32Ty, 1);
> +        break;
> +      case LoopHintAttr::Enable:
> +        Name = llvm::MDString::get(Context, "llvm.vectorizer.enable");
> +        Value = Builder.getTrue();
> +        break;

This is identical to the vectorize case; is that expected?

> +      case LoopHintAttr::Value:
> +        Name = llvm::MDString::get(Context, "llvm.vectorizer.unroll");
> +        Value = llvm::ConstantInt::get(Int32Ty, ValueInt);
> +        break;
> +      }
> +    }

It seems as though the values are identical, regardless of the option,
and the names are a simple mapping. Perhaps this could be more clearly
expressed as:

const char *Names[] = { "llvm.vectorizer.width", "llvm.vectorizer.unroll" };
llvm::Value *Value;
llvm::MDString *Name;

if (Kind == LoopHintAttr::Enable) {
  Name = llvm::MDString::get(Context, "llvm.vectorizer.enable");
  Value = Builder.getTrue();
} else {
  Name = llvm::MDString::get(Context, Names[Option]);
  Value = llvm::ConstantInt::get(Int32Ty, ValueInt); // You already
set ValueInt to 1 by default, and overwrite when the Kind is a Value.
}

> +
> +    SmallVector<llvm::Value*, 2> OpValues;
> +    OpValues.push_back(Name);
> +    OpValues.push_back(Value);
> +
> +    // Set or overwrite metadata indicated by Name.
> +    Metadata.push_back(llvm::MDNode::get(Context, OpValues));
> +  }
> +
> +  // Add llvm.loop MDNode to CondBr.
> +  llvm::MDNode *LoopID = llvm::MDNode::get(Context, Metadata);
> +  LoopID->replaceOperandWith(0, LoopID); // First op points to itself.
> +
> +  CondBr->setMetadata("llvm.loop", LoopID);
> +}
> +
> +void CodeGenFunction::EmitWhileStmt(const WhileStmt &S,
> +                                    ArrayRef<const Attr*> WhileAttrs) {
>    RegionCounter Cnt = getPGORegionCounter(&S);
>
>    // Emit the header for the loop, which will also become
> @@ -547,13 +643,17 @@
>      llvm::BasicBlock *ExitBlock = LoopExit.getBlock();
>      if (ConditionScope.requiresCleanups())
>        ExitBlock = createBasicBlock("while.exit");
> -    Builder.CreateCondBr(BoolCondVal, LoopBody, ExitBlock,
> -                         PGO.createLoopWeights(S.getCond(), Cnt));
> +    llvm::BranchInst *CondBr = Builder.CreateCondBr(
> +                               BoolCondVal, LoopBody, ExitBlock,
> +                               PGO.createLoopWeights(S.getCond(), Cnt));
>
>      if (ExitBlock != LoopExit.getBlock()) {
>        EmitBlock(ExitBlock);
>        EmitBranchThroughCleanup(LoopExit);
>      }
> +
> +    // Attach metadata to loop body conditional branch.
> +    EmitCondBrHints(LoopBody->getContext(), CondBr, WhileAttrs);
>    }
>
>    // Emit the loop body.  We have to emit this in a cleanup scope
> @@ -582,7 +682,8 @@
>      SimplifyForwardingBlocks(LoopHeader.getBlock());
>  }
>
> -void CodeGenFunction::EmitDoStmt(const DoStmt &S) {
> +void CodeGenFunction::EmitDoStmt(const DoStmt &S,
> +                                 ArrayRef<const Attr*> DoAttrs) {
>    JumpDest LoopExit = getJumpDestInCurrentScope("do.end");
>    JumpDest LoopCond = getJumpDestInCurrentScope("do.cond");
>
> @@ -619,10 +720,15 @@
>        EmitBoolCondBranch = false;
>
>    // As long as the condition is true, iterate the loop.
> -  if (EmitBoolCondBranch)
> -    Builder.CreateCondBr(BoolCondVal, LoopBody, LoopExit.getBlock(),
> -                         PGO.createLoopWeights(S.getCond(), Cnt));
> +  if (EmitBoolCondBranch) {
> +    llvm::BranchInst *CondBr = Builder.CreateCondBr(
> +                                 BoolCondVal, LoopBody, LoopExit.getBlock(),
> +                                 PGO.createLoopWeights(S.getCond(), Cnt));
>
> +    // Attach metadata to loop body conditional branch.
> +    EmitCondBrHints(LoopBody->getContext(), CondBr, DoAttrs);
> +  }
> +
>    // Emit the exit block.
>    EmitBlock(LoopExit.getBlock());
>
> @@ -632,7 +738,8 @@
>      SimplifyForwardingBlocks(LoopCond.getBlock());
>  }
>
> -void CodeGenFunction::EmitForStmt(const ForStmt &S) {
> +void CodeGenFunction::EmitForStmt(const ForStmt &S,
> +                                  ArrayRef<const Attr*> ForAttrs) {
>    JumpDest LoopExit = getJumpDestInCurrentScope("for.end");
>
>    RunCleanupsScope ForScope(*this);
> @@ -686,9 +793,13 @@
>      // C99 6.8.5p2/p4: The first substatement is executed if the expression
>      // compares unequal to 0.  The condition must be a scalar type.
>      llvm::Value *BoolCondVal = EvaluateExprAsBool(S.getCond());
> -    Builder.CreateCondBr(BoolCondVal, ForBody, ExitBlock,
> -                         PGO.createLoopWeights(S.getCond(), Cnt));
> +    llvm::BranchInst *CondBr = Builder.CreateCondBr(
> +                                 BoolCondVal, ForBody, ExitBlock,
> +                                 PGO.createLoopWeights(S.getCond(), Cnt));
>
> +    // Attach metadata to loop body conditional branch.
> +    EmitCondBrHints(ForBody->getContext(), CondBr, ForAttrs);
> +
>      if (ExitBlock != LoopExit.getBlock()) {
>        EmitBlock(ExitBlock);
>        EmitBranchThroughCleanup(LoopExit);
> @@ -728,7 +839,8 @@
>    EmitBlock(LoopExit.getBlock(), true);
>  }
>
> -void CodeGenFunction::EmitCXXForRangeStmt(const CXXForRangeStmt &S) {
> +void CodeGenFunction::EmitCXXForRangeStmt(const CXXForRangeStmt &S,
> +                                          ArrayRef<const Attr*> ForAttrs) {
>    JumpDest LoopExit = getJumpDestInCurrentScope("for.end");
>
>    RunCleanupsScope ForScope(*this);
> @@ -761,9 +873,13 @@
>    // The body is executed if the expression, contextually converted
>    // to bool, is true.
>    llvm::Value *BoolCondVal = EvaluateExprAsBool(S.getCond());
> -  Builder.CreateCondBr(BoolCondVal, ForBody, ExitBlock,
> -                       PGO.createLoopWeights(S.getCond(), Cnt));
> +  llvm::BranchInst *CondBr = Builder.CreateCondBr(BoolCondVal, ForBody,
> +                                      ExitBlock,
> +                                      PGO.createLoopWeights(S.getCond(), Cnt));
>
> +  // Attach metadata to loop body conditional branch
> +  EmitCondBrHints(ForBody->getContext(), CondBr, ForAttrs);
> +
>    if (ExitBlock != LoopExit.getBlock()) {
>      EmitBlock(ExitBlock);
>      EmitBranchThroughCleanup(LoopExit);
> Index: lib/CodeGen/CodeGenFunction.h
> ===================================================================
> --- lib/CodeGen/CodeGenFunction.h (revision 208442)
> +++ lib/CodeGen/CodeGenFunction.h (working copy)
> @@ -1846,9 +1846,16 @@
>    void EmitGotoStmt(const GotoStmt &S);
>    void EmitIndirectGotoStmt(const IndirectGotoStmt &S);
>    void EmitIfStmt(const IfStmt &S);
> -  void EmitWhileStmt(const WhileStmt &S);
> -  void EmitDoStmt(const DoStmt &S);
> -  void EmitForStmt(const ForStmt &S);
> +
> +  void EmitCondBrHints(llvm::LLVMContext &Context,
> +                       llvm::BranchInst *CondBr,
> +                       ArrayRef<const Attr*> &Attrs);
> +  void EmitWhileStmt(const WhileStmt &S,
> +                     ArrayRef<const Attr*> Attrs = ArrayRef<const Attr*>());
> +  void EmitDoStmt(const DoStmt &S,
> +                  ArrayRef<const Attr*> Attrs = ArrayRef<const Attr*>());
> +  void EmitForStmt(const ForStmt &S,
> +                   ArrayRef<const Attr*> Attrs = ArrayRef<const Attr*>());

In all cases, ArrayRef<const Attr*> is missing a space between the
identifier and the asterisk.

>    void EmitReturnStmt(const ReturnStmt &S);
>    void EmitDeclStmt(const DeclStmt &S);
>    void EmitBreakStmt(const BreakStmt &S);
> @@ -1872,7 +1879,8 @@
>
>    void EmitCXXTryStmt(const CXXTryStmt &S);
>    void EmitSEHTryStmt(const SEHTryStmt &S);
> -  void EmitCXXForRangeStmt(const CXXForRangeStmt &S);
> +  void EmitCXXForRangeStmt(const CXXForRangeStmt &S,
> +                       ArrayRef<const Attr*> Attrs = ArrayRef<const Attr*>());

Same here.

>
>    llvm::Function *EmitCapturedStmt(const CapturedStmt &S, CapturedRegionKind K);
>    llvm::Function *GenerateCapturedStmtFunction(const CapturedDecl *CD,
> Index: lib/Parse/ParsePragma.cpp
> ===================================================================
> --- lib/Parse/ParsePragma.cpp (revision 208442)
> +++ lib/Parse/ParsePragma.cpp (working copy)
> @@ -15,6 +15,7 @@
>  #include "clang/Lex/Preprocessor.h"
>  #include "clang/Parse/ParseDiagnostic.h"
>  #include "clang/Parse/Parser.h"
> +#include "clang/Sema/LoopHint.h"
>  #include "clang/Sema/Scope.h"
>  #include "llvm/ADT/StringSwitch.h"
>  using namespace clang;
> @@ -131,6 +132,12 @@
>                      Token &FirstToken) override;
>  };
>
> +struct PragmaLoopHintHandler : public PragmaHandler {
> +  PragmaLoopHintHandler() : PragmaHandler("loop") {}
> +  void HandlePragma(Preprocessor &PP, PragmaIntroducerKind Introducer,
> +                    Token &FirstToken) override;
> +};
> +
>  }  // end namespace
>
>  void Parser::initializePragmaHandlers() {
> @@ -195,6 +202,9 @@
>      MSSection.reset(new PragmaMSPragma("section"));
>      PP.AddPragmaHandler(MSSection.get());
>    }
> +
> +  LoopHintHandler.reset(new PragmaLoopHintHandler());
> +  PP.AddPragmaHandler(LoopHintHandler.get());
>  }
>
>  void Parser::resetPragmaHandlers() {
> @@ -249,6 +259,10 @@
>
>    PP.RemovePragmaHandler("STDC", FPContractHandler.get());
>    FPContractHandler.reset();
> +
> +  PP.RemovePragmaHandler(LoopHintHandler.get());
> +  LoopHintHandler.reset();
> +
>  }
>
>  /// \brief Handle the annotation token produced for #pragma unused(...)
> @@ -570,6 +584,40 @@
>        DiagnosticsEngine::Error, "'#pragma %0' not implemented.");
>  }
>
> +struct PragmaLoopHintInfo {
> +  Token Loop;
> +  Token Value;
> +  Token Option;
> +};
> +
> +LoopHint Parser::HandlePragmaLoopHint() {
> +  assert(Tok.is(tok::annot_pragma_loop_hint));
> +  PragmaLoopHintInfo *Info =
> +    static_cast<PragmaLoopHintInfo *>(Tok.getAnnotationValue());
> +
> +  LoopHint Hint;
> +  Hint.LoopLoc = IdentifierLoc::create(Actions.Context,
> +                                   Info->Loop.getLocation(),
> +                                   Info->Loop.getIdentifierInfo());
> +  Hint.OptionLoc = IdentifierLoc::create(Actions.Context,
> +                                   Info->Option.getLocation(),
> +                                   Info->Option.getIdentifierInfo());
> +  Hint.ValueLoc = IdentifierLoc::create(Actions.Context,
> +                                   Info->Value.getLocation(),
> +                                   Info->Value.getIdentifierInfo());

Identation on all of these.

> +  Hint.Range = SourceRange(Info->Option.getLocation(),
> +                           Info->Value.getLocation());
> +
> +  // FIXME: We should support template parameters for the loop hint value.
> +  // See bug report #19610
> +  if (Info->Value.is(tok::numeric_constant))
> +    Hint.ValueExpr = Actions.ActOnNumericConstant(Info->Value).get();
> +  else
> +    Hint.ValueExpr = nullptr;
> +
> +  return Hint;
> +}
> +
>  // #pragma GCC visibility comes in two variants:
>  //   'push' '(' [visibility] ')'
>  //   'pop'
> @@ -1531,3 +1579,112 @@
>
>    Actions.ActOnPragmaMSComment(Kind, ArgumentString);
>  }
> +
> +/// \brief Handle the \#pragma vectorize hint.
> +///  #pragma 'loop' loop-hints
> +///
> +///  loop-hints:
> +///    loop-hint loop-hints[opt]
> +///
> +///  loop-hint:
> +///    'vectorize' '(' loop-hint-value ')'
> +///    'interleave' '(' loop-hint-value ')'
> +///
> +///  loop-hint-value:
> +///    'enable'
> +///    'disable'
> +///    constant-expression
> +///
> +/// Specifying vectorize(enable) or vectorize(_value_) instructs llvm to
> +/// try vectorize the instructions of the loop it precedes. Specifying

"to try vectorize" doesn't sound right.

> +/// interleave(enable) or interleave(_value_) instructs llvm to interleaving
> +/// multiple iterations of the loop it precedes. The _value_ indicates the

"to interleaving multiple iterations" doesn't sound right either.

> +/// width of the vector instructions or the number of iterations of the loop
> +/// that should be interleaved. Consequently, a value of 1 or disable prevents
> +/// the optimization, even if it is possible and profitable, and 0 is invalid.
> +/// The loop vectorizer currently only works on inner loops.
> +///
> +void PragmaLoopHintHandler::HandlePragma(Preprocessor &PP,
> +                                         PragmaIntroducerKind Introducer,
> +                                         Token &Tok) {
> +  Token Loop = Tok;
> +  SmallVector <Token, 1> TokenList;
> +
> +  // Lex the optimization option and verify it is an identifier.
> +  PP.Lex(Tok);
> +  if (Tok.isNot(tok::identifier)) {
> +    bool InvalidOption = true;

Why is this variable needed?

> +    PP.Diag(Tok.getLocation(), diag::err_pragma_loop_invalid_option)
> +      << InvalidOption << "";
> +    return;
> +  }
> +
> +  while(Tok.is(tok::identifier)) {
> +    Token Option = Tok;
> +    StringRef OptionName = Tok.getIdentifierInfo()->getName();
> +
> +    if (OptionName != "vectorize" && OptionName != "interleave") {
> +      bool InvalidOption = false;

Same for this variable.

In both cases, the idiomatic approach is to use 0/1/true/false and an
inline comment.

> +      PP.Diag(Tok.getLocation(), diag::err_pragma_loop_invalid_option)
> +        << InvalidOption << OptionName;
> +      return;
> +    }
> +
> +    // Read '('
> +    PP.Lex(Tok);
> +    if (Tok.isNot(tok::l_paren)) {

Hmmm, maybe someday we should generalize BalancedDelimiterTracker so
it would work in the preprocessor as well. Oh well, that's a task for
another day.

> +      PP.Diag(Tok.getLocation(), diag::err_expected) << tok::l_paren;
> +      return;
> +    }
> +
> +    // Read the optimization value identifier.

Comment doesn't match the code.

> +    PP.Lex(Tok);
> +    if (Tok.isNot(tok::identifier) && Tok.isNot(tok::numeric_constant)) {
> +      PP.Diag(Tok.getLocation(), diag::err_pragma_loop_invalid_type)
> +        << Tok.getName();
> +      return;
> +    }
> +
> +    Token Value = Tok;
> +
> +    // Read ')'
> +    PP.Lex(Tok);
> +    if (Tok.isNot(tok::r_paren)) {
> +      PP.Diag(Tok.getLocation(), diag::err_expected) << tok::r_paren;
> +      return;
> +    }
> +
> +    // Get next optimization option.
> +    PP.Lex(Tok);
> +
> +    // Read the optimization option identifier.

Comment doesn't match code.

> +    PragmaLoopHintInfo *Info =
> +      (PragmaLoopHintInfo*) PP.getPreprocessorAllocator().Allocate(
> +        sizeof(PragmaLoopHintInfo), llvm::alignOf<PragmaLoopHintInfo>());
> +
> +    Info->Loop = Loop;
> +    Info->Option = Option;
> +    Info->Value = Value;
> +
> +    // Generate the vectorization hint token.
> +    Token LoopHintTok;
> +    LoopHintTok.startToken();
> +    LoopHintTok.setKind(tok::annot_pragma_loop_hint);
> +    LoopHintTok.setLocation(Loop.getLocation());
> +    LoopHintTok.setAnnotationValue(static_cast<void*>(Info));
> +    TokenList.push_back(LoopHintTok);
> +  }
> +
> +  if (Tok.isNot(tok::eod)) {
> +    PP.Diag(Tok.getLocation(), diag::warn_pragma_extra_tokens_at_eol)
> +      << "loop";
> +    return;
> +  }
> +
> +  Token *TokenArray = new Token[TokenList.size()];
> +  std::copy(TokenList.begin(), TokenList.end(), TokenArray);
> +
> +  PP.EnterTokenStream(TokenArray, TokenList.size(),
> +                      /*DisableMacroExpansion=*/false,
> +                      /*OwnsTokens=*/true);
> +}
> Index: lib/Parse/ParseStmt.cpp
> ===================================================================
> --- lib/Parse/ParseStmt.cpp (revision 208442)
> +++ lib/Parse/ParseStmt.cpp (working copy)
> @@ -15,11 +15,13 @@
>  #include "clang/Parse/Parser.h"
>  #include "RAIIObjectsForParser.h"
>  #include "clang/AST/ASTContext.h"
> +#include "clang/Basic/Attributes.h"
>  #include "clang/Basic/Diagnostic.h"
>  #include "clang/Basic/PrettyStackTrace.h"
>  #include "clang/Basic/SourceManager.h"
>  #include "clang/Basic/TargetInfo.h"
>  #include "clang/Sema/DeclSpec.h"
> +#include "clang/Sema/LoopHint.h"
>  #include "clang/Sema/PrettyDeclStackTrace.h"
>  #include "clang/Sema/Scope.h"
>  #include "clang/Sema/TypoCorrection.h"
> @@ -357,6 +359,10 @@
>      ProhibitAttributes(Attrs);
>      HandlePragmaMSPragma();
>      return StmtEmpty();
> +
> +  case tok::annot_pragma_loop_hint:
> +    ProhibitAttributes(Attrs);
> +    return ParsePragmaLoopHint(Stmts, OnlyStatement, TrailingElseLoc, Attrs);
>    }
>
>    // If we reached this code, the statement must end in a semicolon.
> @@ -1751,6 +1757,43 @@
>    return Actions.ActOnReturnStmt(ReturnLoc, R.take(), getCurScope());
>  }
>
> +StmtResult Parser::ParsePragmaLoopHint(StmtVector &Stmts,
> +                                       bool OnlyStatement,
> +                                       SourceLocation *TrailingElseLoc,
> +                                       ParsedAttributesWithRange &Attrs)
> +{

Curly brace should follow the previous line.

> +  // Create temporary attribute list.
> +  ParsedAttributesWithRange TempAttrs(AttrFactory);
> +
> +  // Get vectorize hints and consume annotated token.
> +  while (Tok.is(tok::annot_pragma_loop_hint)) {
> +    LoopHint Hint = HandlePragmaLoopHint();
> +    ConsumeToken();
> +
> +    if (!Hint.LoopLoc ||
> +        !Hint.OptionLoc ||
> +        !Hint.ValueLoc)
> +      continue;
> +
> +    ArgsUnion ArgHints[] = {Hint.OptionLoc,
> +                            Hint.ValueLoc,
> +                            ArgsUnion(Hint.ValueExpr)};
> +    TempAttrs.addNew(Hint.LoopLoc->Ident, Hint.Range,
> +                     0, Hint.LoopLoc->Loc,
> +                     ArgHints, 3,
> +                     AttributeList::AS_Keyword);

This should have a FIXME for when we do AttributeList::AS_Pragma.

> +  }
> +
> +  // Get the next statement.
> +  MaybeParseCXX11Attributes(Attrs);
> +
> +  StmtResult S = ParseStatementOrDeclarationAfterAttributes(Stmts,
> +                  /*OnlyStatement*/ true, 0, Attrs);

Shouldn't we be passing the OnlyStatement which was passed into the
function? Same for passing in the TrailingElseLoc instead of 0?

> +
> +  Attrs.takeAllFrom(TempAttrs);
> +  return S;
> +}
> +
>  namespace {
>    class ClangAsmParserCallback : public llvm::MCAsmParserSemaCallback {
>      Parser &TheParser;
> Index: lib/Sema/SemaStmtAttr.cpp
> ===================================================================
> --- lib/Sema/SemaStmtAttr.cpp (revision 208442)
> +++ lib/Sema/SemaStmtAttr.cpp (working copy)
> @@ -14,8 +14,10 @@
>  #include "clang/Sema/SemaInternal.h"
>  #include "clang/AST/ASTContext.h"
>  #include "clang/Basic/SourceManager.h"
> +#include "clang/Lex/Lexer.h"

This isn't required to be included; please remove.

>  #include "clang/Sema/DelayedDiagnostic.h"
>  #include "clang/Sema/Lookup.h"
> +#include "clang/Sema/LoopHint.h"
>  #include "clang/Sema/ScopeInfo.h"
>  #include "llvm/ADT/StringExtras.h"
>
> @@ -42,9 +44,82 @@
>                                             A.getAttributeSpellingListIndex());
>  }
>
> +struct PreviousKind {
> +  int Vectorize;
> +  int Interleave;
> +};
>
> +static Attr *handleLoopHintAttr(Sema &S, Stmt *St, const AttributeList &A,
> +                                SourceRange Range, PreviousKind &Prev) {
> +  if (St->getStmtClass() != Stmt::DoStmtClass &&
> +      St->getStmtClass() != Stmt::ForStmtClass &&
> +      St->getStmtClass() != Stmt::CXXForRangeStmtClass &&
> +      St->getStmtClass() != Stmt::WhileStmtClass) {
> +    S.Diag(St->getLocStart(), diag::err_pragma_loop_precedes_nonloop);
> +    return 0;
> +  }
> +
> +  IdentifierLoc *OptionLoc = A.getArgAsIdent(0);
> +  IdentifierInfo *OptionInfo = OptionLoc->Ident;
> +  IdentifierLoc *ValueLoc = A.getArgAsIdent(1);
> +  IdentifierInfo *ValueInfo = ValueLoc->Ident;
> +  Expr *ValueExpr = A.getArgAsExpr(2);
> +
> +  LoopHintAttr::OptionType Option = LoopHintAttr::Vectorize;
> +  if (OptionInfo && OptionInfo->getName() == "vectorize")
> +    Option = LoopHintAttr::Vectorize;
> +  else if (OptionInfo && OptionInfo->getName() == "interleave")
> +    Option = LoopHintAttr::Interleave;

You should be able to simply assert OptionInfo, since the parser won't
generate an annotation token if the pragma was ill-formed.

> +
> +  LoopHintAttr::KindType Kind = LoopHintAttr::Value;
> +  if (ValueInfo && ValueInfo->getName() == "enable")
> +    Kind = LoopHintAttr::Enable;
> +  else if (ValueInfo && ValueInfo->getName() == "disable")
> +    Kind = LoopHintAttr::Disable;
> +
> +  // We only need to check that a loop hint is compatible with the
> +  // previous loop hint to ensure that all hints are compatible.
> +  if (Option == LoopHintAttr::Vectorize) {
> +    if (Prev.Vectorize != -1 &&
> +      !LoopHintAttr::isCompatible(Prev.Vectorize, Kind)) {
> +      S.Diag(ValueLoc->Loc, diag::err_pragma_loop_incompatible)
> +        << LoopHintAttr::getKindName(Kind)
> +        << LoopHintAttr::getKindName(Prev.Vectorize)
> +        << LoopHintAttr::getOptionName(Option);
> +      return 0;
> +    }
> +    Prev.Vectorize = Kind;
> +  } else if (Option == LoopHintAttr::Interleave) {
> +    if (Prev.Interleave != -1 &&
> +        !LoopHintAttr::isCompatible(Prev.Interleave, Kind)) {
> +      S.Diag(ValueLoc->Loc, diag::err_pragma_loop_incompatible)
> +        << LoopHintAttr::getKindName(Kind)
> +        << LoopHintAttr::getKindName(Prev.Interleave)
> +        << LoopHintAttr::getOptionName(Option);
> +      return 0;
> +    }
> +    Prev.Interleave = Kind;
> +  }

It seems like this could be simplified to remove all of the duplicate
code. Prev doesn't need Interleave and Vectorize to be separate
members since only one is active at a time.

> +
> +  // FIXME: We should support template parameters for the loop hint value.
> +  // See bug report #19610
> +  if (Kind == LoopHintAttr::Value) {
> +    llvm::APSInt ValueAPS;
> +    if(!ValueExpr ||
> +       !ValueExpr->isIntegerConstantExpr(ValueAPS, S.Context) ||
> +       (int)ValueAPS.getSExtValue() < 1) {
> +      S.Diag(ValueLoc->Loc, diag::err_pragma_loop_invalid_value)
> +        << LoopHintAttr::getOptionName(Option);
> +      return 0;
> +    }
> +  }
> +
> +  return LoopHintAttr::CreateImplicit(S.Context,
> +                         Option, Kind, ValueExpr, A.getRange());

Formatting.

> +}

The "return 0" statements in this function should all be "return
nullptr" instead.

> +
>  static Attr *ProcessStmtAttribute(Sema &S, Stmt *St, const AttributeList &A,
> -                                  SourceRange Range) {
> +                                  SourceRange Range, PreviousKind &Prev) {

No, this definitely cannot accept a PreviousKind object. The
handleFooAttr functions all need to have a uniform parameter list
unless there's an exceptional reason against it (I've not gotten
around to harmonizing statement and type attributes the same way I
have with the declaration ones, which is a pain point I wish to avoid
in the future). All of those functions key off the signature for this
one, so handleLoopHintAttr's signature needs to drop the PreviousKind
param as well.

You can't solve this the same way it's solved in SemaDeclAttr.cpp
since you cannot simply ask the statement for its attributes. Instead,
I think ProcessStmtAttributes should take a second pass over the Attrs
vector to test for mutual exclusion, invalid duplicates, etc and allow
for removing attributes as-needed from that list before calling
ActOnAttributedStmt.

>    switch (A.getKind()) {
>    case AttributeList::UnknownAttribute:
>      S.Diag(A.getLoc(), A.isDeclspecAttribute() ?
> @@ -53,6 +128,8 @@
>      return 0;
>    case AttributeList::AT_FallThrough:
>      return handleFallThroughAttr(S, St, A, Range);
> +  case AttributeList::AT_LoopHint:
> +    return handleLoopHintAttr(S, St, A, Range, Prev);
>    default:
>      // if we're here, then we parsed a known attribute, but didn't recognize
>      // it as a statement attribute => it is declaration attribute
> @@ -64,9 +141,14 @@
>
>  StmtResult Sema::ProcessStmtAttributes(Stmt *S, AttributeList *AttrList,
>                                         SourceRange Range) {
> +  PreviousKind Prev;
> +  Prev.Vectorize = -1;
> +  Prev.Interleave = -1;
> +
>    SmallVector<const Attr*, 8> Attrs;
> +  //Note: Attrs are stored in the reverse order than they appear.
>    for (const AttributeList* l = AttrList; l; l = l->getNext()) {
> -    if (Attr *a = ProcessStmtAttribute(*this, S, *l, Range))
> +    if (Attr *a = ProcessStmtAttribute(*this, S, *l, Range, Prev))
>        Attrs.push_back(a);
>    }
>
> Index: test/CodeGen/pragma-loop.cpp
> ===================================================================
> --- test/CodeGen/pragma-loop.cpp (revision 0)
> +++ test/CodeGen/pragma-loop.cpp (working copy)
> @@ -0,0 +1,120 @@
> +// RUN: %clang -std=c++11 -emit-llvm -S -o - %s | FileCheck %s
> +
> +// CHECK: br i1 %cmp, label %while.body, label %while.end, !llvm.loop !1
> +// CHECK: br i1 %cmp, label %do.body, label %do.end, !llvm.loop !5
> +// CHECK: br i1 %cmp, label %for.body, label %for.end, !llvm.loop !7
> +// CHECK: br i1 %cmp, label %for.body, label %for.end, !llvm.loop !8
> +// CHECK: br i1 %cmp, label %for.body, label %for.end, !llvm.loop !11
> +// CHECK: br i1 %cmp, label %for.body, label %for.end, !llvm.loop !13
> +// CHECK: br i1 %cmp, label %for.body, label %for.end, !llvm.loop !14
> +// CHECK: br i1 %cmp, label %for.body, label %for.end, !llvm.loop !16
> +
> +// CHECK: !1 = metadata !{metadata !1, metadata !2, metadata !3, metadata !4}
> +// CHECK: !2 = metadata !{metadata !"llvm.vectorizer.enable", i1 true}
> +// CHECK: !3 = metadata !{metadata !"llvm.vectorizer.unroll", i32 4}
> +// CHECK: !4 = metadata !{metadata !"llvm.vectorizer.width", i32 4}
> +// CHECK: !5 = metadata !{metadata !5, metadata !6, metadata !3}
> +// CHECK: !6 = metadata !{metadata !"llvm.vectorizer.width", i32 8}
> +// CHECK: !7 = metadata !{metadata !7, metadata !2, metadata !3}
> +// CHECK: !8 = metadata !{metadata !8, metadata !9, metadata !10}
> +// CHECK: !9 = metadata !{metadata !"llvm.vectorizer.width", i32 2}
> +// CHECK: !10 = metadata !{metadata !"llvm.vectorizer.unroll", i32 2}
> +// CHECK: !11 = metadata !{metadata !11, metadata !12}
> +// CHECK: !12 = metadata !{metadata !"llvm.vectorizer.width", i32 1}
> +// CHECK: !13 = metadata !{metadata !13, metadata !9, metadata !10}
> +// CHECK: !14 = metadata !{metadata !14, metadata !6, metadata !15}
> +// CHECK: !15 = metadata !{metadata !"llvm.vectorizer.unroll", i32 8}
> +// CHECK: !16 = metadata !{metadata !16, metadata !9, metadata !10}
> +
> +// Verify while loop is recognized after sequence of pragma loop directives.
> +void while_test(int *List, int Length) {
> +  int i = 0;
> +
> +  #pragma loop vectorize(enable)
> +  #pragma loop interleave(4)
> +  #pragma loop vectorize(4)
> +  while(i < Length) {
> +    List[i] = i*2;
> +    i++;
> +  }
> +}
> +
> +// Verify do loop is recognized after multi-option pragma loop directive.
> +void do_test(int *List, int Length) {
> +  int i = 0;
> +
> +  #pragma loop vectorize(8) interleave(4)
> +  do {
> +    List[i] = i*2;
> +    i++;
> +  } while (i < Length);
> +}
> +
> +// Verify for loop is recognized after sequence of pragma loop directives.
> +void for_test(int *List, int Length) {
> +  #pragma loop interleave(enable)
> +  #pragma loop interleave(4)
> +  for(int i = 0; i < Length; i++) {
> +    List[i] = i*2;
> +  }
> +}
> +
> +// Verify c++11 for range loop is recognized after
> +// sequence of pragma loop directives.
> +void for_range_test() {
> +  double List[100];
> +
> +  #pragma loop vectorize(2) interleave(2)
> +  for (int i : List) {
> +    List[i] = i;
> +  }
> +}
> +
> +// Verify disable pragma loop directive generates correct metadata
> +void disable_test(int *List, int Length) {
> +  #pragma loop vectorize(disable)
> +  for(int i = 0; i < Length; i++) {
> +    List[i] = i*2;
> +  }
> +}
> +
> +#define VECWIDTH 2
> +#define INTERLEAVE 2
> +
> +// Verify defines are correctly resolved in pragma loop directive
> +void for_define_test(int *List, int Length, int Value) {
> +  #pragma loop vectorize(VECWIDTH) interleave(INTERLEAVE)
> +  for(int i = 0; i < Length; i++) {
> +    List[i] = i*Value;
> +  }
> +}
> +
> +// Verify metadata is generated when template is used.
> +template <typename A>
> +void for_template_test(A *List, int Length, A Value) {
> +
> +  #pragma loop vectorize(8) interleave(8)
> +  for(int i = 0; i < Length; i++) {
> +    List[i] = i*Value;
> +  }
> +}
> +
> +// Verify define is resolved correctly when template is used.
> +template <typename A>
> +void for_template_define_test(A *List, int Length, A Value) {
> +  #pragma loop vectorize(VECWIDTH) interleave(INTERLEAVE)
> +  for(int i = 0; i < Length; i++) {
> +    List[i] = i*Value;
> +  }
> +}
> +
> +#undef VECWIDTH
> +#undef INTERLEAVE
> +
> +// Use templates defined above. Test verifies metadata is generated correctly.
> +void template_test(double *List, int Length) {
> +  double Value = 10;
> +
> +  for_template_test<double>(List, Length, Value);
> +  for_template_define_test<double>(List, Length, Value);
> +}
> Index: test/PCH/pragma-loop.cpp
> ===================================================================
> --- test/PCH/pragma-loop.cpp (revision 0)
> +++ test/PCH/pragma-loop.cpp (working copy)
> @@ -0,0 +1,59 @@
> +// RUN: %clang_cc1 -emit-pch -o %t.a %s
> +// RUN: %clang_cc1 -include-pch %t.a %s -ast-print -o - | FileCheck %s
> +
> +// CHECK: #pragma loop vectorize(8)
> +// CHECK: #pragma loop interleave(4)
> +// CHECK: #pragma loop vectorize(disable)
> +// CHECK: #pragma loop interleave(enable)
> +// CHECK: #pragma loop vectorize(enable)
> +// CHECK: #pragma loop interleave(disable)
> +
> +#ifndef HEADER
> +#define HEADER
> +
> +class pragma_test {
> +public:
> +  inline void run1(int *List, int Length) {
> +    int i = 0;
> +    #pragma loop vectorize(8)
> +    #pragma loop interleave(4)
> +    while (i < Length) {
> +      List[i] = i;
> +      i++;
> +    }
> +  }
> +
> +  inline void run2(int *List, int Length) {
> +    int i = 0;
> +    #pragma loop vectorize(disable)
> +    #pragma loop interleave(enable)
> +    while (i-1 < Length) {
> +      List[i] = i;
> +      i++;
> +    }
> +  }
> +
> +  inline void run3(int *List, int Length) {
> +    int i = 0;
> +    #pragma loop vectorize(enable)
> +    #pragma loop interleave(disable)
> +    while (i-3 < Length) {
> +      List[i] = i;
> +      i++;
> +    }
> +  }
> +};
> +
> +#else
> +
> +void test() {
> +  int List[100];
> +
> +  pragma_test pt;
> +
> +  pt.run1(List, 100);
> +  pt.run2(List, 100);
> +  pt.run3(List, 100);
> +}
> +
> +#endif
> Index: test/Parser/pragma-loop-ast.cpp
> ===================================================================
> --- test/Parser/pragma-loop-ast.cpp (revision 0)
> +++ test/Parser/pragma-loop-ast.cpp (working copy)
> @@ -0,0 +1,33 @@
> +// RUN: %clang_cc1 -ast-print %s | FileCheck %s
> +
> +// CHECK: #pragma loop vectorize(4)
> +// CHECK: #pragma loop interleave(8)
> +// CHECK: #pragma loop vectorize(enable)
> +// CHECK: #pragma loop interleave(disable)
> +// CHECK: #pragma loop vectorize(disable)
> +// CHECK: #pragma loop interleave(enable)
> +
> +void test(int *List, int Length)
> +{
> +  int i = 0;
> +  #pragma loop vectorize(4)
> +  #pragma loop interleave(8)
> +  while (i < Length) {
> +    List[i] = i * 2;
> +    i++;
> +  }
> +
> +  #pragma loop vectorize(enable)
> +  #pragma loop interleave(disable)
> +  while (i-1 < Length) {
> +    List[i] = i * 2;
> +    i++;
> +  }
> +
> +  #pragma loop vectorize(disable)
> +  #pragma loop interleave(enable)
> +  while (i-2 < Length) {
> +    List[i] = i * 2;
> +    i++;
> +  }
> +}
> Index: test/Parser/pragma-loop.cpp
> ===================================================================
> --- test/Parser/pragma-loop.cpp (revision 0)
> +++ test/Parser/pragma-loop.cpp (working copy)
> @@ -0,0 +1,105 @@
> +// RUN: %clang_cc1 -fsyntax-only -std=c++11 -verify %s
> +
> +// Note that this puts the expected lines before the directives to work around
> +// limitations in the -verify mode.
> +
> +void test(int *List, int Length)
> +{
> +  int i = 0;
> +
> +  #pragma loop vectorize(4)
> +  #pragma loop interleave(8)
> +  while (i+1 < Length) {
> +    List[i] = i;
> +  }
> +
> +  #pragma loop vectorize(enable)
> +  #pragma loop interleave(enable)
> +  while (i < Length) {
> +    List[i] = i;
> +  }
> +
> +  #pragma loop vectorize(disable)
> +  #pragma loop interleave(disable)
> +  while (i-1 < Length) {
> +    List[i] = i;
> +  }
> +
> +  #pragma loop vectorize(4) interleave(8)
> +  while (i-2 < Length) {
> +    List[i] = i;
> +  }
> +
> +  #pragma loop interleave(16)
> +  while (i-3 < Length) {
> +    List[i] = i;
> +  }
> +
> +  int VList[Length];
> +  #pragma loop vectorize(disable) interleave(disable)
> +  for (int j : VList) {
> +    VList[j] = List[j];
> +  }
> +
> +/* expected-error {{expected '('}} */ #pragma loop vectorize
> +/* expected-error {{expected '('}} */ #pragma loop interleave
> +
> +/* expected-error {{expected ')'}} */ #pragma loop vectorize(4
> +/* expected-error {{expected ')'}} */ #pragma loop interleave(4
> +
> +/* expected-error {{missing option in directive '#pragma loop'}} */ #pragma loop
> +/* expected-error {{invalid option 'badkeyword' in directive '#pragma loop'}} */ #pragma loop badkeyword
> +/* expected-error {{invalid option 'badkeyword' in directive '#pragma loop'}} */ #pragma loop badkeyword(2)
> +/* expected-error {{invalid option 'badkeyword' in directive '#pragma loop'}} */ #pragma loop vectorize(4) badkeyword(4)
> +/* expected-warning {{extra tokens at end of '#pragma loop'}} */ #pragma loop vectorize(4) ,
> +
> +  while (i-4 < Length) {
> +    List[i] = i;
> +  }
> +
> +/* expected-error {{expected a positive integer in directive '#pragma loop vectorize'}} */ #pragma loop vectorize(0)
> +/* expected-error {{expected a positive integer in directive '#pragma loop interleave'}} */ #pragma loop interleave(0)
> +  while (i-5 < Length) {
> +    List[i] = i;
> +  }
> +
> +/* expected-error {{expected a positive integer in directive '#pragma loop vectorize'}} */ #pragma loop vectorize(3000000000)
> +/* expected-error {{expected a positive integer in directive '#pragma loop interleave'}} */ #pragma loop interleave(3000000000)
> +  while (i-6 < Length) {
> +    List[i] = i;
> +  }
> +
> +/* expected-error {{expected a positive integer in directive '#pragma loop vectorize'}} */ #pragma loop vectorize(badvalue)
> +/* expected-error {{expected a positive integer in directive '#pragma loop interleave'}} */ #pragma loop interleave(badvalue)
> +  while (i-7 < Length) {
> +    List[i] = i;
> +  }
> +
> +#pragma loop vectorize(enable)
> +/* expected-error {{expected a for, while, or do-while loop to follow the '#pragma loop' directive}} */ int j = Length;
> +  List[0] = List[1];
> +
> +  while (j-1 < Length) {
> +    List[j] = j;
> +  }
> +
> +/* expected-error {{'value' and 'disable' directive option types are incompatible in '#pragma loop vectorize'}} */ #pragma loop vectorize(4)
> +  #pragma loop vectorize(disable)
> +  while (i-8 < Length) {
> +    List[i] = i;
> +  }
> +
> +/* expected-error {{'enable' and 'disable' directive option types are incompatible in '#pragma loop interleave'}} */ #pragma loop interleave(enable)
> + #pragma loop interleave(disable)
> +  while (i-9 < Length) {
> +    List[i] = i;
> +  }
> +
> +/* expected-error {{'disable' and 'value' directive option types are incompatible in '#pragma loop vectorize'}} */ #pragma loop vectorize(disable)
> +  #pragma loop vectorize(4)
> +  while (i-10 < Length) {
> +    List[i] = i;
> +  }
> +
> +#pragma loop interleave(enable)
> +/* expected-error {{expected statement}} */ }
>

Btw, when I test your patch locally, I get failed assertions from the
STL. "array iterator + offset out of range" on a call to std::copy
within ASTStmtReader::VisitAttributedStmt().

~Aaron