[PATCH] #pragma vectorize

Fri May 30 19:53:50 PDT 2014

On Fri, May 30, 2014 at 5:21 PM, Tyler Nowicki <tnowicki at apple.com> wrote:

> Hi Richard,
>
> Thanks for the reviews on these patches. I applied most of your
> suggestions and my comments/concerns on the other suggestions are below.
> I’ve also included your comments from the non-type template parameter
> thread here because they refer to elements of this patch.
>
> Here is the updated patch.
>
> Thanks again!
>
> Tyler
>
>
>
> [...]
> +  let Documentation = [Undocumented];
>
> Please add documentation for this (doesn't have to be part of this patch,
> but we should document this before we consider it "done”).\
>
>
> I have a documentation patch in progress. I will add attribute
> documentation as well.
>
>
> --- lib/Parse/ParsePragma.cpp (revision 209824)
> +++ lib/Parse/ParsePragma.cpp (working copy)
> [...]
> +  PP.AddPragmaHandler(LoopHintHandler.get());
>
> This should be
>
>   PP.AddPragmaHandler("clang", LoopHintHandler.get());
>
>
> Specifying the clang namespace changes the syntax to #pragma clang loop
> <options>. The syntax was chosen because it is not specific to any llvm
> optimization but rather a generic statement about the desired/optimal
> formulation of the loop. I think that was the rational for not including it
> in the clang group to being with. What do you think?
>

It's not acceptable for us to invent a new pragma that's not in the clang
namespace. This is the informal agreement between compiler vendors -- we
each have our own namespace, and that's where we put our new pragmas.

--- lib/CodeGen/CGStmt.cpp (revision 209824)
> +++ lib/CodeGen/CGStmt.cpp (working copy)
> [...]
> +    if (Arg == LoopHintAttr::Enable) {
> +      Name = llvm::MDString::get(Context, "llvm.vectorizer.enable");
> +      Value = Builder.getTrue();
>
> I find this surprising. Should these both really be semantically identical:
>
>
> No, they shouldn’t be. Unfortunately this is the current state of the
> metadata. I have it on my list of things to do to fix the metadata so that
> it can be used to enable vectorization and interleaving separately.
>
> Unfortunately, the vectorization metadata is already in use in clang, so
> any changes would need to be made simultaneously in both llvm and clang.
> What is the process for making changes like this?
>

Clang and LLVM are revision-locked: commit to both of them at
(approximately) the same time, and that's fine. (A single revision where
things are broken because the changes have only landed on one side is not a
major problem, and happens periodically when people make these sorts of
changes.)

[...]
> +    PragmaLoopHintInfo *Info =
> +        (PragmaLoopHintInfo *)PP.getPreprocessorAllocator().Allocate(
> +            sizeof(PragmaLoopHintInfo),
> llvm::alignOf<PragmaLoopHintInfo>());
>
> Can you use just
>
>   auto *Info = new (PP.getPreprocessorAllocator()) PragmaLoopHintInfo;
>
> ?
>
>
> Allocate is to make sure that the structure is deallocated when the
> preprocessor is destructed. Otherwise you’ll notice there is no free for
> the Info. It seems to be a standard way of allocating data for annotated
> tokens.
>

The mechanism I suggested uses the same allocator, it just does so in a
less verbose way.

--- lib/Sema/SemaStmtAttr.cpp (revision 209824)
> +++ lib/Sema/SemaStmtAttr.cpp (working copy)
> [...]
> +  if (St->getStmtClass() != Stmt::DoStmtClass &&
> +      St->getStmtClass() != Stmt::ForStmtClass &&
> +      St->getStmtClass() != Stmt::CXXForRangeStmtClass &&
> +      St->getStmtClass() != Stmt::WhileStmtClass) {
>
> Do you intentionally not handle ObjCForCollectionStmt here? (I suppose
> they're very unlikely to be vectorizable?)
>
>
> I am not that familiar with objective C. This may sound like a dumb
> question but does objective C allow pragmas? If so I will look at it for a
> later patch.
>

Objective C is a proper superset of C, so yes. I don't know Objective C
very well either, and it might be the case that it would never be possible
to vectorize its for loop, because it's too dynamic.

You’ve suggested copying HandlePragmaMSPragma
>>
>
> Right, yes, you're on the right lines. What I'm suggesting is that you
> grab the token sequence between the parentheses and store that until you
> get to parsing the annotation. Then put them back into the token stream and
> use the normal expression parsing machinery to parse the expression. That's
> the part of HandlePragmaMSPragma's behavior that I was referring to. (I can
> see how that was unclear, sorry!)
>
> Once you've done that, you shouldn't need any custom template
> instantiation logic.
>
>
> I don’t think MSPragma works that way. In HandlePragmaMSPragma it removes
> the tokens from the annotated token and puts them onto the token stream.
> Following this it examines the pragma name and calls a handler that
> immediately lexes the same tokens f back from the token stream. The handler
> then calls an ActOn which sets some values in sema.
>

We still seem to be talking past each other, because you described exactly
what I was describing...

>From the spec it seems that the only parameter to an ms pragma that could
> take a non-type template parameter would be the section name. But this is
> parsed as a string literal in the parser. I don’t think MSPragma can be
> copied to implement non-type template parameters.
>

It uses ParseStringLiteralExpression, which might parse multiple tokens.
But the point is, it's able to use parser functionality to parse arbitrary
kinds of expressions, which is what you want. I think you're being misled
by thinking about non-type template parameters here; as I've said
elsewhere, it'd be extremely odd to accept them but to not accept arbitrary
integer constant expressions. So what I think you want (and what the
approach I described above permits) is to call ParseConstantExpression.

Correct me if I am wrong but template instantiation occurs after the object
> (class, struct, function, etc) has been parsed and is in the AST. Later
> when an instantiation is identified it traverses the template's AST and
> performs a transformation which duplicates each object substituting
> template parameters for known parameters. Note the Rebuild* methods in
> lib/Sema/SemaTemplateInstantiate.cpp which create build new expressions,
> statements, and declarations.
>
>
> From a high level, I think the syntax of this pragma could be improved.
> Instead of accepting an integer literal (or eventually a template
> argument), I would think this should accept an arbitrary integer constant
> expression. It's not reasonable to require people to put their loops into a
> template in order to specify a width that isn't an integer literal. Also,
> having an argument that can either be an identifier or a pseudokeyword
> ('enable' or 'disable') seems like an unfortunate design choice. I'm OK
> with you going ahead with the current design on the understanding that
> we'll keep discussing it and may change it to something better.
>
> …
>
> Off the top of my head:
>
> #pragma clang loop vectorize(enable, <n>)
>
>
> The reason I like putting them together is that vectorize(enable, 1) is
> actually a contradiction  enabled indicates vectorization and a vector
> width of 1 indicates scalar instructions or no vectorization. This
> contradiction could easily be produced when the programmer decides to
> completely disable vectorization by setting a define, non-type template
> parameter, or constant expression to 1. Merging Enabled and Value prevents
> this contradiction from occurring. vectorize(disable) or vectorize(1) means
> do not vectorize, vectorize(enable) means vectorize with any width, and
> vectorize(4) means vectorize with width 4. So there can be no
> contradictions here.
>

vectorize(enable) might mean "vectorize" or might mean "vectorize with the
width specified by the expression 'enable'", so it's ambiguous. And
vectorize(1) meaning "do not vectorize" is quite confusing (though, sure,
it's consistent with the general meaning of numbers in this context).

Perhaps '#pragma loop vectorize’ without additional parameters could
> replace the current behavior of enable, and the documentation could
> instruct the programmer to use ‘#pragma loop vectorize(0 or 1)’ for the
> disable case? Does it seem weird that ‘vectorize’ means enable and
> ‘vectorize(1)’ means disable?
>

As noted above, it seems weird that vectorize(1) means 'disable'. Allowing
vectorize(0) to also mean 'disable' would be an improvement, I think.

Another possible approach would be to use something akin to named arguments:

  #pragma clang loop vectorize(disable)
  #pragma clang loop vectorize(width: 4)
  #pragma clang loop vectorize(enable, width: 4,
some_new_thing_weve_not_though_of_yet: 8)

... or to use a different name for the width:

  #pragma clang loop vectorize(enable) vectorize_width(4)
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-commits/attachments/20140530/3f64a8af/attachment.html>