[PATCH] #pragma vectorize

Tyler Nowicki tnowicki at apple.com
Fri May 30 17:21:14 PDT 2014


Hi Richard,

Thanks for the reviews on these patches. I applied most of your suggestions and my comments/concerns on the other suggestions are below. I’ve also included your comments from the non-type template parameter thread here because they refer to elements of this patch.

Here is the updated patch.

Thanks again!

Tyler



> [...]
> +  let Documentation = [Undocumented];
> 
> Please add documentation for this (doesn't have to be part of this patch, but we should document this before we consider it "done”).\

I have a documentation patch in progress. I will add attribute documentation as well.


> --- lib/Parse/ParsePragma.cpp	(revision 209824)
> +++ lib/Parse/ParsePragma.cpp	(working copy)
> [...]
> +  PP.AddPragmaHandler(LoopHintHandler.get());
> 
> This should be 
> 
>   PP.AddPragmaHandler("clang", LoopHintHandler.get());


Specifying the clang namespace changes the syntax to #pragma clang loop <options>. The syntax was chosen because it is not specific to any llvm optimization but rather a generic statement about the desired/optimal formulation of the loop. I think that was the rational for not including it in the clang group to being with. What do you think?


> 
> --- lib/CodeGen/CGStmt.cpp	(revision 209824)
> +++ lib/CodeGen/CGStmt.cpp	(working copy)
> [...]
> +    if (Arg == LoopHintAttr::Enable) {
> +      Name = llvm::MDString::get(Context, "llvm.vectorizer.enable");
> +      Value = Builder.getTrue();
> 
> I find this surprising. Should these both really be semantically identical:

No, they shouldn’t be. Unfortunately this is the current state of the metadata. I have it on my list of things to do to fix the metadata so that it can be used to enable vectorization and interleaving separately.

Unfortunately, the vectorization metadata is already in use in clang, so any changes would need to be made simultaneously in both llvm and clang. What is the process for making changes like this?


> [...]
> +    PragmaLoopHintInfo *Info =
> +        (PragmaLoopHintInfo *)PP.getPreprocessorAllocator().Allocate(
> +            sizeof(PragmaLoopHintInfo), llvm::alignOf<PragmaLoopHintInfo>());
> 
> Can you use just
> 
>   auto *Info = new (PP.getPreprocessorAllocator()) PragmaLoopHintInfo;
> 
> ?

Allocate is to make sure that the structure is deallocated when the preprocessor is destructed. Otherwise you’ll notice there is no free for the Info. It seems to be a standard way of allocating data for annotated tokens.


> 
> --- lib/Sema/SemaStmtAttr.cpp	(revision 209824)
> +++ lib/Sema/SemaStmtAttr.cpp	(working copy)
> [...]
> +  if (St->getStmtClass() != Stmt::DoStmtClass &&
> +      St->getStmtClass() != Stmt::ForStmtClass &&
> +      St->getStmtClass() != Stmt::CXXForRangeStmtClass &&
> +      St->getStmtClass() != Stmt::WhileStmtClass) {
> 
> Do you intentionally not handle ObjCForCollectionStmt here? (I suppose they're very unlikely to be vectorizable?)

I am not that familiar with objective C. This may sound like a dumb question but does objective C allow pragmas? If so I will look at it for a later patch.


> You’ve suggested copying HandlePragmaMSPragma
> 
> Right, yes, you're on the right lines. What I'm suggesting is that you grab the token sequence between the parentheses and store that until you get to parsing the annotation. Then put them back into the token stream and use the normal expression parsing machinery to parse the expression. That's the part of HandlePragmaMSPragma's behavior that I was referring to. (I can see how that was unclear, sorry!)
> 
> Once you've done that, you shouldn't need any custom template instantiation logic.

I don’t think MSPragma works that way. In HandlePragmaMSPragma it removes the tokens from the annotated token and puts them onto the token stream. Following this it examines the pragma name and calls a handler that immediately lexes the same tokens f back from the token stream. The handler then calls an ActOn which sets some values in sema.

From the spec it seems that the only parameter to an ms pragma that could take a non-type template parameter would be the section name. But this is parsed as a string literal in the parser. I don’t think MSPragma can be copied to implement non-type template parameters.

Correct me if I am wrong but template instantiation occurs after the object (class, struct, function, etc) has been parsed and is in the AST. Later when an instantiation is identified it traverses the template's AST and performs a transformation which duplicates each object substituting template parameters for known parameters. Note the Rebuild* methods in lib/Sema/SemaTemplateInstantiate.cpp which create build new expressions, statements, and declarations.


> From a high level, I think the syntax of this pragma could be improved. Instead of accepting an integer literal (or eventually a template argument), I would think this should accept an arbitrary integer constant expression. It's not reasonable to require people to put their loops into a template in order to specify a width that isn't an integer literal. Also, having an argument that can either be an identifier or a pseudokeyword ('enable' or 'disable') seems like an unfortunate design choice. I'm OK with you going ahead with the current design on the understanding that we'll keep discussing it and may change it to something better.
…
> Off the top of my head:
> 
> #pragma clang loop vectorize(enable, <n>)

The reason I like putting them together is that vectorize(enable, 1) is actually a contradiction  enabled indicates vectorization and a vector width of 1 indicates scalar instructions or no vectorization. This contradiction could easily be produced when the programmer decides to completely disable vectorization by setting a define, non-type template parameter, or constant expression to 1. Merging Enabled and Value prevents this contradiction from occurring. vectorize(disable) or vectorize(1) means do not vectorize, vectorize(enable) means vectorize with any width, and vectorize(4) means vectorize with width 4. So there can be no contradictions here.

Perhaps '#pragma loop vectorize’ without additional parameters could replace the current behavior of enable, and the documentation could instruct the programmer to use ‘#pragma loop vectorize(0 or 1)’ for the disable case? Does it seem weird that ‘vectorize’ means enable and ‘vectorize(1)’ means disable?
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-commits/attachments/20140530/c135716c/attachment.html>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: pragma_loop-svn.patch
Type: application/octet-stream
Size: 40862 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/cfe-commits/attachments/20140530/c135716c/attachment.obj>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-commits/attachments/20140530/c135716c/attachment-0001.html>


More information about the cfe-commits mailing list