<div dir="ltr"><div class="gmail_extra"><div class="gmail_quote">On Fri, May 30, 2014 at 5:21 PM, Tyler Nowicki <span dir="ltr"><<a href="mailto:tnowicki@apple.com" target="_blank">tnowicki@apple.com</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div style="word-wrap:break-word"><div>Hi Richard,</div><div>
<br></div><div>Thanks for the reviews on these patches. I applied most of your suggestions and my comments/concerns on the other suggestions are below. I’ve also included your comments from the non-type template parameter thread here because they refer to elements of this patch.</div>
<div class=""><div><br></div><div>Here is the updated patch.</div><div><br></div></div><div>Thanks again!</div><span class=""><font color="#888888"><div><br></div><div>Tyler</div><div><br></div><div></div></font></span></div>
<br><div style="word-wrap:break-word"><div></div><div><br></div><div><blockquote type="cite"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><div>[...]</div><div><div>+ let Documentation = [Undocumented];<br>
</div><div><br></div><div>Please add documentation for this (doesn't have to be part of this patch, but we should document this before we consider it "done”).\</div></div></div></div></div></blockquote><div><br></div>
<div>I have a documentation patch in progress. I will add attribute documentation as well.</div><div><br></div><div><br></div><div><blockquote type="cite"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><div class="gmail_extra">
<div class="gmail_extra"><div class="gmail_extra"><div class="gmail_extra"><div><div>--- lib/Parse/ParsePragma.cpp<span style="white-space:pre-wrap"> </span>(revision 209824)</div><div>+++ lib/Parse/ParsePragma.cpp<span style="white-space:pre-wrap"> </span>(working copy)</div>
</div><div>[...]</div><div>+ PP.AddPragmaHandler(LoopHintHandler.get());<br></div><div><br></div><div>This should be </div><div><br></div><div> PP.AddPragmaHandler("clang", LoopHintHandler.get());</div></div></div>
</div></div></div></div></div></blockquote></div><div><br></div><div>Specifying the clang namespace changes the syntax to #pragma clang loop <options>. The syntax was chosen because it is not specific to any llvm optimization but rather a generic statement about the desired/optimal formulation of the loop. I think that was the rational for not including it in the clang group to being with. What do you think?</div>
</div></div></blockquote><div><br></div><div>It's not acceptable for us to invent a new pragma that's not in the clang namespace. This is the informal agreement between compiler vendors -- we each have our own namespace, and that's where we put our new pragmas.</div>
<div><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div style="word-wrap:break-word"><div><blockquote type="cite">
<div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><div><div class="gmail_extra"><div class="gmail_extra">
<div class="gmail_extra"><div class="gmail_extra">--- lib/CodeGen/CGStmt.cpp<span style="white-space:pre-wrap"> </span>(revision 209824)</div><div class="gmail_extra">+++ lib/CodeGen/CGStmt.cpp<span style="white-space:pre-wrap"> </span>(working copy)</div>
<div class="gmail_extra">[...]</div><div class="gmail_extra"><div class="gmail_extra">+ if (Arg == LoopHintAttr::Enable) {</div><div class="gmail_extra">+ Name = llvm::MDString::get(Context, "llvm.vectorizer.enable");</div>
<div class="gmail_extra">+ Value = Builder.getTrue();</div><div><br></div></div><div class="gmail_extra"><div>I find this surprising. Should these both really be semantically identical:</div></div></div></div></div>
</div></div></div></div></blockquote><div><br></div><div>No, they shouldn’t be. Unfortunately this is the current state of the metadata. I have it on my list of things to do to fix the metadata so that it can be used to enable vectorization and interleaving separately.</div>
<div><br></div><div>Unfortunately, the vectorization metadata is already in use in clang, so any changes would need to be made simultaneously in both llvm and clang. What is the process for making changes like this?</div>
</div></div></blockquote><div><br></div><div>Clang and LLVM are revision-locked: commit to both of them at (approximately) the same time, and that's fine. (A single revision where things are broken because the changes have only landed on one side is not a major problem, and happens periodically when people make these sorts of changes.)</div>
<div><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div style="word-wrap:break-word"><div><blockquote type="cite">
<div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><div><div class="gmail_extra"><div class="gmail_extra"><div class="gmail_extra"><div class="gmail_extra">
<div>[...]</div><div><div>+ PragmaLoopHintInfo *Info =</div><div>+ (PragmaLoopHintInfo *)PP.getPreprocessorAllocator().Allocate(</div>
<div>+ sizeof(PragmaLoopHintInfo), llvm::alignOf<PragmaLoopHintInfo>());</div></div><div><br></div><div>Can you use just</div><div><br></div><div> auto *Info = new (PP.getPreprocessorAllocator()) PragmaLoopHintInfo;</div>
<div><br></div><div>?</div></div></div></div></div></div></div></div></div></blockquote><div><br></div><div>Allocate is to make sure that the structure is deallocated when the preprocessor is destructed. Otherwise you’ll notice there is no free for the Info. It seems to be a standard way of allocating data for annotated tokens.</div>
</div></div></blockquote><div><br></div><div>The mechanism I suggested uses the same allocator, it just does so in a less verbose way.</div><div><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
<div style="word-wrap:break-word"><div><blockquote type="cite"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><div><div class="gmail_extra"><div class="gmail_extra"><div class="gmail_extra"><div class="gmail_extra">
<div><div>--- lib/Sema/SemaStmtAttr.cpp<span style="white-space:pre-wrap"> </span>(revision 209824)</div><div>+++ lib/Sema/SemaStmtAttr.cpp<span style="white-space:pre-wrap"> </span>(working copy)</div>
</div><div>[...]</div><div><div>+ if (St->getStmtClass() != Stmt::DoStmtClass &&</div><div>+ St->getStmtClass() != Stmt::ForStmtClass &&</div><div>+ St->getStmtClass() != Stmt::CXXForRangeStmtClass &&</div>
<div>+ St->getStmtClass() != Stmt::WhileStmtClass) {</div></div><div><br></div><div>Do you intentionally not handle ObjCForCollectionStmt here? (I suppose they're very unlikely to be vectorizable?)</div></div>
</div></div></div></div></div></div></div></blockquote><div><br></div><div>I am not that familiar with objective C. This may sound like a dumb question but does objective C allow pragmas? If so I will look at it for a later patch.</div>
</div></div></blockquote><div><br></div><div>Objective C is a proper superset of C, so yes. I don't know Objective C very well either, and it might be the case that it would never be possible to vectorize its for loop, because it's too dynamic.</div>
<div><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div style="word-wrap:break-word"><div><div>
<blockquote type="cite"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
<div style="word-wrap:break-word"><div>You’ve suggested copying HandlePragmaMSPragma</div></div></blockquote><div><br></div><div>Right, yes, you're on the right lines. What I'm suggesting is that you grab the token sequence between the parentheses and store that until you get to parsing the annotation. Then put them back into the token stream and use the normal expression parsing machinery to parse the expression. That's the part of HandlePragmaMSPragma's behavior that I was referring to. (I can see how that was unclear, sorry!)</div>
<div><br></div><div>Once you've done that, you shouldn't need any custom template instantiation logic.</div></div></div></div></blockquote><div><br></div><div>I don’t think MSPragma works that way. In HandlePragmaMSPragma it removes the tokens from the annotated token and puts them onto the token stream. Following this it examines the pragma name and calls a handler that immediately lexes the same tokens f back from the token stream. The handler then calls an ActOn which sets some values in sema.</div>
</div></div></div></blockquote><div><br></div><div>We still seem to be talking past each other, because you described exactly what I was describing...</div><div><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
<div style="word-wrap:break-word"><div><div><div>From the spec it seems that the only parameter to an ms pragma that could take a non-type template parameter would be the section name. But this is parsed as a string literal in the parser. I don’t think MSPragma can be copied to implement non-type template parameters.</div>
</div></div></div></blockquote><div><br></div><div>It uses ParseStringLiteralExpression, which might parse multiple tokens. But the point is, it's able to use parser functionality to parse arbitrary kinds of expressions, which is what you want. I think you're being misled by thinking about non-type template parameters here; as I've said elsewhere, it'd be extremely odd to accept them but to not accept arbitrary integer constant expressions. So what I think you want (and what the approach I described above permits) is to call ParseConstantExpression.</div>
<div><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div style="word-wrap:break-word"><div><div>
<div>Correct me if I am wrong but template instantiation occurs after the object (class, struct, function, etc) has been parsed and is in the AST. Later when an instantiation is identified it traverses the template's AST and performs a transformation which duplicates each object substituting template parameters for known parameters. Note the Rebuild* methods in lib/Sema/SemaTemplateInstantiate.cpp which create build new expressions, statements, and declarations.</div>
<div><br></div><div><br></div><blockquote type="cite"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote">From a high level, I think the syntax of this pragma could be improved. Instead of accepting an integer literal (or eventually a template argument), I would think this should accept an arbitrary integer constant expression. It's not reasonable to require people to put their loops into a template in order to specify a width that isn't an integer literal. Also, having an argument that can either be an identifier or a pseudokeyword ('enable' or 'disable') seems like an unfortunate design choice. I'm OK with you going ahead with the current design on the understanding that we'll keep discussing it and may change it to something better.</div>
</div></div></blockquote>…<br><blockquote type="cite"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><div>Off the top of my head:</div><div><br></div><div>#pragma clang loop vectorize(enable, <n>)</div>
</div></div></div></blockquote><br></div><div>The reason I like putting them together is that vectorize(enable, 1) is actually a contradiction enabled indicates vectorization and a vector width of 1 indicates scalar instructions or no vectorization. This contradiction could easily be produced when the programmer decides to completely disable vectorization by setting a define, non-type template parameter, or constant expression to 1. Merging Enabled and Value prevents this contradiction from occurring. vectorize(disable) or vectorize(1) means do not vectorize, vectorize(enable) means vectorize with any width, and vectorize(4) means vectorize with width 4. So there can be no contradictions here.</div>
</div></div></blockquote><div><br></div><div>vectorize(enable) might mean "vectorize" or might mean "vectorize with the width specified by the expression 'enable'", so it's ambiguous. And vectorize(1) meaning "do not vectorize" is quite confusing (though, sure, it's consistent with the general meaning of numbers in this context).</div>
<div><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div style="word-wrap:break-word"><div><div>
Perhaps '#pragma loop vectorize’ without additional parameters could replace the current behavior of enable, and the documentation could instruct the programmer to use ‘#pragma loop vectorize(0 or 1)’ for the disable case? Does it seem weird that ‘vectorize’ means enable and ‘vectorize(1)’ means disable?</div>
</div></div></blockquote><div><br></div><div>As noted above, it seems weird that vectorize(1) means 'disable'. Allowing vectorize(0) to also mean 'disable' would be an improvement, I think.</div><div><br></div>
<div>Another possible approach would be to use something akin to named arguments:</div><div><br></div><div> #pragma clang loop vectorize(disable)</div><div> #pragma clang loop vectorize(width: 4)</div><div> #pragma clang loop vectorize(enable, width: 4, some_new_thing_weve_not_though_of_yet: 8)<br>
</div><div><br></div><div>... or to use a different name for the width:</div><div><br></div><div> #pragma clang loop vectorize(enable) vectorize_width(4)<br></div></div></div></div>