[PATCH] D9403: llvm.noalias - Clang CodeGen for local restrict-qualified pointers

John McCall via cfe-commits cfe-commits at lists.llvm.org
Wed Oct 12 23:43:09 PDT 2016


rjmccall added inline comments.


================
Comment at: lib/CodeGen/CGDecl.cpp:1291
+// where the pointer to the local variable is the key in the map.
+void CodeGenFunction::EmitAutoVarNoAlias(const AutoVarEmission &emission) {
+  assert(emission.Variable && "emission was not valid!");
----------------
This should really be a subroutine of EmitAutoVarAlloca; that will implicitly pick this logic up for a bunch of less standard places in IRGen that create local variables.  Please make it a static function (if possible) called EmitNoAliasScope and call it immediately after EmitVarAnnotations.


================
Comment at: lib/CodeGen/CGStmt.cpp:525
+void CodeGenFunction::LexicalNoAliasInfo::addNoAliasMD() {
+  if (MemoryInsts.empty() || NoAliasScopes.empty())
+    return;
----------------
hfinkel wrote:
> rjmccall wrote:
> > It's much more likely that NoAliasScopes will be empty than that MemoryInsts will be empty.  You should probably fast-path using that, or better yet, with the RecordMemoryInsts bit.
> I'm not sure that's true; we only record memory-accessing instructions in the first place if there are relevant restrict-qualified pointers around.
Ok, that makes sense.


================
Comment at: lib/CodeGen/CGStmt.cpp:537
+                                      llvm::LLVMContext::MD_noalias),
+                                      NewScopeList));
+
----------------
rjmccall wrote:
> hfinkel wrote:
> > rjmccall wrote:
> > > This is a very strange representation.  Every memory operation in the lexical block is annotated with a list of all of the scopes that were entered within the block, even if they were entered after the operation.  But for some reason, not with nested scopes?
> > > 
> > > What's the right patch for me to read about this representation?
> > Perhaps unfortunately, this is an artifact of the way that restrict is defined in C. It applies to all accesses in the block in which the variable is declared, even those before the declaration of the restrict-qualified local itself.
> > 
> > It should work with nested scopes, in the sense that we add these things as we complete each scope. So we add things to the inner scope, and then when we complete the outer scope, we go back over the instructions (including those in the inner scope because the scope recording recurses up the scope hierarchy), and adds the outer scopes - it concatenates them to any added by the inner (nested) scopes.
> > 
> > The noalias intrinsic's LangRef updates are in D9375.
> Ah, I see the recursion now.  Please add a comment explaining that expectation here.
I would still like this comment.  It should be enough to explain how MemoryInsts actually gets filled: there's a callback from inserting an instruction which adds all memory instructions to every active lexical scope that's recording them.


================
Comment at: lib/CodeGen/CodeGenFunction.cpp:1986
+  if (I->mayReadOrWriteMemory())
+    recordMemoryInstruction(I);
 }
----------------
This condition will include calls, which I assume is unnecessary (except maybe memory intrinsics?).  More seriously, it will also include loads and stores into allocas, which will sweep in all sorts of bookkeeping temporaries that IRGen uses in more complex code-generation situations.  You might want to consider ways to avoid recording this kind of instruction that are very likely to just get immediately optimized away by e.g. mem2reg.

Please also modify LexicalScope so that it records whether there's any active recording scope in the CodeGenFunction.  Lexical scopes are nested, so that should be as easy as saving the current state when you enter a scope and restoring it when you leave.  That will allow you to optimize this code by completely bypassing the recursion and even the check for whether the instruction is a memory instruction in the extremely common case of a function with no restrict variables.

In general, a lot of things about this approach seem to have worryingly poor asymptotic performance in the face of large functions or (especially) many restrict variables.  (You could, for example, have chosen to have each memory instruction link to its enclosing noalias scope, which would contain a link to its enclosing scope and a list of the variables it directly declares.  This would be a more complex representation to consume, but it would not require inherently quadratic frontend work building all these lists.)  The only saving grace is that we expect very little code to using restrict, so it becomes vital to ensure that your code is immediately short-circuited when nothing is using restrict.


================
Comment at: lib/CodeGen/CodeGenFunction.h:597
+    // those blocks) so that we can later add the appropriate metadata. Record
+    // this instruction and so the same in any parent scopes.
+    void recordMemoryInstruction(llvm::Instruction *I) {
----------------
I would suggest this wording:

  /// Record the given memory access instruction in this and all of its enclosing scopes.
  /// When we close this scope, we'll add the list of all the restrict-qualified local variables
  /// it declared to all the memory accesses which occurred within it.  Recording is only
  /// enabled under optimization, and it's disabled in a scope unless it actually declares
  /// any local restrict variables.


https://reviews.llvm.org/D9403





More information about the cfe-commits mailing list