[PATCH] D82470: [OpenMP][IRBuilder] Support allocas in nested parallel regions

Johannes Doerfert via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Jun 29 17:13:24 PDT 2020


jdoerfert marked an inline comment as done.
jdoerfert added inline comments.


================
Comment at: llvm/include/llvm/Frontend/OpenMP/OMPIRBuilder.h:282
+  IRBuilder<> AllocaBuilder;
+
   /// Map to remember source location strings
----------------
fghanim wrote:
> jdoerfert wrote:
> > fghanim wrote:
> > > jdoerfert wrote:
> > > > fghanim wrote:
> > > > > What's the benefit of this over just maintaining an alloca insertion point?
> > > > > 
> > > > > With this we will have 3 `IRBuilder`s to maintain and keep track of: the clang (or flang) `IRBuilder`, the OMP `IRBuilder` and the `allocaBuilder`
> > > > No functional difference. Basically the same reason why clang has an AllocIRBuilder (I think), we can use it independently of the other one. The alternative is to save the builder IP, set it to the alloca IP, do alloca stuff, reset the builder IP, when you use the AllocaIRBuilder below. Again, no functional difference, just less setting/resetting of IPs.
> > > I understand where you are coming from, and I understand that functionally there is mostly no change - for now. But I prefer we do a small struct with some RAII, over adding an entire builder. I think it's a bit overkill and will be a source to lots of trouble.
> > > 
> > > Clang maintains an instruction `AllocaInsertPt`, not a specific builder. Speaking of which, This is completely from memory (i.e. check to make sure); I think we already save the previous `AllocaInsertionPt` in clang as part of the `bodyGenCB`. In which case, this is not needed; a nested `parallel` will always be generated by clang as part of the `bodyGenCB`, which in turn is going to save the Alloca insertion point for us. 
> > > To enable the creation of the allocas for `TID` and `ZeroAddr` in the outer function, Why not pass current `AllocaInsertionPt` as an arg. to `createParallel`
> > > I understand where you are coming from, and I understand that functionally there is mostly no change - for now. But I prefer we do a small struct with some RAII, over adding an entire builder. I think it's a bit overkill and will be a source to lots of trouble.
> > 
> > Could you explain what kind of overkill you see here? And maybe also the lots of trouble you expect from a builder versus a insertion point? It is the same information, just packed in a different struct, right?
> > 
> > 
> > > Clang maintains an instruction AllocaInsertPt, not a specific builder. 
> > 
> > Right.
> > 
> > 
> > > Speaking of which, This is completely from memory (i.e. check to make sure); I think we already save the previous AllocaInsertionPt in clang as part of the bodyGenCB. 
> > 
> > I think so.
> > 
> > 
> > > In which case, this is not needed; 
> > 
> > Not strictly, no. 
> > 
> > > a nested parallel will always be generated by clang as part of the bodyGenCB,
> > 
> > Yes.
> > 
> > > which in turn is going to save the Alloca insertion point for us.
> > 
> > Yes.
> > 
> > > To enable the creation of the allocas for TID and ZeroAddr in the outer function, Why not pass current AllocaInsertionPt as an arg. to createParallel
> > 
> > So far, because this will not be the only one that needs to create allocas. If you want to add the allocaIP to all runtime calls that create allocas, I'm fine with that too.
> > Could you explain what kind of overkill you see here?
> Using an IRBuilder when an insertion point suffices
> 
> > And maybe also the lots of trouble you expect from a builder versus a insertion point?
> Currently, when using the OMPBuilder you need to juggle two IRBuilders, let's not make them 3 :)
> 
> > It is the same information, just packed in a different struct, right?
> no it is not. `InsetrionPoint` vs  struct that contains `Insertionpoint` + other things. But that's beside the point
> 
> I mean, is the `AllocaBuilder` providing a new functionality other than a convenient way to keep track of the Alloca `InsertionPoint`?
> Personally, I prefer some suitable structure to your needs and a stack (i.e. the way clang does it) . This should also resolve the todo below. Alternatively, pass an allocation point as an argument as suggested earlier. I am open to any third way if you have any.
> Personally, I prefer some suitable structure to your needs and a stack (i.e. the way clang does it).

The stack is already here, implicitly, and actually on the stack. With
`IRBuilder<>::InsertPointGuard AIPG(AllocaBuilder);`
we have all the stack we need without leaking the state.
The updates to the insertion point are done for use and the AllocaBuilder is always ready to be used. No need to create a builder or update an insertion point (other than the places that create the new alloca insertion points). Note that the suitable structure is the builder, as it is the only interaction with the insertion point, e.g. we do not pass the insertion point to other functions.


> InsetrionPoint vs struct that contains Insertionpoint + other things. But that's beside the point

I disagree. The "other things" is the only interaction we have with the insertion point. So storing or accepting an insertion point is fine for the only purpose of creating a Builder and interacting with the builder. The difference is that we need to add more churn to update and reset the insertion point (what happens behind the scenes with the one line I quoted above).


> Alternatively, pass an allocation point as an argument as suggested earlier.

I did not go this route because I think it complicates the API with little gain. I was also unsure what the flang situation looks like here. Do we have an alloca insertion point there too?
At the end of the day, this makes it easier for the user.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82470/new/

https://reviews.llvm.org/D82470





More information about the llvm-commits mailing list