[Openmp-commits] [PATCH] D82470: [OpenMP][IRBuilder] Support allocas in nested parallel regions

Fady Ghanim via Phabricator via Openmp-commits openmp-commits at lists.llvm.org
Mon Jun 29 01:34:36 PDT 2020


fghanim requested changes to this revision.
fghanim added inline comments.
This revision now requires changes to proceed.


================
Comment at: llvm/include/llvm/Frontend/OpenMP/OMPIRBuilder.h:282
+  IRBuilder<> AllocaBuilder;
+
   /// Map to remember source location strings
----------------
jdoerfert wrote:
> fghanim wrote:
> > jdoerfert wrote:
> > > fghanim wrote:
> > > > What's the benefit of this over just maintaining an alloca insertion point?
> > > > 
> > > > With this we will have 3 `IRBuilder`s to maintain and keep track of: the clang (or flang) `IRBuilder`, the OMP `IRBuilder` and the `allocaBuilder`
> > > No functional difference. Basically the same reason why clang has an AllocIRBuilder (I think), we can use it independently of the other one. The alternative is to save the builder IP, set it to the alloca IP, do alloca stuff, reset the builder IP, when you use the AllocaIRBuilder below. Again, no functional difference, just less setting/resetting of IPs.
> > I understand where you are coming from, and I understand that functionally there is mostly no change - for now. But I prefer we do a small struct with some RAII, over adding an entire builder. I think it's a bit overkill and will be a source to lots of trouble.
> > 
> > Clang maintains an instruction `AllocaInsertPt`, not a specific builder. Speaking of which, This is completely from memory (i.e. check to make sure); I think we already save the previous `AllocaInsertionPt` in clang as part of the `bodyGenCB`. In which case, this is not needed; a nested `parallel` will always be generated by clang as part of the `bodyGenCB`, which in turn is going to save the Alloca insertion point for us. 
> > To enable the creation of the allocas for `TID` and `ZeroAddr` in the outer function, Why not pass current `AllocaInsertionPt` as an arg. to `createParallel`
> > I understand where you are coming from, and I understand that functionally there is mostly no change - for now. But I prefer we do a small struct with some RAII, over adding an entire builder. I think it's a bit overkill and will be a source to lots of trouble.
> 
> Could you explain what kind of overkill you see here? And maybe also the lots of trouble you expect from a builder versus a insertion point? It is the same information, just packed in a different struct, right?
> 
> 
> > Clang maintains an instruction AllocaInsertPt, not a specific builder. 
> 
> Right.
> 
> 
> > Speaking of which, This is completely from memory (i.e. check to make sure); I think we already save the previous AllocaInsertionPt in clang as part of the bodyGenCB. 
> 
> I think so.
> 
> 
> > In which case, this is not needed; 
> 
> Not strictly, no. 
> 
> > a nested parallel will always be generated by clang as part of the bodyGenCB,
> 
> Yes.
> 
> > which in turn is going to save the Alloca insertion point for us.
> 
> Yes.
> 
> > To enable the creation of the allocas for TID and ZeroAddr in the outer function, Why not pass current AllocaInsertionPt as an arg. to createParallel
> 
> So far, because this will not be the only one that needs to create allocas. If you want to add the allocaIP to all runtime calls that create allocas, I'm fine with that too.
> Could you explain what kind of overkill you see here?
Using an IRBuilder when an insertion point suffices

> And maybe also the lots of trouble you expect from a builder versus a insertion point?
Currently, when using the OMPBuilder you need to juggle two IRBuilders, let's not make them 3 :)

> It is the same information, just packed in a different struct, right?
no it is not. `InsetrionPoint` vs  struct that contains `Insertionpoint` + other things. But that's beside the point

I mean, is the `AllocaBuilder` providing a new functionality other than a convenient way to keep track of the Alloca `InsertionPoint`?
Personally, I prefer some suitable structure to your needs and a stack (i.e. the way clang does it) . This should also resolve the todo below. Alternatively, pass an allocation point as an argument as suggested earlier. I am open to any third way if you have any.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82470/new/

https://reviews.llvm.org/D82470





More information about the Openmp-commits mailing list