[PATCH] D18110: [OpenMP] Fix SEMA bug in the capture of global variables in template functions.

Sun Apr 3 23:44:30 PDT 2016

ABataev added inline comments.

================
Comment at: lib/Sema/SemaOpenMP.cpp:816-822
@@ -801,6 +815,9 @@
+
+  // A DSA refers to this captured region if the parent contexts match.
+  auto *ParentContext = RSI->TheCapturedDecl->getParent();
   for (auto I = Stack.rbegin(), EE = Stack.rend(); I != EE; ++I)
-    if (I->CurScope == S)
+    if (I->ParentDeclContext == ParentContext)
       return I->Directive;
   return OMPD_unknown;
 }
 
----------------
sfantao wrote:
> ABataev wrote:
> > sfantao wrote:
> > > ABataev wrote:
> > > > Actually, I think, we need to rework the whole IsOpenMPCapturedByRef() function. I'm not sure that it is even required. It looks like some optimization hack in the frontend. I'm against any not-necessary optimizations in frontend. If scalar value is a firstprivate, it must be handled in codegen, not by handling it by copy.
> > > 'IsOpenMPCapturedByRef' goal is to change the signature of the outlined function. We don't want to have reference types in target region arguments unless they are really required. We have seen performance being greatly affected just because of that. Of course a consequence of this is to have variables that become first private.
> > > 
> > > The current implementation of OpenMP firstprivate doesn't change the the function signatures, only the codegen inside the region is affected. So, this is not a complete overlap.
> > > 
> > > With this, I am not saying that this cannot be done in codegen. The reason I am doing it here is your initial guideline that we should attempt to fix most of the things in Sema and avoid complicating codegen which is a more sensitive part.
> > > 
> > > Doing this in Codegen would require changing the implementation of `GenerateOpenMPCapturedVars` and `GenerateOpenMPCapturedStmtFunction`. We wouldn't need the tracking of the context anymore (checking the directive would be sufficient), but we would still need to see what't in the map clauses.
> > > 
> > > So, do you think we should change this?
> > I think we should not dot it at all for now. Performance is an important thing, of course, but at first we must have just working solution and only after that we will think about performance.
> > Early optimization leads to many troubles. The code becomes very hard to understand. It just increases the time of the development.
> > I suggest to remove all optimizations for now and emit the code as is, before we have a working solution.
> > Also, this function may lead to incorrect codegen. You don't check that CapturedRegion is actually an OpenMP region. If we have non-OpenMP cpatured region between OpenMP regions it will lead to incorrect code.
> I don't agree. As you know, several components on the codegen depend on how things are captured. So in my view, identifying the best possible implementation (including performance-wise) as early as possible is in our best interest to avoid major refactoring down the road.
> 
> All the offloading implementation that we have working today in the trunk is relying on this already. So, by rolling back to capture by reference everything, will require to refactor all the offloading code/patches/regression tests just to have to redo them again in the near future because of the issues we have identified already. 
> 
> I'd rather spend that effort to have the capture by copy to work in a acceptable way and I can move this logic to codegen if you think that is the best way to do it. Note that will require checking the expressions of the map clause in `GenerateOpenMPCapturedVars` and `GenerateOpenMPCapturedStmtFunction`, something that Sema will have to do anyway.
> 
> Let me know your thoughts. 
> 
> 
I don't like this series of patches because they are very intrusive. They add a lot of code, that is very hard to read and understand, very hard to refactor. That's why I'd prefer to see a simple solution at first and only after that try to optimize something.
We had the same problems with the performance of non-offloading part. But the very first thing that must be implemented - functionality. Optmization, performance etc. is the second step. It is better to add optimization only after the basic functionality is implemented.
So, yes, I think you must remove all optimizations for now and start from basic implementation.
I don't like the idea of adding new fields. tryCaptureVariable() function has variable OpenMPLevel. I think it can be used for proper syncing between CapturedRegionInfo and OpenMP region. But I'd prefer to see all these stuff only after the main part of codegen for target directive is implemented


http://reviews.llvm.org/D18110