[PATCH] D70172: [CUDA][HIP][OpenMP] Emit deferred diagnostics by a post-parsing AST travese

Yaxun Liu via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Thu Feb 20 08:57:37 PST 2020


yaxunl marked an inline comment as done.
yaxunl added inline comments.


================
Comment at: clang/lib/Sema/Sema.cpp:1514
+  void visitUsedDecl(SourceLocation Loc, Decl *D) {
+    if (auto *TD = dyn_cast<TranslationUnitDecl>(D)) {
+      for (auto *DD : TD->decls()) {
----------------
rjmccall wrote:
> yaxunl wrote:
> > rjmccall wrote:
> > > erichkeane wrote:
> > > > rjmccall wrote:
> > > > > erichkeane wrote:
> > > > > > Note that when recommitting this (if you choose to), this needs to also handle NamespaceDecl.  We're a downstream and discovered that this doesn't properly handle functions or records handled in a namespace.
> > > > > > 
> > > > > > It can be implemented identically to TranslationUnitDecl.
> > > > > Wait, what?  We shouldn't be doing this for TranslationUnitDecl either.   I don't even know how we're "using" a TranslationUnitDecl, but neither this case not the case for `NamespaceDecl` should be recursively using every declaration declared inside it.  If there's a declaration in a namespace that's being used, it should be getting visited as part of the actual use of it.
> > > > > 
> > > > > The logic for `RecordDecl` has the same problem.  
> > > > Despite the name, this seems to be more of a home-written ast walking class.  The entry point is the 'translation unit' which seems to walk through everything in an attempt to find all the functions (including those that are 'marked' as used by an attribute).
> > > > 
> > > > You'll see the FunctionDecl section makes this assumption as well (not necessarily that we got to a function via a call). IMO, this approach is strange, and we should register entry points in some manner (functions marked as emitted to the device in some fashion), then just follow its call-graph (via the clang::CallGraph?) to emit all of these functions.
> > > > 
> > > > It seemed really odd to see this approach here, but it seemed well reviewed by the time I noticed it (via a downstream bug) so I figured I'd lost my chance to disagree with the approach.
> > > > 
> > > > 
> > > Sure, but `visitUsedDecl` isn't the right place to be entering the walk.  `visitUsedDecl` is supposed to be the *callback* from the walk.  If they need to walk all the global declarations to find kernels instead of tracking the kernels as they're encountered (which would be a *much* better approach), it should be done as a separate function.
> > > 
> > > I just missed this in the review.
> > The deferred diagnostics could be initiated by non-kernel functions or even host functions.
> > 
> > Let's consider a device code library where no kernels are defined. A device function is emitted, which calls a host device function which has a deferred diagnostic. All device functions that are emitted need to be checked.
> > 
> > Same with host functions that are emitted, which may call a host device function which has deferred diagnostic.
> > 
> > Also not just function calls need to be checked. A function address may be taken then called through function pointer. Therefore any reference to a function needs to be followed.
> > 
> > In the case of OpenMP, the initialization of a global function pointer which refers a function may trigger a deferred diangostic. There are tests for that.
> Right, I get that emitting deferred diagnostics for a declaration D needs to trigger any deferred diagnostics in declarations used by D, recursively.  You essentially have a graph of lazily-emitted declarations (which may or may not have deferred diagnostics) and a number of eagerly-emitted "root" declarations with use-edges leading into that graph.  Any declaration that's reachable from a root will need to be emitted and so needs to have any deferred diagnostics emitted as well.  My question is why you're finding these roots with a retroactive walk of the entire translation unit instead of either building a list of roots as you go or (better yet) building a list of lazily-emitted declarations that are used by those roots.  You can unambiguously identify at the point of declaration whether an entity will be eagerly or lazily emitted, right?  If you just store those initial edges into the lazily-emitted declarations graph and then initiate the recursive walk from them at the end of the translation unit, you'll only end up walking declarations that are actually relevant to your compilation, so you'll have much better locality and (if this matters to you) you'll naturally work a lot better with PCH and modules.
I will try the approach you suggested. Basically I will record the emitted functions and variables during parsing and use them as starting point for the final traversal.

This should work for CUDA/HIP. However it may be tricky for OpenMP since the emission of some entities depending on pragmas. Still it may be doable. If I encounter difficulty I will come back for discussion.

I will post the change for review.

Thanks.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D70172/new/

https://reviews.llvm.org/D70172





More information about the cfe-commits mailing list