[PATCH] D70172: [CUDA][HIP][OpenMP] Emit deferred diagnostics by a post-parsing AST travese

Wed Feb 19 15:33:05 PST 2020

rjmccall added inline comments.

================
Comment at: clang/lib/Sema/Sema.cpp:1514
+  void visitUsedDecl(SourceLocation Loc, Decl *D) {
+    if (auto *TD = dyn_cast<TranslationUnitDecl>(D)) {
+      for (auto *DD : TD->decls()) {
----------------
yaxunl wrote:
> rjmccall wrote:
> > erichkeane wrote:
> > > rjmccall wrote:
> > > > erichkeane wrote:
> > > > > Note that when recommitting this (if you choose to), this needs to also handle NamespaceDecl.  We're a downstream and discovered that this doesn't properly handle functions or records handled in a namespace.
> > > > > 
> > > > > It can be implemented identically to TranslationUnitDecl.
> > > > Wait, what?  We shouldn't be doing this for TranslationUnitDecl either.   I don't even know how we're "using" a TranslationUnitDecl, but neither this case not the case for `NamespaceDecl` should be recursively using every declaration declared inside it.  If there's a declaration in a namespace that's being used, it should be getting visited as part of the actual use of it.
> > > > 
> > > > The logic for `RecordDecl` has the same problem.  
> > > Despite the name, this seems to be more of a home-written ast walking class.  The entry point is the 'translation unit' which seems to walk through everything in an attempt to find all the functions (including those that are 'marked' as used by an attribute).
> > > 
> > > You'll see the FunctionDecl section makes this assumption as well (not necessarily that we got to a function via a call). IMO, this approach is strange, and we should register entry points in some manner (functions marked as emitted to the device in some fashion), then just follow its call-graph (via the clang::CallGraph?) to emit all of these functions.
> > > 
> > > It seemed really odd to see this approach here, but it seemed well reviewed by the time I noticed it (via a downstream bug) so I figured I'd lost my chance to disagree with the approach.
> > > 
> > > 
> > Sure, but `visitUsedDecl` isn't the right place to be entering the walk.  `visitUsedDecl` is supposed to be the *callback* from the walk.  If they need to walk all the global declarations to find kernels instead of tracking the kernels as they're encountered (which would be a *much* better approach), it should be done as a separate function.
> > 
> > I just missed this in the review.
> The deferred diagnostics could be initiated by non-kernel functions or even host functions.
> 
> Let's consider a device code library where no kernels are defined. A device function is emitted, which calls a host device function which has a deferred diagnostic. All device functions that are emitted need to be checked.
> 
> Same with host functions that are emitted, which may call a host device function which has deferred diagnostic.
> 
> Also not just function calls need to be checked. A function address may be taken then called through function pointer. Therefore any reference to a function needs to be followed.
> 
> In the case of OpenMP, the initialization of a global function pointer which refers a function may trigger a deferred diangostic. There are tests for that.
Right, I get that emitting deferred diagnostics for a declaration D needs to trigger any deferred diagnostics in declarations used by D, recursively.  You essentially have a graph of lazily-emitted declarations (which may or may not have deferred diagnostics) and a number of eagerly-emitted "root" declarations with use-edges leading into that graph.  Any declaration that's reachable from a root will need to be emitted and so needs to have any deferred diagnostics emitted as well.  My question is why you're finding these roots with a retroactive walk of the entire translation unit instead of either building a list of roots as you go or (better yet) building a list of lazily-emitted declarations that are used by those roots.  You can unambiguously identify at the point of declaration whether an entity will be eagerly or lazily emitted, right?  If you just store those initial edges into the lazily-emitted declarations graph and then initiate the recursive walk from them at the end of the translation unit, you'll only end up walking declarations that are actually relevant to your compilation, so you'll have much better locality and (if this matters to you) you'll naturally work a lot better with PCH and modules.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D70172/new/

https://reviews.llvm.org/D70172