[PATCH] D39505: [OpenMP] Show error if VLAs are not supported

Fri Nov 17 09:40:22 PST 2017

Hahnfeld marked an inline comment as done.
Hahnfeld added inline comments.

================
Comment at: include/clang/Basic/TargetInfo.h:944
+  /// \brief Whether target supports variable-length arrays.
+  bool isVLASupported() const { return VLASupported; }
+
----------------
rjmccall wrote:
> Hahnfeld wrote:
> > rjmccall wrote:
> > > ABataev wrote:
> > > > Hahnfeld wrote:
> > > > > rjmccall wrote:
> > > > > > Hahnfeld wrote:
> > > > > > > rjmccall wrote:
> > > > > > > > Hahnfeld wrote:
> > > > > > > > > rjmccall wrote:
> > > > > > > > > > ABataev wrote:
> > > > > > > > > > > ABataev wrote:
> > > > > > > > > > > > Hahnfeld wrote:
> > > > > > > > > > > > > rjmccall wrote:
> > > > > > > > > > > > > > Hahnfeld wrote:
> > > > > > > > > > > > > > > rjmccall wrote:
> > > > > > > > > > > > > > > > Hahnfeld wrote:
> > > > > > > > > > > > > > > > > rjmccall wrote:
> > > > > > > > > > > > > > > > > > The way you've written this makes it sound like "does the target support VLAs?", but the actual semantic checks treat it as "do OpenMP devices on this target support VLAs?"  Maybe there should be a more specific way to query things about OpenMP devices instead of setting a global flag for the target?
> > > > > > > > > > > > > > > > > Actually, the NVPTX and SPIR targets never support VLAs. So I felt like it would be more correct to make this a global property of the target.
> > > > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > > > The difference is that the other programming models (OpenCL and CUDA) error out immediatelyand regardless of the target because this limitation is reflected in the standards that disallow VLAs (see SemaType.cpp). For OpenMP we might have target devices that support VLA so we shouldn't error out for those.
> > > > > > > > > > > > > > > > If you want to make it a global property of the target, that's fine, but then I don't understand why your diagnostic only fires when (S.isInOpenMPDeclareTargetContext() || S.isInOpenMPTargetExecutionDirective()).
> > > > > > > > > > > > > > > That is because of how OpenMP offloading works and how it is implemented in Clang. Consider the following snippet from the added test case:
> > > > > > > > > > > > > > > ```lang=c
> > > > > > > > > > > > > > > int vla[arg];
> > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > #pragma omp target map(vla[0:arg])
> > > > > > > > > > > > > > > {
> > > > > > > > > > > > > > >    // more code here...
> > > > > > > > > > > > > > > }
> > > > > > > > > > > > > > > ```
> > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > Clang will take the following steps to compile this into a working binary for a GPU:
> > > > > > > > > > > > > > > 1. Parse and (semantically) analyze the code as-is for the host and produce LLVM Bitcode.
> > > > > > > > > > > > > > > 2. Parse and analyze again the code as-is and generate code for the offloading target, the GPU in this case.
> > > > > > > > > > > > > > > 3. Take LLVM Bitcode from 1., generate host binary and embed target binary from 3.
> > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > `OpenMPIsDevice` will be true for 2., but the complete source code is analyzed. So to not throw errors for the host code, we have to make sure that we are actually generating code for the target device. This is either in a `target` directive or in a `declare target` region.
> > > > > > > > > > > > > > > Note that this is quite similar to what CUDA does, only they have `CUDADiagIfDeviceCode` for this logic. If you want me to add something of that kind for OpenMP target devices, I'm fine with that. However for the given case, it's a bit different because this error should only be thrown for target devices that don't support VLAs...
> > > > > > > > > > > > > > I see.  So the entire translation unit is re-parsed and re-Sema'ed from scratch for the target?  Which means you need to avoid generating errors about things in the outer translation unit that aren't part of the target directive that you actually want to compile.  I would've expected there to be some existing mechanism for that, to be honest, as opposed to explicitly trying to suppress target-specific diagnostics one by one.
> > > > > > > > > > > > > Yes, that is my understanding. For errors, we don't need to take anything special as the first `cc1` invocation will exit with a non-zero status so that the driver stops the compilation. For warnings, there seems to be no mechanism in place as I see them duplicated, even in code that is not generate for the target device (verified with an unused variable).
> > > > > > > > > > > > > 
> > > > > > > > > > > > > @ABataev @gtbercea Do I miss something here?
> > > > > > > > > > > > I'm not aware of any.
> > > > > > > > > > > John, target-specific checks require some special flags (like LangOpts.Cuda) that are not set when we re-compile the code for OpenMP devices. That's why errors are not emitted for the non-target code. But also because of that, we need some special OpenMP checks for target-specific code inside the target regions. For example, code in lib/Sema/SemaType.cpp, lines 2184, 2185 (see this file in this patch) checks for Cuda compilation and prohibits using of VLAs in Cuda mode. We also should prohibit using of VLAs in target code for NVPTX devices or other devices that do not support VLAs in OpenMP mode.
> > > > > > > > > > I think it would be cleaner here, and better for our OpenMP support overall, if we found a more general way to suppress unwanted diagnostics in the second invocation for code outside of the target directive.  This check (and several others) would then just implement a more general target feature disabling VLA support instead of being awkwardly OpenMP-specific.
> > > > > > > > > I think to get this we would need to make `Diag` a no-op `if (Context.getLangOpts().OpenMPIsDevice && !(isInOpenMPDeclareTargetContext() || isInOpenMPTargetExecutionDirective()))`. This would ignore all diagnostics outside of the code is really generated in the end...
> > > > > > > > I mean, the danger of this approach is that you don't really want to suppress diagnostics for top-level declarations: it can leave you with an invalid AST, and it is not valid to generate IR from an invalid AST.
> > > > > > > > 
> > > > > > > > Sorry for the ignorant questions that follow, but I assume the OpenMP spec must bless this double-translation somehow, and I'd like to understand more about that in order to advise how to proceed.  How does OpenMP handle the possibility that the code will be processed substantially differently for different targets?  Is there some rule in the spec saying that the code has to expand "the same" in both targets?  How does that work when e.g. size_t might have a different size or use a completely different type?  More generally, how do expect that this feature will work in the more complicated language modes, like OpenMP + C++?
> > > > > > > So if there is an error, the analysis will already fail on the host. I think that guarantees that we don't end up with an invalid AST and will at most suppress duplicate warnings.
> > > > > > > 
> > > > > > > Regarding the OpenMP spec: I think the unsatisfying answer is that the spec doesn't say what it expects on that questions. So I think the compiler has to do what seems reasonable...
> > > > > > Well, what I'm worried about is the possibility that something changes about the translation unit when it's reprocessed for the target — e.g. there's a target-dependent #if that causes an error in the target TU, but not in the original TU, so that the suppressed error is the only reason that the build fails.
> > > > > > 
> > > > > > If the spec is unclear about this, then we just have to muddle through.  Is this "reparse the whole translation unit for the target" the prevailing implementation technique for target directives?
> > > > > If we are worried about that scenario we have the preserve the current state: Do nothing, diagnose everything and let the user figure out if there is an error in the code.
> > > > > 
> > > > > I can't really comment on what other compilers (GCC, Intel) do, but at least for GCC you compile a complete compiler for the target, so I suppose they kind of do the same...
> > > > John, reparsing is required to compile the target-specific code for the particular device. This is an intended behavior. Otherwise we may use some host-specific code on the device. Reparsing allows to reinclude all includes so that the code uses all the definitions for the target instead of those used for the host.
> > > > Intel and gcc are doing the same
> > > Okay, thanks, that's all I was asking.
> > > 
> > > I am still concerned that this is going to gradually grow to encompass basically every "XX feature is disabled on this target" diagnostic.  We could solve this more elegantly by tagging such diagnostics in the .td file and then introducing a diagnostic filter that dropped such diagnostics except when we're inside a target directive.
> > > 
> > > Even if you don't do that, I'm going to insist that you rewrite the test in SemaType so that it actually fires in normal code if the target disables VLAs.  The OpenMP-specific logic should be to suppress the diagnostic if we happen to be in target mode and not within a target directive.
> > > 
> > > Also, it seems to me that the "are we inside a target directive" check is inadequate to deal with lazy-code-emission features like static functions in C and templates in C++.  The diagnostic must be emitted for a violation in a function body if and only if the function is used from within the target directive.  That is, unless there's some very coarse language restriction like forbidding external uses in target directives?
> > > 
> > > Also, shouldn't the "are we in a target directive" be checking that we're in a *matching* target directive?  Surely a single OpenMP file can contain target directives for multiple external targets?
> > Ok, so going forward you suggest something like the following?
> > ```lang=C++
> >   // Some targets don't support VLAs.
> >   if (!Context.getTargetInfo().isVLASupported() && T->isVariableArrayType() &&
> >       (!getLangOpts().OpenMPIsDevice || isInOpenMPDeclareTargetContext() ||
> >       isInOpenMPTargetExecutionDirective())) {
> >     Diag(Loc, diag::err_target_no_vla_support);
> >     return QualType();
> >   }
> > ```
> > 
> > Doesn't lazy-code emission check the code semantically? Yes, the OpenMP standard (currently) mandates that all external functions that are called must be in a `declare target` region. And all regions in a file are compiled for all targets, so there shouldn't be a "non-matching" target directive.
> I would check for a VAT before checking whether VLAs are supported, but yes, that's the right idea.  Please also consider just adding this check to where we build the VAT in the first place.
> 
> We don't (typically) do any extra semantic checks on a function body when it's first used, but if the standard requires all functions that are called to be in a target region, that should be good enough.  And ok, I didn't understand that about targets; sounds good.
I think I've implemented what we discussed here, please let me know if you disagree. I've also tried to add tests with templates, but `declare target` is currently broken for that use case, see https://bugs.llvm.org/show_bug.cgi?id=35348.

https://reviews.llvm.org/D39505