[cfe-dev] [RFC] Improved address space conversion semantics for targets and language dialects
Anastasia Stulova via cfe-dev
cfe-dev at lists.llvm.org
Fri Mar 8 06:45:08 PST 2019
> The problem with modeling the address space conversion semantics on
> superspaces and subspaces is that these two concepts are orthogonal. A
> target or language could have address spaces which 'overlap' in some
> way, yet disallow implicit or explicit conversion between them. It
> could also have address spaces which do not overlap, but for which
> explicit or implicit conversion is permitted.
I am trying to understand how this could work. I think the current definition
of overlapping in the Embedded C TR expresses logical overlap, but not
necessarily physical overlap. My understanding is that if address spaces
overlap logically, they can be converted explicitly in both directions and
might be converted implicitly (depending on whether one address space is a
superset of the other). Logical overlap can imply that memory segments
physically overlap, but it need not. In the latter case it's just a logical
concept to simplify programming or compiler implementation. That's how we use
the generic address space in OpenCL, for example, which isn't a physical
memory segment.
So I am trying to understand why something that doesn't overlap (either
logically or physically) would still be convertible. I have found the current
overlap logic quite useful in various places for C++ (e.g. overload
resolution, where a subset is preferable to a superset), and changing it
might have wider implications.
Also (maybe this belongs in a separate discussion), for C++ specifically the
generic address space becomes really key, because it's used for implementing
the hidden 'this' parameter/expression. I am not convinced that its current
semantics, taken from C, are sufficient: it isn't the same as the default
address space, where the implementation decides to put objects by default.
Rather, it is an address space to which every other address space should be
allowed to convert unless there is a good reason not to (i.e. a logical
superset of all or most of the other address spaces). If there is no such
address space, the application developer will be forced to write all the
implicit operations/methods for each address space in which a class variable
can be declared. That is quite impractical, especially if no special logic is
needed for different address spaces!
> The method would initially consult any language address space
> conversion rules (such as conversion rules in OpenCL), and if no such
> rules apply, proceed to fall back on a TargetInfo hook.
> The TargetInfo hook would have the same format as the ASTContext
> method, but would return the validity of the conversion for the
> particular compilation target. The default behavior of this hook would
> be that all implicit address space conversions are disallowed, and all
> explicit conversions are permitted.
> (An alternative setup here would be that the ASTContext method queries
> the TargetInfo directly, and have the language semantics be defined in
> the TargetInfo base method instead. This would let targets override
> language semantics. I don't know if this is necessary, or desirable.)
I would vote against target rules ever overriding language ones. That would
reduce the portability of code among targets, which defeats the purpose of
the language mode in my view.
The first idea makes sense to me, i.e. use target rules if either address
space is a target address space (at or above FirstTargetAddressSpace), and
use language rules otherwise.
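As a rough illustration of that dispatch rule, here is a standalone C++ sketch
that picks the rule set from the two address spaces involved. The trimmed
LangAS enum and the names targetAS, RuleSource and selectRules are made up for
this example. Only FirstTargetAddressSpace and the isTargetAddressSpace check
are modeled on what Clang provides, and treating a mixed language/target pair
as a target decision is one reading of the rule, not a settled design.

```cpp
// Trimmed stand-in for clang::LangAS; the real enum has more entries.
enum class LangAS : unsigned {
  Default = 0,
  opencl_global,
  opencl_local,
  opencl_constant,
  opencl_private,
  opencl_generic,
  FirstTargetAddressSpace // target-specific address spaces start here
};

// Target address spaces are those at or above FirstTargetAddressSpace.
constexpr bool isTargetAddressSpace(LangAS AS) {
  return static_cast<unsigned>(AS) >=
         static_cast<unsigned>(LangAS::FirstTargetAddressSpace);
}

// Hypothetical helper: maps target address space number N to a LangAS value.
constexpr LangAS targetAS(unsigned N) {
  return static_cast<LangAS>(
      static_cast<unsigned>(LangAS::FirstTargetAddressSpace) + N);
}

// Hypothetical: which rule set gets to decide a given conversion.
enum class RuleSource { Language, Target };

// Target rules apply as soon as either side is a target address space;
// language rules apply only when both sides are language address spaces.
constexpr RuleSource selectRules(LangAS From, LangAS To) {
  return (isTargetAddressSpace(From) || isTargetAddressSpace(To))
             ? RuleSource::Target
             : RuleSource::Language;
}
```

With this split, a conversion between two OpenCL address spaces is always
decided by the language, so it behaves the same on every target.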
> * A patch which replaces the currently used methods for address space
> compatibility mentioned earlier (isAddressSpaceSupersetOf,
> isAddressSpaceOverlapping) with calls to the new methods in ASTContext.
> There are some other users of these methods, such as
> Qualifiers::compatiblyIncludes, but it's not entirely clear how to
> update these as a Qualifiers does not have access to ASTContext.
I wonder whether it could migrate to ASTContext or take ASTContext as a
parameter, although both might cause layering violations. :(
> * Possibly a patch to remove the old address space compatibility
> methods, if it can be determined that they are no longer needed.
I would prefer to migrate to the new implementation completely instead of
maintaining two approaches. We just need to make sure that the new logic
covers the old behavior along with any new functionality that is needed.
This should work, and it makes the code base more readable and maintainable.
However, if it's not possible to do this directly at the start, we can
replace the old methods gradually.
Overall, this work would be a good step forward towards better upstream
support for embedded and heterogeneous devices. One extra thought I have
(again, it might belong in a separate RFC) is whether it makes sense to
provide some way to define address space compatibility rules using some sort
of syntax in the language. The applications of this would be (a) portability
of code across different compilers and (b) portability of code among
accelerators in the same domain (e.g. ML, graphics, ...).
Thanks,
Anastasia
From: cfe-dev <cfe-dev-bounces at lists.llvm.org> on behalf of Bevin Hansson via cfe-dev <cfe-dev at lists.llvm.org>
Sent: 06 March 2019 18:11
To: cfe-dev at lists.llvm.org
Subject: [cfe-dev] [RFC] Improved address space conversion semantics for targets and language dialects
== Introduction ==
During the work that Anastasia has been doing on enabling OpenCL C++,
points have been raised about the state of address space support in
Clang. Currently, this support is rather ad-hoc. The representation of
address spaces in qualifiers and the lowering of Clang address spaces
to their LLVM counterparts are sound, but the behavioral semantics of
address spaces given by the Embedded C TR are not really sufficient to
model address space behaviors for arbitrary target architectures.
Here are some of the reviews in which this has come up:
* https://reviews.llvm.org/D58346
* https://reviews.llvm.org/D57464
Many address space semantics are locked behind the OpenCL language
option, even though those semantics would likely be applicable to
non-OpenCL cases as well. This means that, when not using any
particular address space-using language dialect, the address space
semantics are far too loosely defined. When using address spaces
outside of the ones defined in LangAS (the 'target' address spaces),
you can convert between any two address spaces explicitly, even though
this might not make sense on a particular target. There is no way for a
target to define which address spaces are compatible with each other.
Technically, this behavior is in accordance with the Embedded C TR
(explicitly converting between all address spaces is allowed, but
undefined if they aren't compatible), but I do not believe this
behavior is meaningful. If a target's address spaces are disjoint,
there is no reason to let a user convert between them, even with a
cast.
In order to make the support for address spaces more complete, general
and also useful for targets with a need to define more specific rules
for their address spaces, a generalization of the conversion semantics
for address spaces is needed.
== Current implementation ==
Currently, address space compatibility is defined in terms of
superspaces. An address space can encompass others, in which case it
would be considered a superset/superspace of the other address spaces.
Given two address spaces, Super and Sub, where Super is a superspace of
Sub, then it is valid to implicitly convert a `Sub T*` to a `Super T*`,
as all pointers to Sub are encompassed by pointers to Super. It is not
necessarily safe to implicitly convert in the other direction. Also, an
address space is a superspace of itself.
This is currently implemented in Qualifiers::isAddressSpaceSupersetOf.
The OpenCL __generic address space is the superspace of all other
address spaces except for the OpenCL __constant address space. This
method is used when checking pointer compatibility during assignment
(and other forms of initialization).
Explicitly converting (casting) between two address spaces is permitted
if either of them is a superspace of the other. This is implemented in
Qualifiers::isAddressSpaceOverlapping. This check is only done in
OpenCL mode; when using address spaces in regular C, explicit
conversion is always permitted.
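The two checks described above can be modeled in a few lines of standalone
C++. This is a simplified sketch and not a copy of the Clang source: the
trimmed LangAS enum is a stand-in, and the two functions are written as free
functions rather than as members of Qualifiers.

```cpp
// Trimmed stand-in for clang::LangAS; the real enum has more entries.
enum class LangAS : unsigned {
  Default = 0,
  opencl_global,
  opencl_local,
  opencl_constant,
  opencl_private,
  opencl_generic
};

// True if A is a superspace of B, i.e. a `B T*` may be implicitly converted
// to an `A T*`. Every address space is a superspace of itself, and
// __generic is a superspace of everything except __constant.
constexpr bool isAddressSpaceSupersetOf(LangAS A, LangAS B) {
  return A == B ||
         (A == LangAS::opencl_generic && B != LangAS::opencl_constant);
}

// True if an explicit cast between A and B is permitted in OpenCL mode:
// one of the two must be a superspace of the other.
constexpr bool isAddressSpaceOverlapping(LangAS A, LangAS B) {
  return isAddressSpaceSupersetOf(A, B) || isAddressSpaceSupersetOf(B, A);
}
```

Note that both answers are derived from the single superspace relation, so a
language or target cannot permit a cast without also declaring an overlap.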
== Issues ==
The problem with modeling the address space conversion semantics on
superspaces and subspaces is that these two concepts are orthogonal. A
target or language could have address spaces which 'overlap' in some
way, yet disallow implicit or explicit conversion between them. It
could also have address spaces which do not overlap, but for which
explicit or implicit conversion is permitted.
== Suggestion ==
The suggestion in this RFC to improve the way targets and languages in
Clang can express the semantics of address space conversions is to add
a mechanism to ASTContext and TargetInfo which lets us query if a
conversion from one address space to another address space is either:
* invalid
* valid implicitly
* valid explicitly
A suggestion for the interface on ASTContext would be

  bool isAddressSpaceConvertible(LangAS From, LangAS To, bool Explicit)
The method would initially consult any language address space
conversion rules (such as conversion rules in OpenCL), and if no such
rules apply, proceed to fall back on a TargetInfo hook.
The TargetInfo hook would have the same format as the ASTContext
method, but would return the validity of the conversion for the
particular compilation target. The default behavior of this hook would
be that all implicit address space conversions are disallowed, and all
explicit conversions are permitted.
(An alternative setup here would be that the ASTContext method queries
the TargetInfo directly, and have the language semantics be defined in
the TargetInfo base method instead. This would let targets override
language semantics. I don't know if this is necessary, or desirable.)
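To make the primary setup concrete (language rules first, then the TargetInfo
hook), here is a standalone C++ sketch. It is only a model under stated
assumptions: the ASTContext and TargetInfo classes below are tiny stand-ins
for the real Clang classes, and LangAS is trimmed. The Verdict enum,
openCLConversionRule, targetAS and DisjointTargetInfo are hypothetical names
introduced for the example. Treating From == To as always valid is also an
assumption, since the RFC text does not spell that case out.

```cpp
// Trimmed stand-in for clang::LangAS; the real enum has more entries.
enum class LangAS : unsigned {
  Default = 0,
  opencl_global,
  opencl_local,
  opencl_constant,
  opencl_private,
  opencl_generic,
  FirstTargetAddressSpace // target-specific address spaces start here
};

constexpr bool isTargetAddressSpace(LangAS AS) {
  return static_cast<unsigned>(AS) >=
         static_cast<unsigned>(LangAS::FirstTargetAddressSpace);
}

// Hypothetical helper: maps target address space number N to a LangAS value.
constexpr LangAS targetAS(unsigned N) {
  return static_cast<LangAS>(
      static_cast<unsigned>(LangAS::FirstTargetAddressSpace) + N);
}

// Stand-in for the target hook, with the default proposed in the RFC:
// no implicit conversions, all explicit conversions.
class TargetInfo {
public:
  virtual ~TargetInfo() = default;
  virtual bool isAddressSpaceConvertible(LangAS /*From*/, LangAS /*To*/,
                                         bool Explicit) const {
    return Explicit;
  }
};

// Hypothetical target whose address spaces are disjoint: not even a cast
// may convert between them.
class DisjointTargetInfo : public TargetInfo {
public:
  bool isAddressSpaceConvertible(LangAS, LangAS, bool) const override {
    return false;
  }
};

// Hypothetical three-way result so a language can say "no opinion".
enum class Verdict { NoRule, Invalid, Valid };

// Hypothetical OpenCL rule set modeled on the current superspace behavior.
// It has no opinion on target address spaces.
inline Verdict openCLConversionRule(LangAS From, LangAS To, bool Explicit) {
  if (isTargetAddressSpace(From) || isTargetAddressSpace(To))
    return Verdict::NoRule;
  bool ToGeneric =
      To == LangAS::opencl_generic && From != LangAS::opencl_constant;
  bool FromGeneric =
      From == LangAS::opencl_generic && To != LangAS::opencl_constant;
  bool OK = ToGeneric || (Explicit && FromGeneric);
  return OK ? Verdict::Valid : Verdict::Invalid;
}

// Stand-in for the ASTContext entry point.
class ASTContext {
  const TargetInfo &Target;
  bool OpenCL;

public:
  ASTContext(const TargetInfo &T, bool IsOpenCL)
      : Target(T), OpenCL(IsOpenCL) {}

  bool isAddressSpaceConvertible(LangAS From, LangAS To,
                                 bool Explicit) const {
    if (From == To) // assumption: identity is never an invalid conversion
      return true;
    if (OpenCL) {
      Verdict V = openCLConversionRule(From, To, Explicit);
      if (V != Verdict::NoRule)
        return V == Verdict::Valid;
    }
    // No language rule applied, so the target decides.
    return Target.isAddressSpaceConvertible(From, To, Explicit);
  }
};
```

In this arrangement a target can only decide conversions the language has no
rule for, so it cannot override language semantics. The alternative setup in
the parenthetical above would instead move the language rules into the
TargetInfo base method.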
It's important to point out that implicit validity should imply explicit
validity. If a call to this ASTContext method is made as below, and
returns true:

  Ctx.isAddressSpaceConvertible(From, To, false)

then the following should also return true:

  Ctx.isAddressSpaceConvertible(From, To, true)

If it did not, then `To T* p = from_ptr` would be permitted, but
`To T* p = (To T*)from_ptr` would not be, which is rather
counterintuitive.
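One way to keep this invariant honest would be a small exhaustive check over
the address spaces a language or target defines. The following standalone C++
sketch shows the idea. It is not an existing Clang facility: the AddrSpace
alias, the checker and both sample rule sets are hypothetical and exist only
for illustration.

```cpp
#include <vector>

// Opaque address space IDs; stand-ins for clang::LangAS values.
using AddrSpace = unsigned;

// Returns true if, for every ordered pair drawn from Spaces, a conversion
// that is valid implicitly is also valid explicitly.
template <typename Pred>
bool implicitImpliesExplicit(const std::vector<AddrSpace> &Spaces,
                             Pred IsConvertible) {
  for (AddrSpace From : Spaces)
    for (AddrSpace To : Spaces)
      if (IsConvertible(From, To, /*Explicit=*/false) &&
          !IsConvertible(From, To, /*Explicit=*/true))
        return false;
  return true;
}

// Hypothetical rule set that respects the invariant: space 0 acts as a
// superspace, and casts are additionally allowed from 0 to any space.
inline bool consistentRules(AddrSpace From, AddrSpace To, bool Explicit) {
  if (From == To || To == 0)
    return true;
  return Explicit && From == 0;
}

// Hypothetical rule set that violates it: 1 -> 2 is accepted implicitly
// but rejected when written as a cast.
inline bool brokenRules(AddrSpace From, AddrSpace To, bool Explicit) {
  if (From == To)
    return true;
  return From == 1 && To == 2 && !Explicit;
}
```

A check like this could run as a unit test over each target's hook, so a
rule set that accepts an assignment but rejects the equivalent cast is caught
early.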
== Necessary work ==
The steps to implement this RFC should be as follows:
* A patch which adds the aforementioned methods to ASTContext and
TargetInfo, and defines the necessary semantics for the default and
language-specific conversions in them.
* A patch which replaces the currently used methods for address space
compatibility mentioned earlier (isAddressSpaceSupersetOf,
isAddressSpaceOverlapping) with calls to the new methods in ASTContext.
There are some other users of these methods, such as
Qualifiers::compatiblyIncludes, but it's not entirely clear how to
update these as a Qualifiers does not have access to ASTContext.
* Possibly a patch to remove the old address space compatibility
methods, if it can be determined that they are no longer needed.
Thank you for reading!