<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=Windows-1252">
<style type="text/css" style="display:none;"> P {margin-top:0;margin-bottom:0;} </style>
</head>
<body dir="ltr">
<div style="font-family: Calibri, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
Hi Richard,</div>
<div style="font-family: Calibri, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
<br>
</div>
<div style="font-family: Calibri, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
I appreciate your effort with this RFC.</div>
<div style="font-family: Calibri, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
<br>
</div>
<div style="font-family: Calibri, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
<pre><span style="font-size: 10pt;">> Using intrinsics might seem old-fashioned when there are various
frameworks that express data-parallel algorithms in a more abstract way,
or libraries like P0214 (std::simd) that provide mostly performance-
portable vector interfaces. But in practice, each vector architecture
has its own quirks and unique features that aren't easy for the compiler
to use automatically and aren't performance-portable enough to be part
of a generic interface. So even though target-neutral approaches are a
very welcome development, they're not a complete solution. Intrinsics
are still vital when you really want to hand-optimise a routine for a
particular architecture. And that's still a common requirement.
For example, Arm has been porting various codebases that already support
AArch64 AdvSIMD intrinsics to SVE2. Even though AdvSIMD and SVE2 have
some features in common, the routines for the two architectures are
often significantly different from each other (and in ways that can't be
abstracted by interfaces like std::simd). We need to have direct access
to SVE2 features for this kind of work.</span></pre>
I am +1000 on using intrinsic functions. Internally, we have discussed supporting this type. For instance, how can we implement vector swizzles like ".xyz" or "hi/lo"? At the moment, Clang uses shufflevector to implement them. I guess we would want
to swizzle the vector per vector unit, which is unknown at compile time. I am not sure we can implement that efficiently with LLVM's current IR vector operations. We could miss instruction-combining or other optimization opportunities. However, I guess it would not
be easy for the passes to handle operations on this type. If I have missed something, please let me know.</div>
<div style="font-family: Calibri, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
<br>
</div>
<div style="font-family: Calibri, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
Thanks,</div>
<div style="font-family: Calibri, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
JinGu Kang</div>
<div>
<div id="appendonsend"></div>
<div style="font-family:Calibri,Arial,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0)">
<br>
</div>
<hr tabindex="-1" style="display:inline-block; width:98%">
<div id="divRplyFwdMsg" dir="ltr"><font face="Calibri, sans-serif" color="#000000" style="font-size:11pt"><b>From:</b> cfe-dev <cfe-dev-bounces@lists.llvm.org> on behalf of Richard Sandiford via cfe-dev <cfe-dev@lists.llvm.org><br>
<b>Sent:</b> 06 June 2019 16:55<br>
<b>To:</b> cfe-dev@lists.llvm.org<br>
<b>Subject:</b> [cfe-dev] RFC: Adding vscale vector types to C and C++</font>
<div> </div>
</div>
<div class="BodyFragment"><font size="2"><span style="font-size:11pt">
<div class="PlainText">LLVM now supports a "scalable" vector type:<br>
<br>
<vscale x N x ELT> (e.g. <vscale x 4 x i32>)<br>
<br>
that represents a vector of X*N ELTs for some runtime value X<br>
[<a href="https://reviews.llvm.org/D32530">https://reviews.llvm.org/D32530</a>]. The number of elements is therefore<br>
not known at compile time and can depend on choices made by the execution<br>
environment. This RFC is about how we can provide C and C++ types that<br>
map to this LLVM type.<br>
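<br>
As a rough worked example of the element count, the sketch below computes<br>
X*N for <vscale x 4 x i32>. It assumes the SVE convention that X is the<br>
vector length in bits divided by 128; the helper names are made up for<br>
illustration and are not part of LLVM or Clang.<br>
<pre>
```c
#include <stddef.h>

/* Hypothetical helper: number of elements in <vscale x n x ELT>
   when the runtime multiplier (vscale) is x. */
static size_t scalable_elements(size_t n, size_t x)
{
    return x * n;
}

/* Assumption (SVE): vscale is the vector length in bits divided by 128. */
static size_t sve_vscale(size_t vector_bits)
{
    return vector_bits / 128;
}
```
</pre>
Under that assumption, a 128-bit implementation gives 4 elements and a<br>
512-bit implementation gives 16.<br>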
<br>
The main complication is that, because the number of elements isn't<br>
known at compile time, "sizeof" can't work in the same way as it does<br>
for normal vector types. Our suggested fix for this is to separate the<br>
concept of "complete type" into two:<br>
<br>
* does the type have enough information to construct objects of that type?<br>
<br>
For want of a better term, types that have this property are<br>
"definite" while types that don't are "indefinite".<br>
<br>
* will it be possible to measure the size of the type using "sizeof",<br>
once the type is definite?<br>
<br>
If so, the type is "sized", otherwise it is "sizeless".<br>
<br>
"Complete" is then equivalent to "sized and definite". The new scalable<br>
vectors are definite but sizeless, and so are never complete.<br>
<br>
We can then redefine certain rules to use the distinction between<br>
definite and indefinite types rather than complete and incomplete types.<br>
(This is a simple change to make in Clang.) Things like "sizeof" and<br>
pointer arithmetic continue to require complete types, and so are invalid<br>
for the new types. See below for a more detailed description.<br>
<br>
We're also proposing to treat the new C and C++ types as opaque built-in<br>
types rather than first-class vector types, for two reasons:<br>
<br>
(1) It means that we don't need to define what the "vscale" is for<br>
all targets, or emulate general vscale operations for all targets.<br>
We can just provide the types that the target supports natively,<br>
and for which the target already has a defined ABI.<br>
<br>
(2) It allows for more abstraction. For example, SVE has scalable types<br>
that are logically tuples of 2, 3 or 4 vectors. Defining them as opaque<br>
built-in types means that we don't need to treat them as single vectors<br>
in C and C++, even if that happens to be how LLVM represents them.<br>
Building tuple types into the compiler also means that we don't need<br>
to support scalable vectors in structures or arrays.<br>
<br>
In case this looks familiar...<br>
==============================<br>
<br>
This is a refresh of an RFC I sent out last year<br>
[<a href="http://lists.llvm.org/pipermail/cfe-dev/2018-May/057830.html">http://lists.llvm.org/pipermail/cfe-dev/2018-May/057830.html</a>].<br>
The details are basically the same, except that we're no longer<br>
proposing to support user-defined sizeless types. The reason for<br>
sending the RFC again is that (unlike last time) LLVM does now support<br>
the underlying scalable vectors. The patches are therefore less<br>
speculative than they were before.<br>
<br>
Those on WG21 might also remember that sizeless types were used as<br>
a possible basis for a proposal to make P0214 support scalable vectors<br>
[<a href="http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2018/p1101r0.html">http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2018/p1101r0.html</a>].<br>
It was clear from the committee meeting that modifying P0214 in this<br>
way wasn't acceptable and this message isn't an attempt to revive<br>
that discussion. All we're trying to do with this RFC is make<br>
Clang support opaque built-in types that map to LLVM vscale types.<br>
(In particular, there's no __sizeless_struct, or any other attempt<br>
to support aggregates of sizeless types.)<br>
<br>
Why the extension is needed<br>
===========================<br>
<br>
We need these scalable types in the AArch64 port so that we can provide<br>
low-level access to the SVE and SVE2 vector extensions. More information<br>
on the extensions is available here:<br>
<br>
<a href="https://community.arm.com/developer/ip-products/processors/b/processors-ip-blog/posts/new-technologies-for-the-arm-a-profile-architecture">
https://community.arm.com/developer/ip-products/processors/b/processors-ip-blog/posts/new-technologies-for-the-arm-a-profile-architecture</a><br>
<br>
but the only feature that really matters for this RFC is that they have<br>
no fixed or preferred vector length. Processors that implement SVE can<br>
instead choose from a range of possible vector lengths. This means that<br>
in many environments, the actual vector length is only known at runtime.<br>
<br>
SVE has been designed so that one piece of "length-agnostic" code can<br>
work for all vector lengths. The new scalable types provide the basis<br>
for writing such code in C and C++. Specifically:<br>
<br>
* As with other vector architectures, it's possible to pass and return<br>
vectors in registers when calling other functions. This is particularly<br>
useful for things like vector libm routines. We need a C and C++<br>
representation of the vector types in order to write such functions.<br>
<br>
* Again as for other vector architectures, we have a set of intrinsic<br>
functions that provide low-level access to the architecture, as a<br>
last line of defence before dropping to assembly. This again needs<br>
scalable types that can hold temporary working data and that can be<br>
passed to and returned from intrinsic functions.<br>
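<br>
A minimal sketch of such length-agnostic code is given here. It uses<br>
intrinsics from the SVE ACLE when the compiler defines __ARM_FEATURE_SVE<br>
and falls back to a scalar loop otherwise, so that it builds on any target.<br>
The routine name add_f32 is hypothetical. The point is that the svfloat32_t<br>
temporaries are automatic variables whose size is never needed at compile<br>
time.<br>
<pre>
```c
#include <stdint.h>

#if defined(__ARM_FEATURE_SVE)
#include <arm_sve.h>
#endif

/* Hypothetical example routine: dst[i] = a[i] + b[i] for i in [0, n). */
void add_f32(float *dst, const float *a, const float *b, int64_t n)
{
#if defined(__ARM_FEATURE_SVE)
    /* svcntw() is the number of 32-bit elements per vector (a runtime value). */
    for (int64_t i = 0; i < n; i += (int64_t)svcntw()) {
        /* Predicate that is true only for lanes i..n-1, so the loop tail
           needs no special case. */
        svbool_t pg = svwhilelt_b32(i, n);
        svfloat32_t va = svld1_f32(pg, a + i);  /* sizeless automatic variable */
        svfloat32_t vb = svld1_f32(pg, b + i);
        svst1_f32(pg, dst + i, svadd_f32_x(pg, va, vb));
    }
#else
    /* Scalar fallback so that the sketch builds on targets without SVE. */
    for (int64_t i = 0; i < n; ++i)
        dst[i] = a[i] + b[i];
#endif
}
```
</pre>
The same source works for every vector length because the loop step and<br>
the predicate are both computed at run time.<br>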
<br>
Using intrinsics might seem old-fashioned when there are various<br>
frameworks that express data-parallel algorithms in a more abstract way,<br>
or libraries like P0214 (std::simd) that provide mostly performance-<br>
portable vector interfaces. But in practice, each vector architecture<br>
has its own quirks and unique features that aren't easy for the compiler<br>
to use automatically and aren't performance-portable enough to be part<br>
of a generic interface. So even though target-neutral approaches are a<br>
very welcome development, they're not a complete solution. Intrinsics<br>
are still vital when you really want to hand-optimise a routine for a<br>
particular architecture. And that's still a common requirement.<br>
<br>
For example, Arm has been porting various codebases that already support<br>
AArch64 AdvSIMD intrinsics to SVE2. Even though AdvSIMD and SVE2 have<br>
some features in common, the routines for the two architectures are<br>
often significantly different from each other (and in ways that can't be<br>
abstracted by interfaces like std::simd). We need to have direct access<br>
to SVE2 features for this kind of work.<br>
<br>
Implementation<br>
==============<br>
<br>
I've uploaded a Clang implementation to Phabricator. There are three parts:<br>
<br>
<a href="https://reviews.llvm.org/D62960">https://reviews.llvm.org/D62960</a><br>
<br>
Adds some SVE types that can be used to test the next two patches.<br>
This is a respin of Graham's patch [<a href="https://reviews.llvm.org/D59245">https://reviews.llvm.org/D59245</a>]<br>
with some minor updates.<br>
<br>
The patch isn't really part of the RFC, but if you have any<br>
comments about defining the types this way, please let us know!<br>
<br>
<a href="https://reviews.llvm.org/D62961">https://reviews.llvm.org/D62961</a><br>
<br>
Adds new type queries isSizeless and isIndefinite.<br>
<br>
<a href="https://reviews.llvm.org/D62962">https://reviews.llvm.org/D62962</a><br>
<br>
The Clang support itself, including documentation and testcases.<br>
<br>
Criteria for clang extensions<br>
=============================<br>
<br>
From the list on [<a href="http://clang.llvm.org/get_involved.html">http://clang.llvm.org/get_involved.html</a>],<br>
an extension needs:<br>
<br>
(1) Evidence of a significant user community<br>
<br>
The extension allows SVE intrinsics to be used in places that<br>
currently use intrinsics for other vector architectures. There is<br>
already one public project that uses the SVE intrinsics[1] and one<br>
that specifically considered SVE support as part of its design<br>
philosophy[2]. Arm has patches to add SVE and SVE2 support to<br>
several other projects, but they're gated on the Clang support.<br>
<br>
[1] <a href="https://github.com/nmeyer-ur/Grid">https://github.com/nmeyer-ur/Grid</a><br>
[2] <a href="https://github.com/google/pik/tree/master/pik/simd">https://github.com/google/pik/tree/master/pik/simd</a><br>
<br>
(2) A specific need to reside within the Clang tree<br>
<br>
The extension involves (small) changes to the core type system.<br>
It's also part of supporting target-specific intrinsics, which<br>
would normally be part of Clang even without the scalable type<br>
aspect.<br>
<br>
(3) A complete specification<br>
<br>
See the documentation and language edits in the patch for<br>
the specification (also copied below for inline replies).<br>
<br>
(4) Representation within the appropriate governing organization<br>
<br>
It doesn't seem appropriate to try to standardise the extension<br>
at this stage, since the only way to use the extension is through<br>
target-specific interfaces. The extension doesn't provide any<br>
benefit that's independent of those interfaces.<br>
<br>
So at the moment this is really in the realm of target-specific<br>
language extensions rather than generic language extensions.<br>
This may of course change later.<br>
<br>
(5) A long-term support plan<br>
<br>
Arm is very much committed to supporting this.<br>
<br>
(6) A high-quality implementation<br>
<br>
I'd like feedback on whether the current patch qualifies. :-)<br>
<br>
(7) A proper test suite<br>
<br>
The tests in the patch cover each functional change to the source,<br>
except as noted in the patch description. The implementation of the<br>
SVE ACLE will provide further coverage.<br>
<br>
Following a suggestion from Renato in a different context, I've now<br>
put the main discussion and justification in the documentation part<br>
of the patch. I've copied it below as well for inline replies.<br>
<br>
Thanks,<br>
Richard<br>
<br>
<br>
<br>
==============<br>
Sizeless types<br>
==============<br>
<br>
As an extension, Clang supports the concept of “sizeless” object types in<br>
both C and C++. The types are so called because it is an error to measure<br>
their size directly using ``sizeof`` or indirectly via operations like<br>
pointer arithmetic.<br>
<br>
Forbidding ``sizeof`` and related operations means that the amount of<br>
data that the types contain does not need to be a compile-time constant.<br>
It can instead depend on runtime properties, and for example can adapt<br>
to different hardware configurations.<br>
<br>
Sizeless types are only intended for objects that hold temporary working<br>
data, such as “scalable” or variable-length vectors. They are not<br>
intended for long-term storage and cannot be used in aggregates.<br>
<br>
At present, the only sizeless types that Clang provides are:<br>
<br>
AArch64 SVE vector types<br>
These vector types are built into the compiler under names like<br>
``__SVInt8_t``, as required by the `Procedure Call Standard for the<br>
Arm® 64-bit Architecture`_. They represent the longest vector of a<br>
particular element type that can be stored in an SVE vector register.<br>
Functions can pass and return these vectors in registers.<br>
<br>
The header file ``<arm_sve.h>`` makes the types available under more<br>
user-friendly names like ``svint8_t``. It also provides a set of<br>
intrinsic functions for operating on the types. See the `ARM C<br>
Language Extensions for SVE`_ for more information about these types<br>
and intrinsics.<br>
<br>
.. _Procedure Call Standard for the Arm® 64-bit Architecture:<br>
<a href="https://developer.arm.com/docs/ihi0055/latest/">https://developer.arm.com/docs/ihi0055/latest/</a><br>
.. _ARM C Language Extensions for SVE:<br>
<a href="https://developer.arm.com/docs/100987/latest">https://developer.arm.com/docs/100987/latest</a><br>
<br>
`ARM C Language Extensions for SVE`_ contains the original specification of<br>
sizeless types, but the description below is intended to be self-contained.<br>
<br>
Outline of the type system changes<br>
==================================<br>
<br>
C and C++ classify object types as “complete” (the size of objects<br>
of that type can be calculated) or “incomplete” (the size of objects<br>
of that type cannot be calculated). There is very little you can do with<br>
a type until it becomes complete.<br>
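<br>
The standard behaviour can be seen with an ordinary structure type.<br>
The sketch below is plain C11 with no sizeless types involved; the names<br>
node, head and node_size are purely illustrative.<br>
<pre>
```c
#include <stddef.h>

struct node;                      /* declared but not defined: incomplete */

/* Pointers to an incomplete type are fine ... */
static struct node *head = NULL;

/* ... but sizeof(struct node), or a variable of type struct node, would be
   rejected at this point because the size cannot be calculated yet. */

struct node {                     /* the type is complete from here on */
    int value;
    struct node *next;
};

/* Hypothetical helper: now that the type is complete, objects can be
   created and measured. */
static size_t node_size(void)
{
    struct node n = { 42, head };
    return sizeof n;
}
```
</pre>
Sizeless types differ in that they never reach the second stage: objects<br>
can be created, but ``sizeof`` stays invalid.<br>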
<br>
This categorization implicitly ties two concepts: whether it is possible<br>
to manipulate objects of a particular type, and whether it is possible<br>
to measure their size (which in C++ must be constant). The key idea<br>
behind the sizeless type extension is to split these concepts apart.<br>
<br>
To do this, the extension classifies types as:<br>
<br>
* “indefinite” (lacking sufficient information to create an object of<br>
that type) or “definite” (having sufficient information)<br>
<br>
* “sized” (will have a measurable size when definite) or “sizeless”<br>
(will never have a measurable size)<br>
<br>
* “incomplete” (lacking sufficient information to determine the size of<br>
objects of that type) or “complete” (having sufficient information)<br>
<br>
where the wording for the final bullet is taken verbatim from the<br>
C standard. All standard types are “sized” (even ``void``, although<br>
it is always indefinite).<br>
<br>
The idea is that “definite” types are as fully-defined as they<br>
ever can be, even if their size is still not known at compile time.<br>
“Complete” is then equivalent to “sized and definite”.<br>
<br>
On its own, this puts sizeless types into a similar position<br>
to incomplete structure types, which is conservatively correct<br>
but severely limits what the types can do.<br>
<br>
The next step is to relax certain rules so that they use the distinction<br>
between “indefinite” and “definite” rather than “incomplete” and “complete”.<br>
The goal of this process is to allow:<br>
<br>
* automatic variables with sizeless type<br>
* function parameters and return values with sizeless type<br>
* use of sizeless types with ``_Generic``<br>
* pointers to sizeless types<br>
* applying ``typeid`` to a sizeless type<br>
* use of sizeless types with C++ type traits<br>
<br>
In contrast, the following must remain invalid, by keeping the usual rules<br>
for incomplete types unchanged:<br>
<br>
* using ``sizeof``, ``_Alignof`` and ``alignof`` with a sizeless type<br>
(or object of sizeless type)<br>
* creating or accessing arrays that have sizeless type<br>
* doing pointer arithmetic on pointers to sizeless types<br>
* unions or structures with sizeless members<br>
* applying ``_Atomic`` to a sizeless type<br>
* throwing or catching objects of sizeless type<br>
* capturing sizeless objects by value in lambda expressions<br>
<br>
There is also an extra restriction:<br>
<br>
* variables with sizeless type must not have static or thread-local<br>
storage duration<br>
<br>
In practice it is impossible to *define* such variables with incomplete type,<br>
but having an explicit rule means that things like:<br>
<br>
.. code-block:: c<br>
<br>
extern __SVInt8_t foo;<br>
<br>
are outright invalid rather than simply useless (because no other<br>
translation unit could ever define ``foo``). Similarly, without an<br>
explicit rule:<br>
<br>
.. code-block:: c<br>
<br>
__SVInt8_t foo;<br>
<br>
would be a valid tentative definition at the point it occurs and only<br>
become invalid at the end of the translation unit, because ``__SVInt8_t``<br>
is never completed.<br>
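<br>
For comparison, the usual rules are what make the following plain C valid:<br>
a tentative definition whose type is incomplete at the point of declaration<br>
but is completed before the end of the translation unit. This sketch uses<br>
an ordinary structure rather than a sizeless type, and the names point,<br>
origin and origin_sum are illustrative only.<br>
<pre>
```c
struct point;            /* incomplete at this point */

struct point origin;     /* tentative definition with (so far) incomplete type */

/* Completing the type later in the same translation unit makes the
   tentative definition above valid. */
struct point {
    int x;
    int y;
};

/* Hypothetical accessor, used only to show that origin is a real,
   zero-initialised object. */
int origin_sum(void)
{
    return origin.x + origin.y;
}
```
</pre>
A sizeless type can never be completed in this way, which is why the<br>
explicit rule rejects the declaration immediately.<br>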
<br>
Edits to the standards<br>
======================<br>
<br>
Edits to the C standard<br>
-----------------------<br>
<br>
This section specifies the behavior for sizeless types in C, as an edit<br>
to the N1570 draft of C11.<br>
<br>
6.2.5 Types<br>
~~~~~~~~~~~<br>
<br>
In 6.2.5p1, replace:<br>
<br>
At various points within a translation unit an object type may be<br>
*incomplete* …<br>
<br>
onwards with:<br>
<br>
Object types are further partitioned into *sized* and *sizeless*; all<br>
basic and derived types defined in this standard are sized, but an<br>
implementation may provide additional sizeless types.<br>
<br>
and add two additional clauses:<br>
<br>
* At various points within a translation unit an object type may be<br>
*indefinite* (lacking sufficient information to construct an object<br>
of that type) or *definite* (having sufficient information).<br>
An object type is said to be *complete* if it is both sized and<br>
definite; all other object types are said to be *incomplete*.<br>
Complete types have sufficient information to determine the size<br>
of an object of that type while incomplete types do not.<br>
<br>
* Arrays, structures, unions and enumerated types are always sized,<br>
so for them the term *incomplete* is equivalent to (and used<br>
interchangeably with) the term *indefinite*.<br>
<br>
Change 6.2.5p19 to:<br>
<br>
The void type comprises an empty set of values; it is a sized<br>
indefinite object type that cannot be completed (made definite).<br>
<br>
Replace “incomplete” with “indefinite” and “complete” with “definite” in<br>
6.2.5p37, which describes how a type's state can change throughout a<br>
translation unit.<br>
<br>
6.3.2.1 Lvalues, arrays, and function designators<br>
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~<br>
<br>
Replace “incomplete” with “indefinite” in 6.3.2.1p1, so that lvalues of<br>
sizeless definite type can be modifiable lvalues.<br>
<br>
Make the same replacement in 6.3.2.1p2, to prevent undefined behavior<br>
when lvalues have sizeless definite type.<br>
<br>
6.5.1.1 Generic selection<br>
~~~~~~~~~~~~~~~~~~~~~~~~~<br>
<br>
Replace “complete object type” with “definite object type” in 6.5.1.1p2,<br>
so that the type name in a generic association can be a sizeless definite<br>
type.<br>
<br>
6.5.2.2 Function calls<br>
~~~~~~~~~~~~~~~~~~~~~~<br>
<br>
Replace “complete object type” with “definite object type” in 6.5.2.2p1,<br>
so that functions can return sizeless definite types.<br>
<br>
Make the same change in 6.5.2.2p4, so that arguments can also have<br>
sizeless definite type.<br>
<br>
6.5.2.5 Compound literals<br>
~~~~~~~~~~~~~~~~~~~~~~~~~<br>
<br>
Replace “complete object type” with “definite object type” in 6.5.2.5p1,<br>
so that compound literals can have sizeless definite type.<br>
<br>
6.7 Declarations<br>
~~~~~~~~~~~~~~~~<br>
<br>
Insert the following new clause after 6.7p4:<br>
<br>
* If an identifier for an object does not have automatic storage duration,<br>
its type must be sized rather than sizeless.<br>
<br>
Replace “complete” with “definite” in 6.7p7, which describes when the<br>
type of an object becomes definite.<br>
<br>
6.7.6.3 Function declarators (including prototypes)<br>
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~<br>
<br>
Replace “incomplete type” with “indefinite type” in 6.7.6.3p4, so that<br>
parameters can also have sizeless definite type.<br>
<br>
Make the same change in 6.7.6.3p12, which allows even indefinite types<br>
to be function parameters if no function definition is present.<br>
<br>
6.7.9 Initialization<br>
~~~~~~~~~~~~~~~~~~~~<br>
<br>
Replace “complete object type” with “definite object type” in 6.7.9p3,<br>
to allow initialization of identifiers with sizeless definite type.<br>
<br>
6.9.1 Function definitions<br>
~~~~~~~~~~~~~~~~~~~~~~~~~~<br>
<br>
Replace “complete object type” with “definite object type” in 6.9.1p3,<br>
so that functions can return sizeless definite types.<br>
<br>
Make the same change in 6.9.1p7, so that adjusted parameter types can be<br>
sizeless definite types.<br>
<br>
J.2 Undefined behavior<br>
~~~~~~~~~~~~~~~~~~~~~~<br>
<br>
Update the entries that refer to the clauses above.<br>
<br>
Edits to the C++ standard<br>
-------------------------<br>
<br>
This section specifies the behavior for sizeless types in C++,<br>
as an edit to the N3797 draft of C++14.<br>
<br>
3.1 Declarations and definitions [basic.def]<br>
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~<br>
<br>
Replace “incomplete” with “indefinite” in [basic.def]p5, so that definitions<br>
of an object can give it sizeless definite type. Add a further clause<br>
after [basic.def]p5:<br>
<br>
* A program is ill-formed if any declaration of an object gives it both<br>
a sizeless type and either static or thread-local storage duration.<br>
<br>
3.9 Types [basic.types]<br>
~~~~~~~~~~~~~~~~~~~~~~~<br>
<br>
Replace [basic.types]p5 with:<br>
<br>
A class that has been declared but not defined, an enumeration type<br>
in certain contexts (7.2), or an array of unknown size or of<br>
indefinite element type, is an indefinite object type.45)<br>
Indefinite object types and the void types are indefinite types (3.9.1).<br>
Objects shall not be defined to have an indefinite type.<br>
<br>
and add three additional clauses:<br>
<br>
* Object and void types are further partitioned into *sized* and *sizeless*;<br>
all basic and derived types defined in this standard are sized, but an<br>
implementation may provide additional sizeless types.<br>
<br>
* An object or void type is said to be *complete* if it is both sized and<br>
definite; all other object and void types are said to be *incomplete*.<br>
The term *completely-defined object type* is synonymous with *complete<br>
object type*.<br>
<br>
* Arrays, class types and enumeration types are always sized, so for<br>
them the term *incomplete* is equivalent to (and used interchangeably<br>
with) the term *indefinite*.<br>
<br>
(Note that the wording of footnote 45 continues to apply as-is.)<br>
<br>
Also replace “incomplete” with “indefinite” in the forward reference<br>
in [basic.types]p7.<br>
<br>
3.9.1 Fundamental Types [basic.fundamental]<br>
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~<br>
<br>
In [basic.fundamental]p9, replace the second sentence with:<br>
<br>
The void type is a sized indefinite type that cannot be completed<br>
(made definite).<br>
<br>
leaving the rest of the clause unchanged.<br>
<br>
3.9.2. Compound Types [basic.compound]<br>
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~<br>
<br>
In this part of [basic.compound]p3:<br>
<br>
Pointers to incomplete types are allowed although there are<br>
restrictions on what can be done with them …<br>
<br>
add “(including indefinite types)” after “incomplete types”.<br>
<br>
3.10 Lvalues and rvalues [basic.lval]<br>
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~<br>
<br>
Replace “complete” with “definite” and “incomplete” with “indefinite” in<br>
[basic.lval]p4, so that prvalues can have definite type and (in contrast)<br>
glvalues can have indefinite type.<br>
<br>
Replace “incomplete” with “indefinite” and “complete” with “definite” in<br>
[basic.lval]p7, so that the target of a pointer can be modifiable if it has<br>
sizeless definite type.<br>
<br>
4.1 Lvalue-to-rvalue conversion [conv.lval]<br>
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~<br>
<br>
Replace “incomplete” with “indefinite” in [conv.lval]p1, so that sizeless<br>
definite glvalues can be converted to prvalues.<br>
<br>
5.2.2 Function call [expr.call]<br>
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~<br>
<br>
Replace “completely-defined” with “definite” and “incomplete class type” with<br>
“indefinite type” in [expr.call]p4, so that parameters can have sizeless<br>
definite type.<br>
<br>
Replace “incomplete” with “indefinite” and “complete” with “definite” in<br>
[expr.call]p11, so that function call prvalues can have sizeless definite type.<br>
<br>
5.2.3 Explicit type conversion (function notation) [expr.type.conv]<br>
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~<br>
<br>
Replace “complete” with “definite” in [expr.type.conv]p2, so that ``T()``<br>
can be used for sizeless definite T.<br>
<br>
5.3.1 Unary operators [expr.unary.op]<br>
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~<br>
<br>
Replace “incomplete” with “indefinite” in [expr.unary.op]p1, so that a<br>
dereferenced pointer to a sizeless definite object can be converted to<br>
a prvalue.<br>
<br>
5.3.5 Delete [expr.delete]<br>
~~~~~~~~~~~~~~~~~~~~~~~~~~<br>
<br>
After the first sentence in [expr.delete]p2 (which describes converting an<br>
operand with class type to a pointer type), add:<br>
<br>
The type of the operand must now be a pointer to a sized type,<br>
otherwise the program is ill-formed.<br>
<br>
7.1.6.2 Simple type specifiers [dcl.type.simple]<br>
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~<br>
<br>
Replace “complete” with “definite” in [dcl.type.simple]p5, so that the special<br>
treatment for decltypes of function calls applies to indefinite rather<br>
than incomplete return types. This is for consistency with the change<br>
to [expr.call]p11 above.<br>
<br>
8.3.4 Arrays [dcl.array]<br>
~~~~~~~~~~~~~~~~~~~~~~~~<br>
<br>
In [dcl.array]p1, add “a sizeless type” to the list of things that array<br>
element type T cannot be.<br>
<br>
9.4.2 Static data members [class.static.data]<br>
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~<br>
<br>
Replace “an incomplete type” with “a sized indefinite type” in<br>
[class.static.data]p2, to avoid giving the impression that static data<br>
members can have sizeless type.<br>
<br>
Make this explicit by adding the following after [class.static.data]p7:<br>
<br>
* A static data member shall not have sizeless type.<br>
<br>
14.3.1 Template type parameters [temp.arg.type]<br>
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~<br>
<br>
Replace “incomplete” with “indefinite” in [temp.arg.type]p2, which notes that<br>
template type parameters need not be fully defined.<br>
<br>
14.7.1 Implicit instantiation [temp.inst]<br>
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~<br>
<br>
Replace “completely-defined object type” with “definite object type”<br>
in [temp.inst]p1 and [temp.inst]p6, so that the language edits do not affect<br>
the rules for implicit instantiation.<br>
<br>
17.6.4.8 Other functions [res.on.functions]<br>
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~<br>
<br>
Replace “incomplete” with “incomplete or indefinite” in [res.on.functions]p2,<br>
so that the library requires the rest of the program to honor the rules<br>
for both categories of type.<br>
<br>
20.10.4.3 Type properties [meta.unary.prop]<br>
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~<br>
<br>
Replace “complete” with “definite” in [meta.unary.prop]p3 and in the table<br>
that follows. This specifically includes ``is_destructible``; since sizeless<br>
definite types can have automatic storage duration, it must be possible<br>
to destroy them. The changes are redundant but harmless for cases in<br>
which the completeness rule applies only to class types.<br>
<br>
20.10.6 Relationships between types [meta.rel]<br>
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~<br>
<br>
Replace “complete” with “definite” in table 51.<br>
<br>
20.10.7.6 Other transformations [meta.trans.other]<br>
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~<br>
<br>
Replace “complete” with “definite” in table 57.<br>
<br>
Notes for Clang developers<br>
==========================<br>
<br>
Applying the extension to other cases<br>
-------------------------------------<br>
<br>
The summary and standard edits above describe how the sizeless type<br>
extension interacts with the core parts of the C and C++ standards.<br>
However, Clang supports many other extensions to the core languages,<br>
and will support new versions of the core languages as they evolve<br>
over time. It is therefore necessary to describe how sizeless types<br>
should interact with future extensions and language developments.<br>
<br>
The general principle is that we should keep using the<br>
distinction between incomplete types and complete types unless there is<br>
a specific known benefit to doing otherwise. Treating sizeless types as<br>
incomplete types should be the conservatively correct choice in almost<br>
all cases. We can later decide to relax specific rules to use the<br>
distinction between indefinite and definite types once we are sure<br>
that that is the right thing to do.<br>
<br>
Note that no decision needs to be made for any rules that are specific<br>
to complete or incomplete aggregates (arrays, structs, unions or classes),<br>
since aggregates are always sized.<br>
<br>
Rationale for this extension<br>
============================<br>
<br>
Requirements<br>
------------<br>
<br>
The main question that prompted this extension was: how do we add<br>
scalable vector types to the type system? The key requirements were:<br>
<br>
* The approach must work in both C and C++.<br>
<br>
* It must be possible to define automatic variables with these types.<br>
<br>
* It must be possible to pass and return objects of these types<br>
(since that is what intrinsics and vector library routines need to do).<br>
<br>
* It must be possible to use the types in ``_Generic`` associations<br>
(since the SVE ACLE uses ``_Generic`` to provide ``tgmath.h``\ -style<br>
overloads).<br>
<br>
* It must be possible to create pointers or references to the types<br>
(for passing or returning by pointer or reference, and because not<br>
allowing references would be semantically difficult in C++).<br>
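<br>
As a rough illustration of these requirements, the sketch below uses the<br>
SVE ACLE types and intrinsics from ``arm_sve.h``. The helper names<br>
(``add_f32``, ``add_f64``, ``VEC_ADD``, ``store_through`` and ``add_arrays``)<br>
are hypothetical, and the scalar fallback is an assumption added only so<br>
that the example also builds on targets without SVE:<br>
<br>
```c
#include <stddef.h>

#if defined(__ARM_FEATURE_SVE)
#include <arm_sve.h>

/* Sizeless vectors are passed and returned by value. */
static svfloat32_t add_f32(svbool_t pg, svfloat32_t x, svfloat32_t y) {
    return svadd_f32_x(pg, x, y);
}
static svfloat64_t add_f64(svbool_t pg, svfloat64_t x, svfloat64_t y) {
    return svadd_f64_x(pg, x, y);
}

/* Sizeless types can be named in _Generic associations. */
#define VEC_ADD(pg, x, y) \
    _Generic((x), svfloat32_t: add_f32, svfloat64_t: add_f64)(pg, x, y)

/* Pointers to sizeless types are allowed. */
static void store_through(svfloat32_t *out, svfloat32_t v) { *out = v; }

void add_arrays(float *dst, const float *a, const float *b, size_t n) {
    for (size_t i = 0; i < n; i += svcntw()) {
        svbool_t pg = svwhilelt_b32_u64(i, n);  /* lanes still in range */
        svfloat32_t va = svld1_f32(pg, a + i);  /* automatic variables */
        svfloat32_t vb = svld1_f32(pg, b + i);
        svfloat32_t sum;
        store_through(&sum, VEC_ADD(pg, va, vb));
        svst1_f32(pg, dst + i, sum);
    }
}
#else
/* Scalar fallback for targets without SVE (illustration only). */
void add_arrays(float *dst, const float *a, const float *b, size_t n) {
    for (size_t i = 0; i < n; ++i)
        dst[i] = a[i] + b[i];
}
#endif
```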
<br>
Possible approaches<br>
-------------------<br>
<br>
Any approach to defining scalable types would fall into one of three<br>
categories:<br>
<br>
(1) Limit the types in such a way that there is no concept of size.<br>
<br>
(2) Define the size of the types to be variable.<br>
<br>
(3) Define the size of the types to be constant, either with the<br>
constant being large enough for all possible vector lengths or<br>
with the types pointing to separate memory (as for C++ classes<br>
like ``std::string``).<br>
<br>
\ (2) seemed initially appealing since C already has the concept of<br>
variable-length arrays. However, variable-length built-in types<br>
would work in a significantly different way. Arrays often decay to<br>
pointers (which of course are fixed-length types), whereas vector<br>
types never would. Unlike arrays, it should be possible to pass<br>
variable-length vectors to functions, return them from functions,<br>
and assign them by value.<br>
<br>
One particular difficulty is that the semantics of variable-length arrays<br>
rely on having a point at which the array size is evaluated. It would<br>
be difficult to extend this approach to built-in types, or to declarations<br>
of functions that return variable-length types. It would also not be an<br>
accurate model of how an implementation actually behaves, since the<br>
implementation would not evaluate the vector lengths at these points and<br>
would not react to the results of the calculation.<br>
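<br>
For comparison, here is a minimal sketch of the variable-length array<br>
behavior described above, assuming a compiler that supports VLAs (that is,<br>
one that does not define ``__STDC_NO_VLA__``). The function names are<br>
hypothetical:<br>
<br>
```c
#include <stddef.h>

/* The array size is evaluated once, when the declaration is reached.
   n must be greater than zero. */
size_t vla_bytes(size_t n) {
    int a[n];
    n *= 2;           /* changing n afterwards does not resize the array */
    return sizeof a;  /* evaluated at run time for a VLA */
}

/* An array decays to a pointer, which is a fixed-length type. */
size_t decayed_bytes(size_t n) {
    int a[n];
    int *p = a;
    return sizeof p;
}
```
<br>
A sizeless vector type has no equivalent of the declaration point at which<br>
``n`` is read, and never decays to a pointer.<br>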
<br>
As well as the extension itself being relatively complex (especially<br>
for C++), it might be difficult to define it in a way that interacts<br>
naturally with other extensions. Also, variable-length arrays were added<br>
to an early draft of C++14 but were later removed as too controversial,<br>
so they are not part of the final standard. C++17 still requires ``sizeof``<br>
to be constant and C11 makes variable-length arrays optional.<br>
<br>
\ (2) therefore felt like a complicated dead-end.<br>
<br>
\ (3) can be divided into two parts:<br>
<br>
a) The vector types have a constant size and are large enough for all<br>
possible vector lengths.<br>
<br>
The main problem with this approach is that the maximum SVE vector<br>
length of 2048 bits is much larger than the minimum of 128 bits. Using<br>
a fixed size of 2048 bits would be extremely inefficient for smaller<br>
vector lengths, and of course the whole point of using vectors is to<br>
make things *more* efficient.<br>
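<br>
The cost can be made concrete with a small sketch. ``fixed_max_vec`` and<br>
``wasted_bytes`` are hypothetical names used only for this illustration;<br>
they are not part of any proposal:<br>
<br>
```c
#include <stddef.h>

enum { SVE_MIN_BITS = 128, SVE_MAX_BITS = 2048 };

/* Hypothetical fixed-size container sized for the largest vector. */
typedef struct {
    unsigned char bytes[SVE_MAX_BITS / 8];
} fixed_max_vec;

/* Bytes of the container that hold no data for a given vector length. */
size_t wasted_bytes(unsigned vl_bits) {
    return sizeof(fixed_max_vec) - vl_bits / 8;
}
```
<br>
With the minimum vector length of 128 bits, only 16 of the 256 bytes would<br>
hold data, so 240 bytes per object would be padding.<br>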
<br>
Also, we would need to define the types such that only the bytes<br>
associated with the actual vector length are significant. This would<br>
make it possible to pass or return the types in registers and treat<br>
them as register values when copying. This perhaps has some similarity<br>
with overaligned structures such as:<br>
<br>
.. code-block:: c<br>
<br>
struct s { _Alignas(16) int i; };<br>
<br>
except that the amount of padding is only known at runtime.<br>
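<br>
For the overaligned structure itself, the padding is a compile-time<br>
constant, as the following sketch shows. The accessor names are<br>
hypothetical:<br>
<br>
```c
#include <stddef.h>

struct s { _Alignas(16) int i; };

/* The size of a struct is a multiple of its alignment, so the
   structure is padded out to 16 bytes. */
_Static_assert(sizeof(struct s) == 16, "padded to the alignment");
_Static_assert(_Alignof(struct s) == 16, "overaligned");

size_t s_size(void)    { return sizeof(struct s); }
size_t s_padding(void) { return sizeof(struct s) - sizeof(int); }
```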
<br>
There is also a significant conceptual problem: encoding a fixed size<br>
goes against the guiding principle of SVE, in which there is no preferred<br>
vector length. There is nothing particularly magical about the current<br>
limit of 2048 bits and it would be better to avoid an ABI break if the<br>
maximum ever did increase in future.<br>
<br>
b) The vector types have a constant size and refer to separate storage<br>
(as for C++ classes like ``std::string``).<br>
<br>
This would be difficult to do without C++-style constructor, destructor,<br>
copy and move semantics, so would not work well in C. And in C++ it would<br>
be less efficient than the other approaches, since presumably an allocator<br>
would be needed to allocate the separate storage. It would be difficult<br>
to map this kind of type to a self-contained register-based ABI type.<br>
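<br>
A minimal sketch of what (b) would look like in C follows. The<br>
``heap_vec`` type and its functions are hypothetical and exist only to<br>
show the problem: plain assignment copies the handle rather than the<br>
data, so every copy and every release has to be written out by hand:<br>
<br>
```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical constant-size handle that refers to separate storage. */
typedef struct {
    float *data;
    size_t len;
} heap_vec;

/* Allocate zeroed storage; returns 1 on success and 0 on failure. */
int heap_vec_init(heap_vec *v, size_t len) {
    v->data = calloc(len, sizeof *v->data);
    v->len = v->data ? len : 0;
    return v->data != NULL;
}

/* Deep copy: needs its own allocation, unlike a register move. */
int heap_vec_copy(heap_vec *dst, const heap_vec *src) {
    if (!heap_vec_init(dst, src->len))
        return 0;
    memcpy(dst->data, src->data, src->len * sizeof *src->data);
    return 1;
}

/* Manual cleanup, since C has no destructors. */
void heap_vec_free(heap_vec *v) {
    free(v->data);
    v->data = NULL;
    v->len = 0;
}
```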
<br>
These are all negative arguments for (1): they show why the alternatives fall short.<br>
A more positive justification is that (1) seems to meet the requirements<br>
in the most efficient way possible. The vectors can use their natural<br>
(native) representation, and the type system prevents uses that would<br>
make that representation problematic.<br>
<br>
Also, the approach of starting with very restricted types and then<br>
specifically allowing certain things should be more future-proof<br>
and interact better with other (as yet unseen) language extensions. By default,<br>
any language extension would treat the new types like other incomplete<br>
types and choose conservatively-correct behavior. It would then be<br>
possible to relax the rules if this default behavior turns out to be<br>
too restrictive.<br>
<br>
(That said, treating the types as permanently incomplete will<br>
not avoid all clashes with other extensions. For example, we need to<br>
allow objects of automatic storage duration to have certain forms of<br>
incomplete type, whereas an extension might implicitly assume that all<br>
such objects must already have complete type. The approach should still<br>
avoid the worst effects though.)<br>
</div>
</span></font></div>
</div>
</body>
</html>