<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
</head>
<body>
<p><font size="-1">Chris, As I have been following this thread, it
looks like you could get close to a win by keeping VectorType
with your additional changes. Whereas a firm position against
VectorType tends to make a win in doubt. Your changes could sit
there for some time.</font></p>
<p><font size="-1">The question does not appear to be that your
changes are not ideal in an eventual sense, but that going from
where we are now to that eventual condition needs to be done
carefully, done incrementally. The solution appears to be how to
get agreement on an incremental path. How to consolidate what
gain you can with a first step and then seeing what can be done
next.</font></p>
<p><font size="-1">Neil Nelson<br>
</font></p>
<div class="moz-cite-prefix"><font size="-1">On 5/21/20 2:01 PM,
Chris Tetreault via llvm-dev wrote:</font><br>
</div>
<blockquote type="cite"
cite="mid:BYAPR02MB4551A27BD883D2204CDD8D8ADAB70@BYAPR02MB4551.namprd02.prod.outlook.com">
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
<meta name="Generator" content="Microsoft Word 15 (filtered
medium)">
<style><!--
/* Font Definitions */
@font-face
{font-family:"Cambria Math";
panose-1:2 4 5 3 5 4 6 3 2 4;}
@font-face
{font-family:Calibri;
panose-1:2 15 5 2 2 2 4 3 2 4;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
{margin:0in;
margin-bottom:.0001pt;
font-size:11.0pt;
font-family:"Calibri",sans-serif;}
a:link, span.MsoHyperlink
{mso-style-priority:99;
color:blue;
text-decoration:underline;}
a:visited, span.MsoHyperlinkFollowed
{mso-style-priority:99;
color:purple;
text-decoration:underline;}
code
{mso-style-priority:99;
font-family:"Courier New";}
p.msonormal0, li.msonormal0, div.msonormal0
{mso-style-name:msonormal;
mso-margin-top-alt:auto;
margin-right:0in;
mso-margin-bottom-alt:auto;
margin-left:0in;
font-size:11.0pt;
font-family:"Calibri",sans-serif;}
span.EmailStyle21
{mso-style-type:personal;
font-family:"Calibri",sans-serif;
color:windowtext;}
span.EmailStyle22
{mso-style-type:personal-compose;
font-family:"Calibri",sans-serif;
color:windowtext;}
.MsoChpDefault
{mso-style-type:export-only;
font-size:10.0pt;}
@page WordSection1
{size:8.5in 11.0in;
margin:1.0in 1.0in 1.0in 1.0in;}
div.WordSection1
{page:WordSection1;}
--></style><!--[if gte mso 9]><xml>
<o:shapedefaults v:ext="edit" spidmax="1026" />
</xml><![endif]--><!--[if gte mso 9]><xml>
<o:shapelayout v:ext="edit">
<o:idmap v:ext="edit" data="1" />
</o:shapelayout></xml><![endif]-->
<div class="WordSection1">
<p class="MsoNormal">Hi John,<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal"> I’d like to address some points in your
message.<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">> Practically speaking, this is going to
break every out-of-tree frontend, backend, or optimization
pass that supports SIMD types.<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">My understanding is that the policy in LLVM
development is that we do not let considerations for
downstream and out-of-tree codebases affect the pace of
development. The C++ API is explicitly unstable. I maintain a
downstream fork of LLVM myself, so I know the pain that this
is causing because I get to fix all the issues in my internal
codebase. However, the fix for downstream codebases is very
simple: Just find all the places where it says VectorType, and
change it to say FixedVectorType.<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">> … by having the VectorType type
semantically repurposed out from under them.<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">The documented semantics of VectorType
prior to my RFC were that it is a generalization of all vector
types. The VectorType contains an ElementCount, which is a
pair of (bool, unsigned). If the bool is true, then the return
value of getNumElements() is the minimum number of vector
elements. If the bool is false, then it is the actual number
of elements. My RFC has not changed these semantics. It will
eventually delete a function that has been pervasively misused
throughout the codebase, but the semantics remain the same.
You are proposing a semantic change to VectorType to have it
specifically be a fixed width vector.<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">> … a particular largely-vendor-specific
extension …<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">All SIMD vectors are vendor specific
extensions. Just because most of the most popular
architectures have them does not make this not true. AArch64
and RISC-V have scalable vectors, so it is not just one
architecture. It is the responsibility of all developers to
ensure that they use the types correctly. It would be nice if
the obvious thing to do is the correct thing to do.<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">> … it’s much better for code that does
support both to explicitly opt in by checking for and handling
the more general type …<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">This is how it will work. I am in the
process of fixing up call sites that make fixed width
assumptions so that they use FixedVectorType.<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal"> I think that it is important to ensure
that things have clear sensible names, and to clean up
historical baggage when the opportunity presents. The
advantage of an unstable API is that you are not prevented
from changing this sort of thing by external entities. Now is
the time to fix the names of the vector types; a new type of
vector exists, and we have the choice to change the names to
reflect the new reality or to accumulate some technical debt.
If we wait to address this issue later, there will just be
more code that needs to be addressed. The refactor is fairly
easy right now because pretty much everything is making fixed
width assumptions. The changes are all fairly mechanical. If
we wait until scalable vectors are well supported throughout
the codebase, the change will just be that much harder.<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">Thanks,<o:p></o:p></p>
<p class="MsoNormal"> Christopher Tetreault<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<div>
<div style="border:none;border-top:solid #E1E1E1
1.0pt;padding:3.0pt 0in 0in 0in">
<p class="MsoNormal"><b>From:</b> John McCall
<a class="moz-txt-link-rfc2396E" href="mailto:rjmccall@apple.com"><rjmccall@apple.com></a> <br>
<b>Sent:</b> Thursday, May 21, 2020 11:15 AM<br>
<b>To:</b> Chris Tetreault <a class="moz-txt-link-rfc2396E" href="mailto:ctetreau@quicinc.com"><ctetreau@quicinc.com></a><br>
<b>Cc:</b> <a class="moz-txt-link-abbreviated" href="mailto:llvm-dev@lists.llvm.org">llvm-dev@lists.llvm.org</a><br>
<b>Subject:</b> [EXT] Re: [llvm-dev] [RFC] Refactor class
hierarchy of VectorType in the IR<o:p></o:p></p>
</div>
</div>
<p class="MsoNormal"><o:p> </o:p></p>
<div>
<div>
<p><span style="font-family:"Arial",sans-serif">On
9 Mar 2020, at 15:05, Chris Tetreault via llvm-dev
wrote:<o:p></o:p></span></p>
</div>
<div>
<blockquote style="border:none;border-left:solid #777777
1.5pt;padding:0in 0in 0in
4.0pt;margin-left:0in;margin-right:0in;margin-bottom:3.75pt">
<p><span
style="font-family:"Arial",sans-serif;color:#777777">Hi,<br>
I am helping with the effort to implement scalable
vectors in the codebase in order to add support for
generating SVE code in the Arm backend. I would like
to propose a refactor of the Type class hierarchy in
order to eliminate issues related to the misuse of
SequentialType::getNumElements(). I would like to
introduce a new class FixedVectorType that inherits
from SequentialType and VectorType. VectorType would
no longer inherit from SequentialType, instead
directly inheriting from Type. After this change, it
will be statically impossible to accidentally call
SequentialType::getNumElements() via a VectorType
pointer.<o:p></o:p></span></p>
</blockquote>
</div>
<div>
<p><span style="font-family:"Arial",sans-serif">I’m
sorry that I missed this thread when you posted it. I’m
very much in favor of changing the type hierarchy to
statically distinguish fixed from scalable vector types,
but I think that making VectorType the common base type
is unnecessarily disruptive. Practically speaking, this
is going to break every out-of-tree frontend, backend,
or optimization pass that supports SIMD types.
Relatively little LLVM code will just naturally support
scalable vector types without any adjustment. Following
the principle of iterative development, as well as just
good conservative coding practice, it’s much better for
code that does support both to explicitly opt in by
checking for and handling the more general type, rather
than being implicitly “volunteered” to support both by
having the </span><code><span
style="font-size:10.0pt;color:black;background:#F7F7F7">VectorType</span></code><span
style="font-family:"Arial",sans-serif"> type
semantically repurposed out from under them.<o:p></o:p></span></p>
<p><span style="font-family:"Arial",sans-serif">I
understand the argument that </span>
<code><span
style="font-size:10.0pt;color:black;background:#F7F7F7">VectorType</span></code><span
style="font-family:"Arial",sans-serif"> is a
better name for the abstract base type, but in this case
I don’t think that consideration justifies the
disruption for the vast majority of LLVM developers.
There are plenty of names you could give the abstract
base type that adequately express that’s a more general
type, and the historical baggage of
</span><code><span
style="font-size:10.0pt;color:black;background:#F7F7F7">VectorType</span></code><span
style="font-family:"Arial",sans-serif"> being
slightly misleadingly named if you’re aware of a
particular largely-vendor-specific extension does not
seem overbearing.<o:p></o:p></span></p>
<p><span style="font-family:"Arial",sans-serif">John.<o:p></o:p></span></p>
</div>
<div>
<blockquote style="border:none;border-left:solid #777777
1.5pt;padding:0in 0in 0in
4.0pt;margin-left:0in;margin-right:0in;margin-bottom:3.75pt">
<p><span
style="font-family:"Arial",sans-serif;color:#777777">Background:<br>
Recently, scalable vectors have been introduced into
the codebase. Previously, vectors have been written
<n x ty> in IR, where n is a fixed number of
elements known at compile time, and ty is some type.
Scalable vectors are written <vscale x n x ty>
where vscale is a runtime constant value. A new
function has been added to VectorType (defined in
llvm/IR/DerivedTypes.h), getElementCount(), that
returns an ElementCount, which is defined as such in
llvm/Support/TypeSize.h:<br>
class ElementCount {<br>
public:<br>
unsigned Min;<br>
bool Scalable;<br>
...<br>
}<br>
Min is the minimum number of elements in the vector
(the "n" in <vscale x n x ty>), and Scalable is
true if the vector is scalable (true for <vscale x
n x ty>, false for <n x ty>) The idea is that
if a vector is not scalable, then Min is exactly equal
to the number of vector elements, but if the vector is
scalable, then the number of vector elements is equal
to some runtime-constant multiple of Min. The key
takeaway here is that scalable vectors and fixed
length vectors need to be treated differently by the
compiler. For a fixed length vector, it is valid to
iterate over the vector elements, but this is
impossible for a scalable vector.<br>
Discussion:<br>
The trouble is that all instances of VectorType have
getNumElements() inherited from SequentialType. Prior
to the introduction of scalable vectors, this function
was guaranteed to return the number of elements in a
vector or array. Today, there is a comment that
documents the fact that this returns only the minimum
number of elements for scalable vectors, however there
exists a ton of code in the codebase that is now
misusing getNumElements(). Some examples:<br>
Auto *V = VectorType::get(Ty,
SomeOtherVec->getNumElements());<br>
This code was previously perfectly fine but is
incorrect for scalable vectors. When scalable vectors
were introduced VectorType::get() was refactored to
take a bool to tell if the vector is scalable. This
bool has a default value of false. In this example,
get() is returning a non-scalable vector even if
SomeOtherVec was scalable. This will manifest later in
some unrelated code as a type mismatch between a
scalable and fixed length vector.<br>
for (unsigned I = 0; I <
SomeVec->getNumElements(); ++I) { ... }<br>
Previously, since there was no notion of scalable
vectors, this was perfectly reasonable code. However,
for scalable vectors, this is always a bug.<br>
With vigilance in code review, and good test coverage
we will eventually find and squash most of these bugs.
Unfortunately, code review is hard, and test coverage
isn't perfect. Bugs will continue to slip through as
long as it's easier to do the wrong thing.<br>
One other factor to consider, is that there is a great
deal of code which deals exclusively with fixed length
vectors. Any backend for which there are no scalable
vectors should not need to care about their existence.
Even in Arm, if Neon code is being generated, then the
vectors will never be scalable. In this code, the
current status quo is perfectly fine, and any code
related to checking if the vector is scalable is just
noise.<br>
Proposal:<br>
In order to support users who only need fixed width
vectors, and to ensure that nobody can accidentally
call getNumElements() on a scalable vector, I am
proposing the introduction of a new FixedVectorType
which inherits from both VectorType and
SequentialType. In turn, VectorType will no longer
inherit from SequentialType. An example of what this
will look like, with some misc. functions omitted for
clarity:<br>
class VectorType : public Type {<br>
public:<br>
static VectorType *get(Type *ElementType, ElementCount
EC);<br>
<br>
Type *getElementType() const;<br>
ElementCount getElementCount() const;<br>
bool isScalable() const;<br>
};<br>
<br>
class FixedVectorType : public VectorType, public
SequentialType {<br>
public:<br>
static FixedVectorType *get(Type *ElementType,
unsigned NumElts);<br>
};<br>
<br>
class SequentialType : public CompositeType {<br>
public:<br>
uint64_t getNumElements() const { return NumElements;
}<br>
};<br>
In this proposed architecture, VectorType does not
have a getNumElements() function because it does not
inherit from SequentialType. In generic code, users
will call VectorType::get() to obtain a new instance
of VectorType just as they always have. VectorType
implements the safe subset of functionality of fixed
and scalable vectors that is suitable for use
anywhere. If the user passes false to the scalable
parameter of get(), they will get an instance of
FixedVectorType back. Users can then inspect its type
and cast it to FixedVectorType using the usual
mechanisms. In code that deals exclusively in fixed
length vectors, the user can call
FixedVectorType::get() to directly get an instance of
FixedVectorType, and their code can remain largely
unchanged from how it was prior to the introduction of
scalable vectors. At this time, there exists no use
case that is only valid for scalable vectors, so no
ScalableVectorType is being added.<br>
With this change, in generic code it is now impossible
to accidentally call getNumElements() on a scalable
vector. If a user tries to pass a scalable vector to a
function that expects a fixed length vector, they will
encounter a compilation failure at the site of the
bug, rather than a runtime error in some unrelated
code. If a user attempts to cast a scalable vector to
FixedVectorType, the cast will fail at the call site.
This will make it easier to track down all the places
that are currently incorrect, and will prevent future
developers from introducing bugs by misusing
getNumElements().<br>
Outstanding design choice:<br>
One issue with this architecture as proposed is the
fact that SequentialType (by way of CompositeType)
inherits from Type. This introduces a diamond
inheritance in FixedVectorType. Unfortunately,
llvm::cast uses a c-style cast internally, so we
cannot use virtual inheritance to resolve this issue.
Thus, we have a few options:<br>
<br>
1. Break CompositeType's inheritance on Type and
introduce functions to convert from a Type to a
CompositeType and vice versa. The conversion from
CompositeType is always safe because all instances of
CompositeType (StructType, ArrayType, and
FixedVectorType) are instances of Type. A
CompositeType can be cast to the most derived class,
then back to Type. The other way is not always safe,
so a function will need to be added to check if a Type
is an instance of CompositeType. This change is not
that big, and I have a prototype implementation up at
<a href="https://reviews.llvm.org/D75486"
moz-do-not-send="true"><span style="color:#777777">https://reviews.llvm.org/D75486</span></a>
([SVE] Make CompositeType not inherit from Type)<br>
* Pros: this approach would result in minimal changes
to the codebase. If the llvm casts can be made to work
for the conversion functions, then it would touch very
few files.<br>
* Cons: There are those who think that CompositeType
adds little value and should be removed. Now would be
an ideal time to do this. Additionally, the conversion
functions would be more complicated if we left
CompositeType in.<br>
2. Remove CompositeType and break SequentialType's
inheritance of Type. Add functions to convert a
SequentialType to and from Type. The conversion
functions would work the same as those in option 1
above. Currently, there exists only one class that
derives directly from CompositeType: StructType. The
functionality of CompositeType can be directly moved
into StructType, and APIs that use CompositeType can
directly use Type and cast appropriately. We feel that
this would be a fairly simple change, and we have a
prototype implementation up at <a
href="https://reviews.llvm.org/D75660"
moz-do-not-send="true"><span style="color:#777777">https://reviews.llvm.org/D75660</span></a>
(Remove CompositeType class)<br>
* Pros: Removing CompositeType would simplify the type
hierarchy. Leaving SequentialType in would simplify
some code and be more typesafe than having a
getSequentialNumElements on Type.<br>
* Cons: The value of SequentialType has also been
called into question. If we wanted to remove it, now
would be a good time. Conversion functions add
complexity to the design. Introduces additional
casting from Type.<br>
3. Remove CompositeType and SequentialType. Roll the
functions directly into the most derived classes. A
helper function can be added to Type to handle
choosing from FixedVectorType and ArrayType and
calling getNumElements():<br>
static unsigned getSequentialNumElements() {<br>
assert(isSequentialType()); // This already exists and
does the<br>
// right thing<br>
if (auto *AT = dyn_cast<ArrayType>(this))<br>
return AT->getNumElements();<br>
return
cast<FixedVectorType>(this)->getNumElements();<br>
}<br>
A prototype implementation of this strategy can be
found at <a href="https://reviews.llvm.org/D75661"
moz-do-not-send="true">
<span style="color:#777777">https://reviews.llvm.org/D75661</span></a>
(Remove SequentialType from the type heirarchy.)<br>
<br>
* Pros: By removing the multiple inheritance
completely, we greatly simplify the design and
eliminate the need for any conversion functions. The
value of CompositeType and SequentialType has been
called into question, and removing them now might be
of benefit to the codebase<br>
* Cons: getSequentialNumElements() has similar issues
to those that we are trying to solve in the first
place and potentially subverts the whole design.
Omitting getSequentialNumElements() would add lots of
code duplication. Introduces additional casting from
Type.<br>
I believe that all three of these options are
reasonable. My personal preference is currently option
2. I think that option 3's getSequentialNumElements()
subverts the design because every Type has
getSequentialNumElements(), it is tempting to just
call it. However, the cast will fail at the call site
in debug, and in release it will return a garbage
value rather than a value that works most of the time.
For option 1, the existence of CompositeType
complicates the conversion logic for little benefit.<br>
Conclusion:<br>
Thank you for your time in reviewing this RFC. Your
feedback on my work is greatly appreciated.<br>
<br>
Thank you,<br>
Christopher Tetreault<o:p></o:p></span></p>
</blockquote>
<p class="MsoNormal"><span
style="font-family:"Arial",sans-serif"><o:p> </o:p></span></p>
<blockquote style="border:none;border-left:solid #777777
1.5pt;padding:0in 0in 0in
4.0pt;margin-left:0in;margin-right:0in;margin-bottom:3.75pt">
<p><span
style="font-family:"Arial",sans-serif;color:#777777">_______________________________________________<br>
LLVM Developers mailing list<br>
<a href="mailto:llvm-dev@lists.llvm.org"
moz-do-not-send="true">llvm-dev@lists.llvm.org</a><br>
<a
href="https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev"
moz-do-not-send="true"><span style="color:#777777">https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev</span></a><o:p></o:p></span></p>
</blockquote>
</div>
</div>
</div>
<br>
<fieldset class="mimeAttachmentHeader"></fieldset>
<pre class="moz-quote-pre" wrap="">_______________________________________________
LLVM Developers mailing list
<a class="moz-txt-link-abbreviated" href="mailto:llvm-dev@lists.llvm.org">llvm-dev@lists.llvm.org</a>
<a class="moz-txt-link-freetext" href="https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev">https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev</a>
</pre>
</blockquote>
</body>
</html>