[cfe-dev] Matrix Support in Clang

Wed Mar 25 13:39:46 PDT 2020

On 25 Feb 2020, at 16:02, Florian Hahn via cfe-dev wrote:

> Hi,
>
> We updated the proposal to use operators for most operations (with a 
> lot of help from Michael Spencer!).I’ve added the updated section 
> inline and the complete updated proposal can be found at the end of 
> the mail.
>
>
>> On Jan 18, 2020, at 12:01, Florian Hahn via cfe-dev 
>> <cfe-dev at lists.llvm.org> wrote:
>>
>> Thanks for the feedback! I’ve responded inline.
>>
>>> On 16 Jan 2020, at 23:06, Richard Smith <richard at metafoo.co.uk 
>>> <mailto:richard at metafoo.co.uk>> wrote:
>>>
>>> Do you anticipate providing builtin operators for matrices? If not, 
>>> then the utility of a dedicated type and `matrix_type` attribute 
>>> seems greatly diminished: the builtin matrix operators could instead 
>>> -- in principle -- operate on a suitable vector type (either as a 
>>> flat vector, matching the LLVM IR model, or as a vector of vectors, 
>>> to support two-dimensional indexing). I think your proposal should 
>>> express why those would be inferior choices (eg, do matrix types 
>>> have different calling conventions, alignment requirements, ... on 
>>> some target? Do you intend to provide matrix x matrix multiplication 
>>> and matrix x vector multiplication via the * operator in the 
>>> future?). Adding *only* builtin functions and no new matrix types 
>>> would be a substantial simplification in the proposal.
>>
>> I think it would make sense to provide builtin operators instead of 
>> the proposed builtins for math operations. Same for element 
>> insertion/extraction. However I am not sure how to provide the 
>> strided matrix load/store as operators. Would it be OK to just have 
>> builtins for those? The reason we went for builtins initially was 
>> that we thought that might make the proposal a bit more lightweight, 
>> but it sounds like builtin operators would be preferred with the 
>> type.
>>
>> I do not think ext_vector_type would be suitable for our proposal, as 
>> it matches LLVM’s vector alignment and the matrix type should match 
>> the alignment of the underlying data type, to allow easy interaction 
>> with existing matrix libraries.
>>
>> A vector of vectors should work in principle, as long as we could fix 
>> both dimensions on a type level. Not having the dimensions guaranteed 
>> by the type would have a negative impact on the user-experience I 
>> think, as we would, for example, loose the ability to type-check if 
>> the dimensions match the operators and users would have to provide 
>> the dimensions for certain operations. Also, it would make supporting 
>> 3+ dimensions a bit more tricky.
>
>
> Below is a proposal for matrix type element access & binary operators:
>
> Matrix Type Element Access Operator
>
> An expression of the form `postfix-expression 
> [expression][expression]` where the postfix-expression is of matrix 
> type is a matrix element access expression. expression shall not be a 
> comma expression, and shall be a prvalue of unscoped enumeration or 
> integral type. Given the expression E1[E2][E3], the result is an 
> lvalue of the same type as the underlying element type of the matrix 
> that refers to the value at E2 row and E3 column in the matrix. The 
> expression E1 is sequenced before E2 and E3. The expressions E2 and E3 
> are unsequenced.
>
> Note: We thought about providing an expression of the form  
> `postfix-expression [expression]` to access columns of a matrix. We 
> think that such an expression would be problematic once both column 
> and row major matrixes are supported: depending on the memory layout, 
> either accessing columns or rows can be done efficiently, but not 
> both. Instead, we propose to provide builtins to extract rows and 
> columns from a matrix. This makes the operations more explicit.
>
>
> Matrix Type Binary Operators
>
> Each matrix type supports the following binary operators: +, -, *, /, 
> and %. The * operator provides matrix multiplication, while +, -, /, 
> and % are performed element-wise. There are also scalar versions of 
> the operators, which take a matrix type and the underlying element 
> type. The operation is applied to all elements of the matrix using the 
> scalar value.
>
> The operands of +, -, *, and / shall have either matrix type, 
> arithmetic or unscoped enumeration type. The operands of % shall have 
> either matrix type with an element type of integral type, integral 
> type or unscoped enumeration type. At least one of the operands shall 
> be of matrix type.
>
> For BIN_OP in +, -, *, /, and %, given the expression M1 BIN_OP M2 
> where, for *, one of M1 or M2 is of arithmetic type:
>
> * The usual arithmetic conversions are applied to M1 and M2. [ Note: 
> if M1 or M2 are of arithmetic type, they are broadcast to matrices 
> here. — end note ]
>
> * The matrix types of M1 and M2 shall have the same number of rows and 
> columns.
>
> * The result is equivalent to Res in the following where col is the 
> number of columns and row is the number of rows in the matrix type:
>
> decltype(M1) Res;
> for (int C = 0; C < col; ++C)
>   for (int R = 0; R < row; ++R)
>     Res[R][C] = M1[R][C] BIN_OP M2[R][C];
>
>
> Given the expression M1 * M2 where M1 and M2 are of matrix type:
>
>  * The usual arithmetic conversions are applied to M1 and M2
>
>  * The type of M1 shall have the same number of columns as the type of 
> M2 has rows.
>
>  * The resulting type, MTy, is the result of applying the usual 
> arithmetic conversions to M1 and M2, but with the same number of rows 
> as M1’s matrix type and the same number of columns as M2’s matrix 
> type.
>
>  * The result is equivalent to Res in the following where col is the 
> number of columns and row is the number of rows in MTy:
>
> MTy Res;
> for (int C = 0; C < col; ++C) {
>   for (int R = 0; R < row; ++R) {
>     EltTy Elt = 0;
>     for (int K = 0; K < inner; ++K) {
>       Elt += M1[R][K] * M2[K][C];
>   }
>   Res[R][C] = Elt;
> }
>
>
> All operations on matrix types match the behavior of the underlying 
> element type with respect to signed overflows.
>
> With respect to rounding errors, the the * operator preserves the 
> behavior of the separate multiply and add operations by default. We 
> propose to provide a Clang option to override this behavior and allow 
> contraction of those operations (e.g. -ffp-contract=matrix).
>
>
>
>>> Have the Eigen developers indicated they would consider using this 
>>> if it were available to them?
>>> Have you reached out to the GCC developers to see if they would also 
>>> be likely to support this extension?
>>> We should be aiming to build critical mass behind this feature so 
>>> that it gets adopted; it would be a waste of resources if a 
>>> different technology ends up being adopted in this space and we're 
>>> left maintaining a system that no-one outside Apple uses.
>>
>> We hoped to get some initial feedback before reaching out, to make 
>> sure the proposal is in reasonably good shape. I plan to reach out to 
>> them early next week.
>
>
> I reached out on the Eigen mailing list and got an encouraging 
> response: 
> https://listengine.tuxfamily.org/lists.tuxfamily.org/eigen/2020/02/msg00000.html 
> <https://listengine.tuxfamily.org/lists.tuxfamily.org/eigen/2020/02/msg00000.html>
> I’ve also added people involved in GCC to the WIP patch adding the 
> type.
>
>
> Cheers,
> Florian
>
>
>
> Draft Specification
>
> Matrix Type Attribute
>
> The attribute-token matrix_type is used to declare a matrix type. It 
> shall appear at most once in each attribute-list. The attribute shall 
> only appertain to a typedef-name of a typedef of a non-volatile type 
> that is a signed integer type, an unsigned integer type, or a 
> floating-point type.

You should require the element type to be unqualified and then apply 
qualifiers “outside” of the matrix type.  That is, you can have a 
`const float4x4`, but not a `const_float4x4`.

> An attribute-argument-clause must be present and it shall have the 
> form:
>
> (constant-expression, constant-expression)
>
> Both constant-expressions shall be a positive non-zero integral 
> constant expressions. The maximum of the product of the constants is 
> implementation defined. If that implementation defined limit is 
> exceeded, the program is ill-formed.
>
> An attribute of the form matrix_type(R, C) forms a matrix type with an 
> element type of the cv-qualified type the attribute appertains to and 
> R rows and C columns.
>
> If a declaration of a typedef-name has a matrix_type attribute, then 
> all declaration of that typedef-name shall have a matrix_type 
> attribute with the same element type, number of rows, and number of 
> columns.
>
>
> Matrix Type
>
> A matrix type has an underlying element type, a constant number of 
> rows, and a constant number of columns. Matrix types with the same 
> element type, rows, and columns are the same type. A value of a matrix 
> type contains rows * columns values of the element type laid out in 
> column-major order without padding in a way compatible with an array 
> of at least that many elements of the underlying element type.
>
> A matrix type is a scalar type with the same alignment as its 
> underlying element type,

Why do you want to guarantee the same alignment as the element type?  
Isn’t that going to be seriously performance-limiting for most matrix 
types?

If your goal is to make it easy to load / store matrices from an 
existing buffer, I would recommend adding separate operations to do that 
instead of (I assume) expecting users to simply cast those buffers to a 
pointer-to-matrix type and dereference.  That will also allow you to use 
non-packed representations for matrices.  Furthermore, you can 
parameterize the operation by the row/column-major-ness of the buffer 
and then simply use a canonical representation for matrix types.

> but objects of matrix type are not usable in constant expressions.

This is contrary to the prevailing language directions of both C and 
C++.  It might be an acceptable temporary language restriction, but 
it’s not something you should specify.

> TODO: Allow reinterpret_cast from pointer to element type. Make 
> aliasing work.

This seems like an anti-goal.  Maybe you want a `reinterpret_cast` 
between matrix types with the same total storage size, though?  Not sure 
if that’s a good idea.

> TODO: Does it make sense to allow M::element_type, M::rows, and 
> M::columns where M is a matrix type? We don’t support this anywhere 
> else, but it’s convenient. The alternative is using template 
> deduction to extract this information.

C will need a spelling for these, too.

> Future Work: Initialization syntax.
> Future Work: Conversions between matrix types with const qualified and 
> unqualified element types.

Should be defined away by handling qualifiers correctly, per above.

> Standard Conversions
>
> The standard conversions are extended as follows.
>
> For integral promotions, floating-point promotion, integral 
> conversions, floating-point conversions, and floating-integral 
> conversions: apply the rules to the underlying type of the matrix 
> type. The resulting type is a matrix type with that underlying element 
> type. The resulting value is as follows:
>
>  * If the original value was of matrix type, each element is converted 
> element by element.
>  * If the original value was not of matrix type, each element takes 
> the value of the original value.

This isn’t sufficient; you need to describe what happens if the 
element counts don’t match.  I assume this is ill-formed.

> Arithmetic Conversions
>
> The usual arithmetic conversions are extended as follows.
>
> Insert at the start:
>
>  * If either operand is of matrix type, apply the usual arithmetic 
> conversions using its underlying element type. The resulting type is a 
> matrix type with that underlying element type.

This seems like a mistake, mostly because of the integer promotions.  
You really want `short_matrix + short_matrix` to yield an `int_matrix`, 
or `-short_matrix` to be an `int_matrix`?  I would recommend that you 
specify this as working without integer promotion, so that the result of 
an operation two integer operations just has (1) the greater rank and 
(2) is unsigned if there’s a signedness mismatch and the signed type 
can’t express the full range of the unsigned type.

You also need to specify what happens if you pass a matrix as a variadic 
argument.

> Matrix Type Element Access Operator
>
> An expression of the form `postfix-expression 
> [expression][expression]` where the postfix-expression is of matrix 
> type is a matrix element access expression. expression shall not be a 
> comma expression, and shall be a prvalue of unscoped enumeration or 
> integral type. Given the expression E1[E2][E3], the result is an 
> lvalue of the same type as the underlying element type of the matrix 
> that refers to the value at E2 row and E3 column in the matrix. The 
> expression E1 is sequenced before E2 and E3. The expressions E2 and E3 
> are unsequenced.
>
> Note: We thought about providing an expression of the form  
> `postfix-expression [expression]` to access columns of a matrix. We 
> think that such an expression would be problematic once both column 
> and row major matrixes are supported: depending on the memory layout, 
> either accessing columns or rows can be done efficiently, but not 
> both. Instead, we propose to provide builtins to extract rows and 
> columns from a matrix. This makes the operations more explicit.

Okay, so what happens if you *do* write `matrix[0]`?

John.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-dev/attachments/20200325/f3df9a10/attachment-0001.html>