[llvm-dev] [RFC][SVE] Supporting Scalable Vector Architectures in LLVM IR (take 2)
Graham Hunter via llvm-dev
llvm-dev at lists.llvm.org
Fri Jul 7 05:56:59 PDT 2017
Yes, OpenMP simd pragmas work quite well with SVE, and there's a work-in-progress vector ABI for 'declare simd' pragmas to vectorize whole functions.
We also have a tentative ACLE specification, which allows SVE to be used via C-level intrinsics. The specification is here: http://infocenter.arm.com/help/topic/com.arm.doc.ecm0665619/acle_sve_100987_0000_00_en.pdf
There are a couple of examples here: http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.100891_0607_00_en/wap1490203634804.html
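As a minimal sketch of the OpenMP route described above: the function names scale_add and saxpy are made up for illustration and do not come from the ACLE document or the vector ABI draft. Neither pragma names a vector width, which is why the same source can suit a scalable-vector target. Without OpenMP enabled, the pragmas are ignored and the code runs as plain scalar C.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical example function. With OpenMP enabled, 'declare simd' asks the
   compiler to also emit vector variants that take and return vector values. */
#pragma omp declare simd notinbranch
float scale_add(float x, float y, float a)
{
    return a * x + y;
}

/* Computes y[i] = a * x[i] + y[i]. The 'simd' pragma tells the compiler the
   loop may be vectorized; no vector length appears in the source. */
void saxpy(size_t n, float a, const float *x, float *y)
{
    /* Precondition: non-empty input needs valid pointers. */
    assert(n == 0 || (x != NULL && y != NULL));

    #pragma omp simd
    for (size_t i = 0; i < n; ++i)
        y[i] = scale_add(x[i], y[i], a);
}
```

Whether the call inside the loop is actually replaced by a vector variant depends on the compiler and on the vector ABI it implements for the target.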
> On 7 Jul 2017, at 06:56, Сергей Прейс via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> Shouldn't OpenMP SIMD be a natural fit for SVE? If so, special attention should be paid to SIMD functions (#pragma omp declare simd), which allow passing and returning vector values.
> 07.07.2017, 10:02, "Eric Christopher via llvm-dev" <llvm-dev at lists.llvm.org>:
>> Hi Amara,
>> I was wondering if you have a link to a suggested programming model in mind for SVE and how it'll interact with other vector operations?
>> On Thu, Jul 6, 2017 at 3:53 PM Amara Emerson via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>>> On 6 July 2017 at 23:13, Chris Lattner <clattner at nondot.org> wrote:
>>>>> Yes, as an extension to VectorType they can be manipulated and passed
>>>>> around like normal vectors, load/stored directly, phis, put in llvm
>>>>> structs etc. Address computation generates expressions in terms of vscale
>>>>> and it seems to work well.
>>>> Right, that works out through composition, but what does it mean? I can't have a global variable of a scalable vector type, nor does it make sense for a scalable vector to be embeddable in an LLVM IR struct: nothing that measures the size of a struct is prepared to deal with a non-constant answer.
>>> Although the absolute sizes of the types aren't known at compile time,
>>> there are upper bounds which the compiler can assume and use to allow
>>> allocation of storage for global variables and the like. The issue
>>> with composite type sizes again reduces to the issue of type sizes
>>> being either symbolic expressions or simply unknown in some cases.
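To illustrate the upper-bound argument, here is a small sketch. It assumes the SVE architectural limits (vector length is a multiple of 128 bits, up to 2048 bits), so the runtime multiplier is at most 16. The helper names are hypothetical and are not LLVM API.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: SVE vectors are at most 2048 bits, i.e. 16 granules of 128 bits. */
enum { SVE_MAX_VSCALE = 16 };

/* Bytes actually occupied at run time by a scalable vector whose size at
   vscale == 1 is min_bytes. */
size_t scalable_size_bytes(size_t min_bytes, size_t vscale)
{
    assert(vscale >= 1 && vscale <= SVE_MAX_VSCALE);
    return min_bytes * vscale;
}

/* Worst-case size: enough to reserve storage without knowing vscale. */
size_t scalable_upper_bound_bytes(size_t min_bytes)
{
    return min_bytes * SVE_MAX_VSCALE;
}
```

For a scalable vector with a minimum of four 32-bit elements (16 bytes), the bound is 256 bytes, whatever the hardware vector length turns out to be.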
>>>>>> This should probably be an intrinsic, not an llvm::Constant. The design of llvm::Constant is already wrong: it shouldn’t have operations like divide, and it would be better to not contribute to the problem.
>>>>> Could you explain your position more on this? The Constant
>>>>> architecture has been a very natural fit for this concept from our
>>>> It is appealing, but it is wrong. Constant should really only model primitive constants (ConstantInt/FP, etc) and we should have one more form for “relocatable” constants. Instead, we have intertwined constant folding and ConstantExpr logic that doesn’t make sense.
>>>> A better pattern to follow is that of intrinsics such as llvm.coro.size.i32(), which always returns a constant value.
>>> Ok, we'll investigate this issue further.
>>>>>> Ok, that sounds complicated, but can surely be made to work. The bigger problem is that there are various LLVM IR transformations that want to put registers into memory. All of these will be broken with this sort of type.
>>>>> Could you give an example?
>>>> The concept of “reg2mem” is to put SSA values into allocas for passes that can’t (or don’t want to) update SSA. Similarly, function body extraction can turn SSA values into parameters, and depending on the implementation can pack them into structs. The coroutine logic similarly needs to store registers if they cross suspend points; there are multiple other examples.
>>> I think this should still work. Allocas of scalable vectors are supported,
>>> and it's only later at codegen that the unknown sizes result in more
>>> work being needed to compute stack offsets correctly. The caveat being
>>> that a direct call to something like getTypeStoreSize() will need to
>>> be aware of expressions/sizeless-types. If however these passes are
>>> exclusively using allocas to put registers into memory, or using
>>> structs with extractvalue etc, then they shouldn't need to care and
>>> codegen deals with the low level details.
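One way to picture the symbolic sizes discussed here is as a pair: a fixed byte count plus a byte count scaled by vscale, resolved to a number only once vscale is known. The sketch below uses hypothetical names and ignores alignment and padding. It is not how getTypeStoreSize() is implemented.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical symbolic size: fixed_bytes + scaled_bytes * vscale. */
typedef struct {
    size_t fixed_bytes;   /* part known at compile time */
    size_t scaled_bytes;  /* part multiplied by the runtime vscale */
} sym_size;

/* Adding two sizes, e.g. to find the offset of the slot after both objects. */
sym_size sym_add(sym_size a, sym_size b)
{
    sym_size sum = { a.fixed_bytes + b.fixed_bytes,
                     a.scaled_bytes + b.scaled_bytes };
    return sum;
}

/* Turn the expression into a concrete byte count for a given vscale. */
size_t sym_resolve(sym_size s, size_t vscale)
{
    assert(vscale >= 1);
    return s.fixed_bytes + s.scaled_bytes * vscale;
}
```

A pass that only creates allocas never needs to call sym_resolve. Only code that wants a concrete number, such as stack frame layout, has to handle the scaled part.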
>>> LLVM Developers mailing list
>>> llvm-dev at lists.llvm.org