[llvm-dev] RFC: Multiple program address spaces

Paulo Matos via llvm-dev llvm-dev at lists.llvm.org
Wed Nov 18 02:57:44 PST 2020


Apologies everyone for my initial email address mistake. Here's the
original RFC email. Thanks to Thomas for pointing out my error and
forwarding it to the correct address.

Hello all,

TL;DR: The current design for implementing reference types in the
WebAssembly backend requires multiple program address spaces. We
propose an implementation of multiple program address spaces in
D91428 [1]; this is a backwards-compatible change.

# Problem

Currently the default program address space and the default data
address space are the same, namely AS0. At the moment, we can change the
program address space with P<n> in the data layout string. This allows
Harvard architectures to separate code and data into different address
spaces. However, only one program address space is allowed.
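
To make the current one-space limit concrete, here is a minimal Python
sketch of how a single P<n> spec could be read out of a data layout
string. It is a toy model, not LLVM's parser: the function name and the
sample layout strings are made up for illustration, and raising on a
second P spec is only a way of making the limit visible, not a claim
about how LLVM itself reacts.

```python
def program_address_space(layout: str) -> int:
    """Return the address space named by the P<n> spec, or 0 if absent.

    Toy model (hypothetical helper, not an LLVM API): layout specs are
    separated by '-', and an uppercase 'P' introduces the program
    address space. Lowercase 'p' specs describe pointers and are skipped.
    """
    found = None
    for spec in layout.split("-"):
        if spec.startswith("P"):
            if found is not None:
                # Assumption of this sketch: model "only one program
                # address space is allowed" as a hard error.
                raise ValueError("only one program address space is modelled")
            found = int(spec[1:])
    # With no P spec, code lives in AS0, the same space as data.
    return 0 if found is None else found
```

With this model, "e-P1-p:16:8" yields 1 (code in AS1, data in AS0),
while a string with no P spec yields 0.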

At Igalia [2], we are interested in implementing support for reference
types [3] in the WebAssembly backend. After discussions with Thomas
Lively and Andy Wingo, the design we have for reference types involves
having funcrefs and externrefs live in a different (non-integral)
address space from normal code/data. Since funcrefs are callable, we
also need to be able to call through pointers in that address space.
However, as things stand, if we use `P1` in the data layout, normal
function calls will cease to work.

# Design Summary

The reference types implementation introduces two (reference) types:
funcrefs and externrefs. Funcrefs are references to functions that can
be called, while externrefs are opaque. Because of the way they interact
with memory, they need to live in a separate address space. However, to
call funcrefs that live in a separate address space, that address space
needs to be marked as a program address space. Since we want normal
function calls to keep working as well, AS0 needs to be a program
address space too. This is the problem this RFC addresses. The data
layout string for WebAssembly would therefore contain `ni:1-P0-P1`.
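
As a rough illustration of why both P0 and P1 are needed, the following
Python sketch models which address spaces a call may go through under a
given layout string. The helper names and the "e-" prefix on the sample
strings are assumptions made for the example; only the `ni:` and `P`
spec letters come from the data layout syntax, and the notion of
"callable" here is simply "listed as a program address space".

```python
def parse_layout(layout: str):
    """Return (non_integral, program) address spaces from a layout string.

    Hypothetical helper for illustration; this is not LLVM's parser.
    """
    non_integral = set()
    program = []
    for spec in layout.split("-"):
        if spec.startswith("ni:"):
            # 'ni:1:2' marks address spaces 1 and 2 as non-integral.
            non_integral.update(int(n) for n in spec[3:].split(":"))
        elif spec.startswith("P"):
            program.append(int(spec[1:]))
    # With no P spec, AS0 is the program address space.
    return non_integral, program or [0]


def can_call(layout: str, addr_space: int) -> bool:
    """Model: a call is allowed only through a program address space."""
    _, program = parse_layout(layout)
    return addr_space in program
```

Under this model, `can_call("e-ni:1-P1", 0)` is False (normal calls
stop working once only AS1 is a program address space), whereas with
"e-ni:1-P0-P1" both `can_call(..., 0)` and `can_call(..., 1)` are True.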

# Solution

As mentioned in the TL;DR, the proposed implementation is available in
D91428 [1]. The patch is small and backwards compatible, and there
should be no visible changes unless you use multiple program address
spaces.

With the patch, you can use multiple P specs (Px-Py-...) to mark several
address spaces as program address spaces. The first one is considered
the default program address space, so the order in which the Ps show up
in the data layout string _matters_.

The program address spaces are kept in a small vector without
duplicates. Therefore P0-P1-P0 is the same as P0-P1, and in both cases
P0 is the default program address space.
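
The ordering and de-duplication rules above can be sketched in a few
lines of Python. This is a model of the described behaviour, not the
code in D91428: the function names are invented for the example, and
falling back to AS0 when no P spec is present is an assumption based on
AS0 being today's default.

```python
def program_address_spaces(layout: str) -> list:
    """Return the program address spaces in first-seen order, no duplicates.

    Hypothetical helper modelling the "small vector without duplicates".
    """
    spaces = []
    for spec in layout.split("-"):
        if spec.startswith("P"):
            n = int(spec[1:])
            if n not in spaces:  # a repeated P spec changes nothing
                spaces.append(n)
    # Assumption: with no P spec, AS0 is the only program address space.
    return spaces or [0]


def default_program_address_space(layout: str) -> int:
    """The first P spec in the string names the default."""
    return program_address_spaces(layout)[0]
```

In this model P0-P1-P0 and P0-P1 both give the list [0, 1] with default
0, while P1-P0 gives [1, 0] with default 1, which is why the order of
the specs matters.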


# Refs

[1] https://reviews.llvm.org/D91428
[2] https://www.igalia.com
[3] https://webassembly.github.io/reference-types/core/

Regards,
-- 
Paulo Matos

Thomas Lively writes:

> Fixing llvm-dev at llvm.org to llvm-dev at lists.llvm.org
>
> On Tue, Nov 17, 2020 at 11:27 AM Thomas Lively <tlively at google.com> wrote:
>
>> Thanks for your work on this, Paulo!
>>
>> Here's some more detail about how function pointers work today in the
>> WebAssembly backend and how they differ from the `funcref` Paulo is
>> working on.
>>
>> Today in the WebAssembly backend, both function and data pointers are
>> emitted as WebAssembly `i32` values. Data pointers are indices into the
>> WebAssembly linear memory and function pointers are indices into a
>> separate function table. The linker is responsible for laying out
>> function references in the table and resolving function pointer
>> relocations to the correct table indices.
>>
>> WebAssembly `funcref` values are an alternative mechanism for referencing
>> functions that our LLVM backend does not yet support. Unlike the `i32`
>> table indices used to implement normal function pointers, `funcref`
>> values are opaque and non-integral. `funcref`s cannot be stored into
>> memory and cannot be inspected, but they can be called.
>>
>> We would like to model both table indices and `funcref`s as function
>> pointers in LLVM IR since they are both callable in WebAssembly. Since
>> table indices are numbers that can be inspected, manipulated, and stored
>> into memory, it makes sense to model them as LLVM IR function pointers
>> into a "normal" address space like address space 0. `funcref` values
>> should instead be modeled as LLVM IR function pointers into a
>> non-integral address space. Since we want to be able to use both in a
>> single program, we need the ability to call function pointers of
>> different address spaces in the same LLVM IR module. That means having
>> multiple program address spaces, as described in this RFC and Paulo's
>> corresponding patch.
>>
>> On Tue, Nov 17, 2020 at 9:11 AM Paulo Matos <pmatos at linki.tools> wrote:



