[LLVMdev] n-bit bytes for clang/llvm

Wed Mar 18 03:59:06 PDT 2015

Hi Patrik

Indeed, I am hoping to avoid the n-bitian-fork approach (laziness more than
anything: the pain of keeping patches moving forwards with the clang/llvm
mainstream). Luckily, for a toy architecture legacy code compatibility is
less of a concern, at least until I sleepwalk into the "port Linux" state...

Tyro

On Wed, Mar 18, 2015 at 11:25 AM, Tyro Software <softwaretyro at gmail.com>
wrote:

> Thanks - that's a really helpful steer.
>
> So if I'm understanding correctly, the CHERI address spaces are equivalent
> as regards actual memory addresses, with the "fatness" being the type,
> access, etc. metadata? (Somehow I'd formed the impression that LLVM address
> spaces needed to be disjoint.)
>
> Tyro
>
> On Wed, Mar 18, 2015 at 8:31 AM, David Chisnall <
> David.Chisnall at cl.cam.ac.uk> wrote:
>
>> On 17 Mar 2015, at 13:11, Tyro Software <softwaretyro at gmail.com> wrote:
>> >
>> > As an alternative to fixing the "char == 8 bits" presumption, would
>> > using non-uniform pointer types have been another possible approach?
>> > E.g. keep char as 8 bits but have char* encode both the word address
>> > and the byte location within it (i.e. one extra bit in this 16-bit
>> > case). Of course this is only a less intrusive (to LLVM) approach if
>> > LLVM readily supports such pointers, which may be close to asking
>> > "could 8086 small/large/huge pointers be implemented?"
>> >
>> > One obvious drawback to such an approach is that dereferencing char*
>> > becomes relatively expensive, though for the sort of code being
>> > predominantly run on a DSP that might be acceptable.
>>
>> We're using multiple address spaces to describe two pointer
>> representations for CHERI: AS0 is a 64-bit pointer that's represented as an
>> integer, AS200 is a capability (256-bit fat pointer with base, length,
>> permissions, enforced in hardware).  We had to fix a few things where LLVM
>> assumes that pointers are integers, but having different-size pointers in
>> different address spaces works very well.  The biggest weakness is in
>> TableGen / SelectionDAG, where you can't write patterns on iPTR that depend
>> on a specific AS (actually, you can't really write patterns on iPTR at all,
>> as LLVM tries to lower iPTR to some integer type first, even when this
>> doesn't make any sense [e.g. on an architecture with separate address and
>> integer registers]).
>>
>> Having AS0 be a byte pointer, which the back end would lower to two
>> words, and some target-specific AS be a word pointer would likely work
>> quite well.
>>
>> David
>>
>>
>
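For anyone following the thread, the "char* carries a word address plus a
byte selector" idea discussed above can be sketched in plain C. This is only
an illustration of the encoding, not of how LLVM or any back end would lower
it. The names (byte_ptr_t, load_byte, store_byte), the tiny MEM_WORDS memory,
and the choice that selector 0 means the low byte of a word are all
assumptions made for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical word-addressed machine: memory is an array of 16-bit words. */
#define MEM_WORDS 16u
static uint16_t mem[MEM_WORDS];

typedef uint16_t word_ptr_t; /* plain word address */
typedef uint32_t byte_ptr_t; /* 17 significant bits: (word address << 1) | selector */

/* Build a byte pointer from a word address and a byte selector (0 or 1). */
static byte_ptr_t byte_ptr_make(word_ptr_t w, unsigned sel) {
    assert(sel < 2u);
    return ((byte_ptr_t)w << 1) | sel;
}

static word_ptr_t byte_ptr_word(byte_ptr_t p) { return (word_ptr_t)(p >> 1); }
static unsigned byte_ptr_sel(byte_ptr_t p) { return (unsigned)(p & 1u); }

/* Load one 8-bit char: fetch the whole word, then shift out the wanted half. */
static uint8_t load_byte(byte_ptr_t p) {
    word_ptr_t w = byte_ptr_word(p);
    assert(w < MEM_WORDS);
    unsigned shift = 8u * byte_ptr_sel(p); /* assumption: selector 0 = low byte */
    return (uint8_t)(mem[w] >> shift);
}

/* Store one 8-bit char: read-modify-write of the containing word. */
static void store_byte(byte_ptr_t p, uint8_t v) {
    word_ptr_t w = byte_ptr_word(p);
    assert(w < MEM_WORDS);
    unsigned shift = 8u * byte_ptr_sel(p);
    uint16_t mask = (uint16_t)(0xFFu << shift);
    mem[w] = (uint16_t)((mem[w] & (uint16_t)~mask) | ((uint16_t)v << shift));
}
```

With this layout, adding 1 to a byte_ptr_t steps to the next byte and carries
into the word address on its own, so char* arithmetic stays ordinary integer
arithmetic. The extra shift and mask on every access (and the
read-modify-write on stores) is the dereferencing cost mentioned in the
quoted text.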