[LLVMdev] RFC: GEP as canonical form for pointer addressing

David Chisnall David.Chisnall at cl.cam.ac.uk
Tue Feb 25 00:16:23 PST 2014


On 24 Feb 2014, at 23:19, Philip Reames <listmail at philipreames.com> wrote:

> On 02/20/2014 01:02 AM, David Chisnall wrote:
>> We have managed to get LLVM working (and building nontrivial amounts of code) on a MIPS-derived architecture that has non-integer pointers, and the representation in the IR itself is fine.  We have a few hacks in optimisations that are far too coarse grained (i.e. don't do this optimisation if you're dealing with this kind of pointer, even though many of them [SCEV in particular] should work but the code makes invalid assumptions).  We do end up having to add more after every merge.
> Any chance you'd be willing to share patches?  Or even just a list of optimizations affected?  This is work I'm likely to be duplicating in the very near future.
>> We start to hit problems when we get to SelectionDAG, which makes a lot of assumptions about the underlying architecture and has an annoying habit of thinking it knows better than the back end and undoing transformations that the back end has done.
> We looked at trying to preserve pointer vs integer information post SelectionDAG and quickly gave up.  I believe this to be the right long term direction - i.e. years from now - but we didn't believe it would be viable in the near term.  Instead, we've chosen to encode the information we actually need - which values to rewrite - at an earlier phase and construct the IR such that - we hope - nothing can insert uses after our insert safepoints.
> 
> The fact you've gotten this working all the way through is an impressive accomplishment and gives me hope for the long term direction.

Our LLVM and Clang repositories are here:

https://github.com/CTSRD-CHERI/llvm
https://github.com/CTSRD-CHERI/clang

(The cheri branch in both is currently the active one, but it will be renamed to head soon - we've just made some changes to our ISA and are waiting for everyone to have the updated version before we sync everything.)  I'm in the process of cleaning up the MIPS IV support for upstreaming, and I'm happy to upstream anything else that is more generally useful.  I haven't done so yet for two reasons:

- Without an in-tree architecture that has a pointer-integer distinction (and a lot of tests!), the support will likely bit rot.  Our architecture does not yet have a stable ISA (and, since it's a research platform, probably never will), so it would not be a good choice.

- Lots of things have '//FIXME: This is a really ugly hack!' above them and, while they do make things work for us, they are not always the right approach for long-term support.  

I would be interested in working with anyone who wants to get better support for this kind of architecture into the architecture-neutral parts of LLVM.  Having the address space cast instruction has simplified things for us quite a bit.  We still lower it to an inttoptr or ptrtoint in the back end, but at least the optimisers no longer randomly elide the casts or break things.  Previously they assumed that a ptrtoint -> inttoptr pair was a bitcast, even when the two pointers ended up in different address spaces (which had different sizes).
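
To illustrate why treating a ptrtoint -> inttoptr pair as a bitcast is unsafe when the address spaces have different pointer sizes, here is a minimal C++ sketch.  It is a toy model only: PlainPtr, FatPtr, ptr_to_int, int_to_plain and round_trip are hypothetical names invented for this example, and the layout is an assumption, not our actual capability format or anything in LLVM.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model (an assumption for illustration only):
// address space 0 holds plain 64-bit integer addresses,
// address space 1 holds "fat" pointers that also carry bounds.
struct PlainPtr {
    std::uint64_t addr;
};

struct FatPtr {
    std::uint64_t base;
    std::uint64_t length;
    std::uint64_t offset;
};

// The two representations differ in size, so no bit-for-bit
// reinterpretation (a "bitcast") between them can exist.
static_assert(sizeof(FatPtr) != sizeof(PlainPtr),
              "pointer sizes differ between the two address spaces");

// Models ptrtoint on a fat pointer: only the address survives.
std::uint64_t ptr_to_int(const FatPtr &p) {
    return p.base + p.offset;
}

// Models inttoptr into the plain address space.
PlainPtr int_to_plain(std::uint64_t value) {
    return PlainPtr{value};
}

// Models the ptrtoint -> inttoptr pair.  The base and length are
// dropped, so this is a real conversion, not a no-op cast.
PlainPtr round_trip(const FatPtr &p) {
    return int_to_plain(ptr_to_int(p));
}
```

In this model, round_trip(FatPtr{0x1000, 0x100, 0x10}) yields a PlainPtr whose addr is 0x1010; the bounds are gone, which is why an optimiser that replaces the pair with a bitcast (or elides it) changes the meaning of the program.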

David




