[llvm-dev] GEP transformation by InstCombiner

David Chisnall via llvm-dev llvm-dev at lists.llvm.org
Tue Jan 16 02:46:41 PST 2018

> On 15 Jan 2018, at 18:21, Demikhovsky, Elena via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> Hi all,
> I’m working on an out-of-tree target and encountered the following problem:
> InstCombiner “normalizes” GEPs and extends the index operand to the pointer width.
> This works fine if you can convert a pointer to an integer for address calculation, and I assume that all registered targets do this.
> The target I’m working on has a very restricted ISA for pointer calculation:
> ptr + int, ptr - int, ptr - ptr and ptr-compare

This sounds very familiar - that’s exactly the set of operations that CHERI supports.
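To make that operation set concrete, here is a minimal C++ sketch of a pointer type that permits only those four operations. Everything in it is an assumption for illustration: the name FatPtr, the 64-bit address field, the opaque metadata field and the 32-bit offset type are hypothetical and are not taken from CHERI or from any LLVM API.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model: a pointer that is wider than its address.
// The metadata is opaque and must survive every arithmetic operation.
struct FatPtr {
    std::uint64_t address;   // the addressable part of the pointer
    std::uint64_t metadata;  // e.g. bounds or permissions (assumed, never modified here)
};

// Sign-extend a 32-bit offset to the address width with wrap-around semantics.
constexpr std::uint64_t widen(std::int32_t offset) {
    return static_cast<std::uint64_t>(static_cast<std::int64_t>(offset));
}

// ptr + int: only the address changes, the metadata is carried along.
constexpr FatPtr operator+(FatPtr p, std::int32_t offset) {
    return FatPtr{p.address + widen(offset), p.metadata};
}

// ptr - int
constexpr FatPtr operator-(FatPtr p, std::int32_t offset) {
    return FatPtr{p.address - widen(offset), p.metadata};
}

// ptr - ptr: the result is a plain integer, not a pointer.
constexpr std::int64_t operator-(FatPtr a, FatPtr b) {
    return static_cast<std::int64_t>(a.address - b.address);
}

// ptr-compare: in this sketch, comparison looks at the address only.
constexpr bool operator==(FatPtr a, FatPtr b) { return a.address == b.address; }
constexpr bool operator<(FatPtr a, FatPtr b) { return a.address < b.address; }
```

Note what is deliberately missing: there is no multiplication, shift or bitwise operation on FatPtr, and no implicit conversion to an integer, so any transformation that rewrites address computation as ordinary integer arithmetic has nothing to lower to.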

We have a set of out-of-tree patches that extend the data layout to understand that there’s a difference between the size and the range of a pointer (e.g. a 128-bit pointer can store a 64-bit address plus metadata), along with fixes for SelectionDAG and for all of the affected optimisers that we’ve found.  We add explicit PTRADD, INTTOPTR and PTRTOINT nodes in SelectionDAG for architectures where pointer+integer is distinct from integer arithmetic and where inttoptr / ptrtoint are not bitcasts.

> I have a full arithmetic set for 32-bit integers, but the pointer is wider. Extending the index to the pointer width requires full arithmetic support for pointers.
> But, actually, it does not come from C-sources (casting Ptr to int means truncation).

We also have clang patches that ensure that these operations do something meaningful.

> I’d like to add TTI (TargetTransformInfo) to InstCombiner in order to configure the width of GEP indices.
> The current default behavior will be preserved.
> What do you think?

We solved this by adding an f qualifier to the p attribute in our DataLayout to describe pointers that are not integers, and by using address space 0 to define the range.  This isn’t quite ideal, and we should probably add the range explicitly to pointers that are wider than integers.
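The distinction between a pointer's size and its range can be sketched in a few lines of C++. This is only an illustration under stated assumptions: PointerLayout, indexWidthFor and indexNeedsExtension are hypothetical names, not part of LLVM's DataLayout API, and the sketch does not reproduce the actual syntax of our f qualifier.

```cpp
#include <cassert>

// Hypothetical per-address-space description of a pointer.
struct PointerLayout {
    unsigned sizeInBits;   // how many bits the pointer occupies in memory
    unsigned rangeInBits;  // how many of those bits form the address
};

// Width that a GEP index would be canonicalised to in this sketch:
// the address range, not the storage size.
constexpr unsigned indexWidthFor(PointerLayout layout) {
    return layout.rangeInBits;
}

// True if an index of the given width would have to be extended
// before it could be used in a GEP on this kind of pointer.
constexpr bool indexNeedsExtension(unsigned indexBits, PointerLayout layout) {
    return indexBits < indexWidthFor(layout);
}
```

With a layout of {128, 64}, a 64-bit index is used as-is and is never widened to 128 bits, which is the behaviour the size/range split is meant to give.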

Note that InstCombine is not the only place that tries to insert pointer-width GEPs in the optimisation pipeline.  I think that we’ve fixed all of them, but I can’t be entirely sure.

We haven’t upstreamed these patches, because no in-tree target needs them, but I’d be happy to try to extract the relevant parts and put them up for review if this is generally useful.

