[cfe-dev] Code generated for pointer pair -> pointer + size -> pointer pair conversion

Sun Jun 16 09:13:41 PDT 2013

On 16.06.2013, at 13:59, Stephan Tolksdorf <st at quanttec.com> wrote:

> Hi,
> 
> clang emits the following x64 code for `vector.data() + vector.size()`
> (where vector is a std::vector<int32> instance that contains two internal pointers that point to the beginning and the end of an array):
> 
>  movq	(%rdi), %rcx  // rdi is a pointer to the vector
>  movq	8(%rdi), %rax
>  subq	%rcx, %rax
>  andq	$-4, %rax
>  addq	%rcx, %rax
> 
> Is there any way to tell clang in the vector implementation that the array is aligned, so that it could reduce this code to a simple load
> `movq 8(%rdi), %rax`?
> 
> This kind of optimization would be helpful for inlined code that converts back and forth between a pointer pair representation and pointer + size representation of an array reference.

The optimizer already knows this. Here's the IR for a pattern like yours:

  %sub.ptr.lhs.cast = ptrtoint i32* %1 to i64
  %sub.ptr.rhs.cast = ptrtoint i32* %0 to i64
  %sub.ptr.sub = sub i64 %sub.ptr.lhs.cast, %sub.ptr.rhs.cast
  %sub.ptr.div = ashr exact i64 %sub.ptr.sub, 2
  %add.ptr = getelementptr inbounds i32* %0, i64 %sub.ptr.div

The "exact" bit on the ashr is a hint that the shift only shifts out zeros (because the pointers are aligned). The getelementptr gets lowered into adds later in the SelectionDAG, but the "exact" bit is lost by then. Without it, the backend has to assume that the pointer difference may not be a multiple of 4, so the shift right by 2 followed by the scaling by 4 survives as the andq $-4.
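
As a rough illustration of why the mask shows up (a sketch in plain C++, not what the backend literally does; the names `round_trip` and `masked` are made up for this example): shifting a byte difference right by 2 and scaling it back up by 4 is the same as clearing the low two bits, and it is a no-op only when the difference is already a multiple of 4.

```cpp
#include <cstdint>

// Byte difference -> element count -> byte offset for 4-byte elements:
// the ashr by 2 followed by the getelementptr scaling by sizeof(i32).
constexpr std::int64_t round_trip(std::int64_t byte_diff) {
    return (byte_diff >> 2) * 4;
}

// The same computation written as a mask; this corresponds to andq $-4.
constexpr std::int64_t masked(std::int64_t byte_diff) {
    return byte_diff & -4;
}

// Aligned difference: both forms return the input unchanged.
static_assert(round_trip(8) == 8, "aligned difference survives the round trip");
static_assert(masked(8) == 8, "mask is a no-op on an aligned difference");

// Unaligned difference: both forms round down to the same multiple of 4.
static_assert(round_trip(10) == masked(10), "shift pair equals mask");
```

With the "exact" guarantee only the first case can occur, which is why the mask could be dropped if that information were still available.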

I see two ways to fix this:

1. Teach SelectionDAG about "exact" bits. I don't think this is possible with our current infrastructure.
2. Teach InstCombine how to fuse an exact ashr with the getelementptr that consumes it. Not sure how tricky this is.
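
For reference, a minimal sketch of the source-level pattern that either fix would have to reduce to a single load. The `PtrPair` and `PtrSize` types and the helper names are hypothetical stand-ins, not taken from any real vector implementation:

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical pointer-pair view, standing in for the vector's internals.
struct PtrPair {
    std::int32_t* begin;
    std::int32_t* end;
};

// Hypothetical pointer + size view of the same array.
struct PtrSize {
    std::int32_t* data;
    std::size_t size;
};

// Pointer pair -> pointer + size (this is where the exact ashr comes from).
constexpr PtrSize to_ptr_size(PtrPair p) {
    return {p.begin, static_cast<std::size_t>(p.end - p.begin)};
}

// Pointer + size -> pointer pair (this is where the getelementptr comes from).
constexpr PtrPair to_ptr_pair(PtrSize s) {
    return {s.data, s.data + s.size};
}

// Round trip: equivalent to `vector.data() + vector.size()`.
constexpr std::int32_t* end_via_size(PtrPair p) {
    return to_ptr_pair(to_ptr_size(p)).end;
}
```

For any valid pair, `end_via_size(p)` equals `p.end`, so ideally the whole chain would compile down to just loading the stored end pointer.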

- Ben