[cfe-dev] Code generated for pointer pair -> pointer + size -> pointer pair conversion

Stephan Tolksdorf st at quanttec.com
Wed Jun 19 12:15:38 PDT 2013


On 16.06.13 18:13, Benjamin Kramer wrote:
>
> On 16.06.2013, at 13:59, Stephan Tolksdorf <st at quanttec.com> wrote:
>
>> Hi,
>>
>> clang emits the following x64 code for `vector.data() + vector.size()`
>> (where vector is a std::vector<int32> instance that contains two internal pointers that point to the beginning and the end of an array):
>>
>>   movq	(%rdi), %rcx  // rdi is a pointer to the vector
>>   movq	8(%rdi), %rax
>>   subq	%rcx, %rax
>>   andq	$-4, %rax
>>   addq	%rcx, %rax
>>
>> Is there any way to tell clang in the vector implementation that the array is aligned, so that it could reduce this code to a simple load
>> `movq 8(%rdi), %rax`?
>>
>> This kind of optimization would be helpful for inlined code that converts back and forth between a pointer pair representation and pointer + size representation of an array reference.
>
> The optimizer already knows this. Here's the IR for a pattern like yours:
>
>    %sub.ptr.lhs.cast = ptrtoint i32* %1 to i64
>    %sub.ptr.rhs.cast = ptrtoint i32* %0 to i64
>    %sub.ptr.sub = sub i64 %sub.ptr.lhs.cast, %sub.ptr.rhs.cast
>    %sub.ptr.div = ashr exact i64 %sub.ptr.sub, 2
>    %add.ptr = getelementptr inbounds i32* %0, i64 %sub.ptr.div
>
> The "exact" bit on the ashr is a hint that the shift only shifts out zeros (because the pointers are aligned). The getelementptr gets lowered into adds later in the SelectionDAG, but the "exact" bit is lost by then. The backend therefore has to assume that the value may be unaligned and inserts the andq $-4.
>
> I see two ways to fix this:
>
> 1. Teach SelectionDAG about "exact" bits. I don't think this is possible with our current infrastructure.
> 2. Teach InstCombine how to fuse ashrs and getelementptr instructions. Not sure how tricky this is.

Thanks for the explanation!

The case where the size of the pointee type equals the type's alignment
is the simplest and most common one. More generally, an optimizer could
exploit the fact that the difference of two pointers must be a multiple
of the pointee type's size (even if the array isn't aligned), since in
C/C++ only pointers to elements of the same array (or one past its end)
may be subtracted.
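
As a rough sketch of the round trip under discussion: the names PtrPair,
PtrSize, to_ptr_size, to_ptr_pair, masked_end and round_trip_is_identity
below are hypothetical and only serve as illustration. masked_end models
the sub/and/add sequence from the assembly above on plain integer
addresses, assuming a 4-byte element type; it is not meant to reproduce
what the compiler does internally.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical array-reference types, for illustration only.
struct PtrPair { const std::int32_t* begin; const std::int32_t* end; };
struct PtrSize { const std::int32_t* data;  std::size_t size; };

// Pointer pair -> pointer + size. The subtraction is only defined when both
// pointers refer to the same array (or one past its end).
inline PtrSize to_ptr_size(PtrPair p) {
    return {p.begin, static_cast<std::size_t>(p.end - p.begin)};
}

// Pointer + size -> pointer pair.
inline PtrPair to_ptr_pair(PtrSize s) {
    return {s.data, s.data + s.size};
}

// Models the emitted subq / andq $-4 / addq sequence on integer addresses.
inline std::uintptr_t masked_end(std::uintptr_t begin, std::uintptr_t end) {
    std::uintptr_t diff = end - begin;                               // subq
    diff &= ~static_cast<std::uintptr_t>(sizeof(std::int32_t) - 1);  // andq $-4
    return begin + diff;                                             // addq
}

// Checks that the round trip returns the original end pointer and that the
// masking step does not change the result for a valid pointer pair.
inline bool round_trip_is_identity(const std::vector<std::int32_t>& v) {
    PtrPair original{v.data(), v.data() + v.size()};
    PtrPair back = to_ptr_pair(to_ptr_size(original));
    std::uintptr_t b = reinterpret_cast<std::uintptr_t>(original.begin);
    std::uintptr_t e = reinterpret_cast<std::uintptr_t>(original.end);
    return back.begin == original.begin && back.end == original.end &&
           masked_end(b, e) == e;
}
```

For a valid pointer pair the byte difference is always a multiple of
sizeof(std::int32_t), so the mask never clears a set bit and the whole
sequence is equivalent to loading the end pointer.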

Interestingly, GCC doesn't seem to be able to completely optimize away 
round-trip conversions between the pointer pair and pointer + size 
representations of an array either.

- Stephan




