[llvm-dev] Load combine pass
Sanjoy Das via llvm-dev
llvm-dev at lists.llvm.org
Thu Sep 29 11:16:59 PDT 2016
Hi Artur,
Artur Pilipenko wrote:
>> On 29 Sep 2016, at 21:01, Sanjoy Das <sanjoy at playingwithpointers.com> wrote:
>>
>> Hi Artur,
>>
>> Artur Pilipenko wrote:
>>
>>> BTW, do we really need to emit an atomic load if all the individual
>>> components are bytes?
>> Depends -- do you mean at the hardware level or at the IR
>> level?
>>
>> If you mean at the IR level, then I think yes; since otherwise it is
>> legal to do transforms that break byte-wise atomicity in the IR, e.g.:
>>
>> i32* ptr = ...
>> i32 val = *ptr
>>
>> => // Since no threads can be legally racing on *ptr
>>
>> i32* ptr = ...
>> i32 val0 = *ptr
>> i32 val1 = *ptr
>> i32 val = (val0 & 1) | (val1 & ~1);
>>
>>
>> If you're talking about the hardware level, then I'm not sure; and my
>> guess is that the answer is almost certainly arch-dependent.
> I meant the case when we have a load by bytes pattern like this:
> i8* p = ...
> i8 b0 = *p++;
> i8 b1 = *p++;
> i8 b2 = *p++;
> i8 b3 = *p++;
> i32 result = b0 << 24 | b1 << 16 | b2 << 8 | b3 << 0;
>
> When we fold it to a i32 load, should this load be atomic?
If we do fold it to a non-atomic i32 load, then it would be legal for
LLVM to do the IR transform I mentioned above. That breaks the
byte-wise atomicity you had in the original program.
That is, in:
i8* p = ...
i8 b0 = *p++;
i8 b1 = *p++;
i8 b2 = *p++;
i8 b3 = *p++;
// Note: I changed this to be little endian, and I've assumed
// that we're compiling for a little endian system
i32 result = b3 << 24 | b2 << 16 | b1 << 8 | b0 << 0;
say all of p[0..3] are 0, and you have a thread racing to set p[0] to
-1 (i.e. 0xFF). Then result can be either 0 or 255.
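The two outcomes can be checked with a small C sketch. It is not part of the original message: the function names combine_le and check_bytewise_outcomes are hypothetical, and the race is modelled simply by passing in the value the byte load of p[0] happened to observe (0 before the store, 0xFF after it).

```c
#include <assert.h>
#include <stdint.h>

/* Little-endian assembly of four byte loads, mirroring
   result = b3 << 24 | b2 << 16 | b1 << 8 | b0 << 0. */
uint32_t combine_le(uint8_t b0, uint8_t b1, uint8_t b2, uint8_t b3) {
    return ((uint32_t)b3 << 24) | ((uint32_t)b2 << 16) |
           ((uint32_t)b1 << 8)  |  (uint32_t)b0;
}

/* p[1..3] stay 0; the single load of p[0] sees either the old or
   the new byte, so only two results are possible. */
void check_bytewise_outcomes(void) {
    assert(combine_le(0x00, 0, 0, 0) == 0u);   /* load ran before the store */
    assert(combine_le(0xFF, 0, 0, 0) == 255u); /* load ran after the store  */
}
```

Because each byte is loaded exactly once, no mixture of old and new bits within p[0] can show up in result.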
However, say you first transform this to a non-atomic i32 load:
i8* p = ...
i32* p.i32 = (i32*)p
i32 result = *p.i32
and we do the transform above
i8* p = ...
i32* p.i32 = (i32*)p
i32 result0 = *p.i32
i32 result1 = *p.i32
i32 result = (result0 & 1) | (result1 & ~1);
then it is possible for result to be 254 (by result0 observing 0 and
result1 observing 255).
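The arithmetic behind the 254 can be sketched in C. This is an illustration under stated assumptions, not code from the original message: merge_split_loads and check_torn_value are hypothetical names, and each of the two i32 loads is modelled as returning either the old word (0) or the new word (255).

```c
#include <assert.h>
#include <stdint.h>

/* The split-load combination from the transform above:
   result = (result0 & 1) | (result1 & ~1). */
uint32_t merge_split_loads(uint32_t result0, uint32_t result1) {
    return (result0 & 1u) | (result1 & ~1u);
}

/* before/after are the 32-bit words seen by a load that runs
   before or after the racing store of 0xFF to p[0]. */
void check_torn_value(void) {
    const uint32_t before = 0u, after = 255u;
    assert(merge_split_loads(before, before) == 0u);   /* both loads see the old word */
    assert(merge_split_loads(after, after) == 255u);   /* both loads see the new word */
    assert(merge_split_loads(before, after) == 254u);  /* first old, second new       */
}
```

The third case is the one the byte-wise original could never produce: bit 0 comes from the old value of p[0] and bits 1..7 from the new one.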
-- Sanjoy