[llvm-dev] New intrinsic property IntrOnlyWrite

Philip Reames via llvm-dev llvm-dev at lists.llvm.org
Mon Mar 21 19:29:46 PDT 2016



On 03/21/2016 08:54 AM, Nicolai Hähnle wrote:
> On 19.03.2016 14:47, Philip Reames wrote:
>> I'm generally in support of this change. I haven't looked at the patch
>> yet, but the direction seems worthwhile.
>>
>> Note that we already have a writeonly predicate in a few places in the
>> code (BasicAA being one).  If we do introduce the new intrinsic
>> property, we should refactor all of these places to use the new
>> mechanism.  This could be separate changes, but should be considered
>> required as part of adding the new property.
>
> That makes a lot of sense.
>
>> Another way you might consider slicing the problem is to introduce a
>> WriteOnlyArg property.  When combined with ArgMemOnly, this would more
>> precisely model the pseudo store (right?) and is a better fit for the
>> memset/memcpy use case already handled in BasicAA.
>
> This is unsufficient for our use case. Let me try to explain with an 
> example (also for your later comments). We have an intrinsic
>
> declare void @llvm.amdgcn.buffer.store.format.f32(float, <4 x i32>, 
> i32, i32, i1, i1) #1
>
> that stores a 32-bit value in a memory location that is described by a 
> "buffer descriptor" (the <4 x i32>), an i32 index and an i32 offset. 
> You can think of the buffer descriptor as a "heavy pointer" that 
> contains a base address and can also contain additional information 
> like a buffer size, a stride, and a data format. The given index is 
> multiplied by the stride before being added to the base address; the 
> offset is an additional byte offset. The resulting address is checked 
> against the buffer size (and an out of bounds store is silently 
> discarded, which fits the requirements of APIs like OpenGL). 
> Furthermore, the data being stored may be subject to a format 
> conversion (float to int, etc.). The additional i1 arguments enable 
> provide some cache control.
So, I think you have a fundamental problem here.  By not using pointers, 
you are breaking one of the deepest assumptions in LLVM: that we can 
reason about memory aliasing without having to consider the numeric 
values of non-pointer values.  If a buffer can never *ever* be addressed 
via a normal pointer, then *maybe* we can paper over this, but if you're 
mixing pointer and non-pointer accesses to the same address, you have a 
potentially serious problem.

As a concrete example, let's take the following:
%mem = malloc() <-- returns a noalias pointer
%buf = construct_buf_ref(%mem) <-- assume this is transparent and 
doesn't capture
load %mem
buffer_store(%buf)
load %mem

In this (psuedo code) example, basicaa will use capture tracking to 
*prove* that buffer_store can't have written to %mem.  buffer_store is 
allowed to write to any memory it can address, but because %mem has not 
been captured, buffer_store can not obtain a pointer to it. This could 
lead to use value forwarding the load and producing an incorrect result.

Note: If you treat "construct_buf_ref" as a capturing operation, you can 
paper over this example (and many others).  I'm using it mostly as an 
example of how deep the assumptions about memory, pointers, and 
capturing go.

>
> If you have some advice on how to fit such a memory addressing model 
> into the ArgMemOnly logic that would be appreciated, but from my 
> reading of the code it appears that trying to make ArgMemOnly work 
> with this would be a rather large project which we don't really have 
> the resources to tackle right now.
>
> Why am I pushing for IntrWriteOnly?
>
> I suspect the merely IntrWriteOnly without a functional ArgMemOnly 
> does not provide a lot of opportunity for optimization at the IR level.
I think this is not entirely true.  In particular, it gives us a way to 
model things like memset and memset_16 by combining InstrWriteOnly and 
ArgMemOnly.  It's not quite as powerful as having a writeonly argument 
attribute, but it's better than where we are today.
>
> However, the codegen backend does understand the resulting hardware 
> instructions, which are marked mayLoad = 0, mayStore = 1, 
> hasSideEffects = 0. Having mayLoad = 0 and hasSideEffects = 0 makes a 
> difference for instruction scheduling.
>
> If you try to map the intrinsic as-is, without IntrWriteOnly, onto 
> such hardware instructions, TableGen will (correctly) complain about a 
> mismatch of mayLoad and hasSideEffects.
>
> So I'd like to do a minimal change that will make TableGen happy, but 
> I do not want that change to be an ugly kludge.
I'm not opposed to introducing an InstrWriteOnly property for 
intrinsics.  My concern is that you're expecting it to be something it 
isn't (per capture discussion above.)
>
> I understand your concern about having this interact nicely with 
> BasicAA. Perhaps both an IntrWriteOnlyMem and an IntrWriteOnlyArgMem 
> should be added for orthogonality?
A InstrWriteOnly + ArgMemOnly should equal an InstWriteOnlyArgMem. Thus, 
we shouldn't need separate attributes.  My original comment was about 
introducing a writeonly attribute on the argument of a function.  (i.e. 
a *particular* attribute)  This would allow us to precisely describe 
memcpy for instance.  (src argument is readonly, dest argument is writeonly)
>
> Cheers,
> Nicolai
>
>>
>> Philip
>>
>> p.s. Inline comments below specific to your use case.
>>
>> On 03/18/2016 08:16 PM, Nicolai Hähnle via llvm-dev wrote:
>>> Hi,
>>>
>>> I'd like to draw your attention to http://reviews.llvm.org/D18291, in
>>> which I propose a new intrinsic property for intrinsics that are
>>> lowered to instructions that mayStore, but are neither mayLoad nor
>>> hasSideEffects.
>>>
>>> This is relevant for AMDGPU, where we have store instructions that
>>> don't operate on pointers. The codegen backend understands these
>>> perfectly well as stores, and so we can enable better scheduling
>>> decisions than if we forced these instruction to hasSideEffects.
>> Can you give a bit more detail here?  Example, etc..?
>>>
>>> In a perfect world, we'd be able to model the behavior of these load
>>> and store intrinsics via ReadWriteArgMem, but that would require
>>> massive changes in how LLVM thinks about memory locations and how to
>>> describe them.
>> This comments makes me think you might have a much deeper problem. Let's
>> see an actual example first though; I might just be misreading your 
>> intent.
>>>
>>> This comparatively minor addition allows us to move forward with
>>> decent scheduling in codegen for the time being.
>>>
>>> Cheers,
>>> Nicolai
>>> _______________________________________________
>>> LLVM Developers mailing list
>>> llvm-dev at lists.llvm.org
>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>



More information about the llvm-dev mailing list