[LLVMdev] Load serialisation during selection DAG building

Tue Aug 14 14:05:11 PDT 2012

Further to my earlier question, I'm perhaps a bit confused about memory serialisation. The following example, compiled using clang for the MSP430:

target datalayout = "e-p:16:16:16-i8:8:8-i16:16:16-i32:16:32-n8:16"
target triple = "msp430-??-??"

@y = common global i16 0, align 2
@x = common global i16 0, align 2

define void @f() nounwind {
entry:
  %0 = load i16* @y, align 2, !tbaa !0
  %1 = load volatile i16* @x, align 2, !tbaa !0
  %add = add i16 %1, %0
  store i16 %add, i16* @y, align 2, !tbaa !0
  ret void
}

has a chain store->load volatile->load. I thought this meant that the load volatile had to occur _after_ the load but the MSP430 backend selects the ADD16mm instruction for which I suspect the order of operand access isn't specified. So, does the chain mean "no earlier than" rather than "later than"?

On 14 Aug 2012, at 16:43, Steve Montgomery wrote:

> I looked into those patches but I don't think they will help in my situation because my problems occur during instruction selection rather than scheduling.
> 
> A simple and concrete example is a pattern like:
> 
>     [(set GR:$dst (add GR:$src (nvload addr:$mem)))]
> 
> where nvload matches a load provided that isVolatile() is false.
> 
> If the selection DAG looks like:
> 
>      |      |
>     LD1    LD2
>      ^      ^
>      |      |
>      \      /
>        add
>         ^
>         |
>         \  /
>          ST
>                  
> and the chain like:
> 
>     LD1    LD2
>      ^      ^
>      |      |
>      \      /
>     TokenFactor
>         ^
>         |
>         ST
>      
> then the add instruction is selected. However, if operand 1 of the add is a volatile load, then the chain looks like:
> 
>     LD1
>      ^ 
>      |
>     LD2Volatile
>      |
>      ST
> 
> In this case the add instruction cannot be selected. The operator is commutative, so instruction selection matches the non-volatile load against the operand 1 but then fails because to select this instruction would mean that the volatile load into a register (to match operand 0) would occur before the non-volatile load that's folded into the instruction.
> 
> I can get this instruction to be selected by changing the way loads are built into the selection DAG, i.e. making the chain:
> 
>     LD1    LD2Volatile
>      ^      ^
>      |      |
>      \      /
>     TokenFactor
>         ^
>         |
>         ST
> 
> Given what the LRM says about volatile accesses, this seems valid but I take Hal's point about there being code that might not be expecting such a chain.
> 
> It's a matter of changing a few lines in SelectionDAGBuilder.cpp to achieve what I want so I think I'll go ahead and see what happens.
> 
> If Hal or anyone else has any further thoughts on the potential problems with doing so then I'd be glad to know.
> 
> Thanks
> 
> Steve Montgomery
> 
> On 13 Aug 2012, at 20:09, Hal Finkel wrote:
> 
>> Steve,
>> 
>> I had created a patch last year that does something similar to what you
>> describe for regular loads and stores using aliasing information. I
>> think that the last message in the thread was:
>> http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20120402/140299.html
>> 
>> This approach has worked for me, but it is not the preferred solution
>> going forward. The preferred solution is to keep the "critical
>> chain" as is (partly because there are places in some of the backend
>> lowering code that assume it exists in its current form), and just
>> update the scheduler to be more intelligent about how it forms the
>> dependency graph. This has since been developed for the new MI
>> scheduling framework by Sergei Larin, and was committed in r156842
>> (last message in the discussion thread was
>> http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20120507/142659.html)
>> 
>> I would recommend trying something like Sergei's solution first, and
>> fall back to trying to play with the critical chain only if that can't
>> or won't work.
>> 
>> -Hal
>> 
>> On Mon, 13 Aug 2012 11:29:18 +0100
>> Steve Montgomery <stephen.montgomery3 at btinternet.com> wrote:
>> 
>>> I've got a question about how SelectionDAGBuilder treats loads.
>>> 
>>> The LLVM Language Reference Manual explicitly states that the order
>>> of volatile operations may be changed relative to non-volatile
>>> operations. However, when the SelectionDAGBuilder in LLVM 3.1
>>> encounters a volatile load, it flushes all pending loads and then
>>> chains the volatile load onto them meaning that the volatile load
>>> must be scheduled after those loads. While this behaviour isn't
>>> wrong, it seems to reduce the scope for efficient instruction
>>> selection.
>>> 
>>> Is there any reason not to modify the behaviour of
>>> SelectionDAGBuilder::visitLoad() so that volatile loads don't cause
>>> SelectionDAGBuilder::getRoot() to be called. Instead, they can be
>>> chained together with the head of the chain being stored in
>>> PendingLoads. Then when something else calls
>>> SelectionDAGBuilder::getRoot(), the chain of volatile loads is
>>> TokenFactored together with the non-volatile loads. I've tried this
>>> out and it seems to do what I want but as I'm fairly inexperienced
>>> with LLVM, I'm not sure whether there's something else preventing
>>> this strategy from working.
>>> 
>>> The reason I noticed this is because I have been developing a
>>> back-end for a target in which some instructions are implemented as
>>> pseudos which will be expanded into a pair of instructions. Each of
>>> the two operands of the pseudo can be a load but as the expansion
>>> accesses the right operand twice, I don't want to match a volatile
>>> load for this operand. For example, in:
>>> 
>>>  %0 = load i16* @y, align 2, !tbaa !0
>>>  %1 = load volatile i16* @x, align 2, !tbaa !0
>>>  %add = add i16 %0, %1 
>>> 
>>> the volatile load is sequenced after the non-volatile load which
>>> means that the non-volatile load can't match the left operand of the
>>> add because this would create a scheduling cycle. This means both
>>> loads are selected as load instructions, resulting in use of an extra
>>> register and an extra instruction. If I change the behaviour of
>>> SelectionDAGBuilder::visitLoad() as described then I get just two
>>> instructions. _______________________________________________ LLVM
>>> Developers mailing list LLVMdev at cs.uiuc.edu
>>> http://llvm.cs.uiuc.edu
>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>> 
>> 
>> 
>> -- 
>> Hal Finkel
>> Postdoctoral Appointee
>> Leadership Computing Facility
>> Argonne National Laboratory
> 
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20120814/d4a78ffc/attachment.html>