[LLVMdev] [RFC] Add compiler scheduling barriers
Philip Reames
listmail at philipreames.com
Mon Jun 23 17:55:34 PDT 2014
On 06/19/2014 09:35 AM, Yi Kong wrote:
> Hi all,
>
> I'm currently working on implementing ACLE extensions for ARM. There
> are some memory barrier intrinsics, i.e. __dsb and __isb, that require
> the compiler not to reorder instructions around their corresponding
> built-in intrinsics (__builtin_arm_dsb, __builtin_arm_isb), including
> non-memory-access instructions.[1] This is currently not possible.
>
> It is sometimes useful to prevent the compiler from reordering
> memory-access instructions as well. The only way to do that in both
> GCC and LLVM is using an inline assembly hack:
> asm volatile("" ::: "memory")
>
> I propose adding two compiler scheduling barrier intrinsics to LLVM:
> __schedule_barrier_memory and __schedule_barrier_full. The former only
> prevents memory-access instructions from being reordered around the
> instruction and the latter stops all reordering. So __isb, for example,
> can be implemented something like:
> inline void __isb() {
>     __schedule_barrier_full();
>     __builtin_arm_isb();
>     __schedule_barrier_full();
> }
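For anyone following along, the inline assembly hack quoted above is usually wrapped in a macro. Here is a minimal sketch, assuming GNU C extensions (GCC or Clang); the names compiler_barrier, shared_value and store_then_load are hypothetical and only there for illustration:

```c
/* Compiler-only barrier: the asm body is empty, so no instruction is
   emitted, but the "memory" clobber tells the compiler that memory may
   have been read or written at this point. */
#define compiler_barrier() __asm__ __volatile__("" ::: "memory")

/* Hypothetical global used only to illustrate the effect. */
int shared_value;

int store_then_load(int value) {
    shared_value = value;
    compiler_barrier();   /* the store above should not sink below here */
    return shared_value;  /* the compiler is expected to reload this */
}
```

Note that this only constrains memory accesses. It says nothing about non-memory-access instructions, which is the gap the quoted proposal is concerned with.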
Given your examples are in C, I want to ask a clarifying question.
Are you proposing adding such intrinsics to the LLVM IR? Or to some
runtime library? If the latter, *specifically* which one? Or at the
MachineInst layer?

I'm going to run under the assumption you're using C pseudo code for
IR. If this is not the case, the rest of this will be off base.
I'm not familiar with the exact semantics of an "isb" barrier, but I
think you should look at the existing fence IR instructions. These
restrict memory reorderings in the IR. Depending on the platform, they
may imply hardware barriers, but they always imply compiler barriers.

If all you want is a compiler barrier with the existing fence semantics
w.r.t. reordering, we could consider extending fence with a "compiler
only" (bikeshed needed!) attribute.

If you're describing a new memory ordering for existing fences, that
would seem like a reasonable extension.
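
To make the distinction between a hardware fence and a compiler-only fence concrete, here is a rough sketch at the C level, since that is what the examples in this thread use. It assumes a C11 compiler with <stdatomic.h>. As far as I understand it, atomic_thread_fence is lowered to an IR fence that may also emit a hardware barrier, while atomic_signal_fence only restricts compiler reordering; how the latter is represented in the IR is worth checking against the LangRef. The helper names (publish_thread_fence, publish_signal_fence, consume) are hypothetical:

```c
#include <stdatomic.h>

static int payload;       /* plain data, written before the flag */
static atomic_int ready;  /* flag set after the payload is written */

/* Fence visible to other threads: restricts compiler reordering and,
   depending on the target, may emit a hardware barrier. */
static void publish_thread_fence(int value) {
    payload = value;
    atomic_thread_fence(memory_order_release);
    atomic_store_explicit(&ready, 1, memory_order_relaxed);
}

/* Compiler-only fence: restricts compiler reordering of the accesses
   around it but emits no hardware barrier instruction. */
static void publish_signal_fence(int value) {
    payload = value;
    atomic_signal_fence(memory_order_release);
    atomic_store_explicit(&ready, 1, memory_order_relaxed);
}

/* Returns the payload once the flag is set, or -1 if it is not set. */
static int consume(void) {
    if (atomic_load_explicit(&ready, memory_order_relaxed)) {
        atomic_thread_fence(memory_order_acquire);
        return payload;
    }
    return -1;
}
```

Like the fence instruction itself, both of these only order memory accesses, so neither would cover the non-memory-access case described above.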
I'm not familiar with how we currently handle intrinsics for
architecture-specific memory barriers. Can anyone else comment on
that? Is there a way to tag a particular intrinsic function as *also*
being a full fence?
>
> To implement these intrinsics, I think the best method is to add
> target-independent pseudo-instructions with appropriate
> properties (hasSideEffects for the memory barrier and isTerminator for
> the full barrier) and a pseudo-instruction elimination pass after the
> scheduling pass.
Why would your barrier need to be a basic block terminator? That
doesn't parse for me. Could you explain?
>
> What do people think of this idea?
I'm honestly unclear on what your problem is and what you're trying to
propose. It may take a few rounds of conversation to clarify.
Philip