[LLVMdev] About clock and wait instruction
Vipin Gokhale
vipin.gokhale at comcast.net
Fri Dec 19 03:01:02 PST 2003
Chris Lattner wrote:
> On Fri, 19 Dec 2003, Yueqiang wrote:
>
>
>>High-level languages usually have time and sync instructions
>>to handle async and sync operations.
>>I want to know how LLVM takes account of these.
>
>
> I'm not sure exactly what 'time' and 'sync' operations you're talking
> about, or what languages support them. However, LLVM is designed to make
> it trivial to interface to libraries and routines written in other
> languages. This means that (even if LLVM should be extended to support
> these operations eventually) you can always write them in assembly or
> whatever, then call them from LLVM.
>
Perhaps "clock" refers to things like reading the CPU cycle counter on
most modern processors (e.g. asm("rpcc %v0", foo) or the __RPCC()
compiler builtin on Alpha); in the long term, a candidate for builtins,
I suspect.
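To make the "call it from asm" idea concrete, here is a minimal sketch of
a cycle-counter read in C. It assumes GCC/Clang-style inline asm on x86
(the rdtsc instruction) rather than rpcc on Alpha, and falls back to the
standard clock() elsewhere; read_ticks is a hypothetical helper name, not
an LLVM builtin.

```c
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical helper: returns a raw tick count from the CPU's cycle
 * counter where this sketch knows how to read it, else from clock(). */
static uint64_t read_ticks(void)
{
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    uint32_t lo, hi;
    /* rdtsc places the 64-bit time-stamp counter in EDX:EAX. */
    __asm__ __volatile__("rdtsc" : "=a"(lo), "=d"(hi));
    return ((uint64_t)hi << 32) | lo;
#else
    /* Portable fallback: processor time used, in CLOCKS_PER_SEC units. */
    return (uint64_t)clock();
#endif
}

/* Ticks elapsed between two reads; the units depend on the branch above. */
static uint64_t ticks_between(uint64_t start, uint64_t end)
{
    return end - start;
}
```

The __volatile__ qualifier asks the compiler not to delete or merge the
asm statements, which is the same "don't optimize this away" concern
raised later in this message.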
> Note however that LLVM is _not_ a high-level language, and in fact we do
> not explicitly support _many_ features of HLLs directly in LLVM. We use a
> strategy of representing the key components of the high-level ideas using
> low-level primitives that can be used for a variety of purposes. If you
> describe what the time/sync operations are, maybe I can sketch out a
> suggested mapping for you.
>
While on the subject of builtins/asm etc., most modern CPUs also have
instructions for memory barriers/fences (i.e. stall the CPU until all
in-flight memory loads and/or stores preceding the fence instruction
have finished; maybe that's what the "wait" instruction in the subject
line refers to?). These are typically implemented as compiler builtins
or asm in C. I do realize that anyone running existing code through
LLVM can easily work around the current asm/builtin implementation for
now by calling an assembly function. However, a perhaps not so obvious
implication/intent of a memory-fence-like builtin is that the programmer
also does not want the compiler to reorder load/store instructions
across the barrier. I do not see any mechanism in the LLVM framework to
express such a notion of barrier/fence (or a mechanism to indicate that
loads/stores within what might otherwise look like a "normal" basic
block must not be reordered). [ (a) Does LLVM understand the 'volatile'
attribute in C? (b) My apologies in advance if I end up (or already
have?) "hijacking" this thread into another unrelated topic... ]
Maybe an example (grossly simplified, but otherwise "real life") will
help:
*old = *a->next_link;
*x = 1; /* set a flag - indicates start of operation */
*a->next_link = *b->next_link;
asm("<store fence instruction>");
*x = 0; /* reset the flag - done */
Here, assume that (1) x, a, b and old are all (non-aliasing) addresses
that map to a shared memory segment, and/or the execution environment
for this code is multi-threaded, i.e. there's another thread of
execution (a watchdog) that the compiler may not be aware of, to which
these memory writes are "visible", and (2) the watchdog thread is
responsible for "recovering" the data structure manipulation should the
thread doing it fail for some reason while in this "critical section"
code (assume that the "critical section" in the example above is a bit
more elaborate than just one memory update of the linked list pointer).
It is important in this case that, despite what a simple dataflow
analysis might otherwise indicate, the compiler/optimizer must not zap
*x = 1 as a redundant store. Another item that falls in this general
category is code that uses setjmp/longjmp:
foo() {
    int x;
    x = 0;
    .....
    if (setjmp(...) == 0) {
        ....
        x = 1;
        ....
        /* assume somewhere deep down the call chain from here,
           there's a longjmp */
        .....
    } else {
        if (x == 1) {
            .....
        }
    }
}
In the example above, if the compiler doesn't understand the special
semantics of setjmp, the if (x == 1) block could get optimized
incorrectly. x is a local variable, and setjmp looks like just another
"ordinary" function call that does not take the address of x as a
parameter, so if control reaches the else block of the outer if
statement, data flow analysis can easily "prove" that the value of x has
to be 0 and the if block becomes dead code. I must admit I'm not fully
up to speed on LLVM yet, and perhaps setjmp does get special treatment
in LLVM (ISTM C++ try/catch blocks do get special treatment; not sure
about setjmp/longjmp)....
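For reference, standard C itself has an answer for this particular case:
a non-volatile automatic variable that is modified between the setjmp
call and the longjmp has an indeterminate value afterwards, so the
portable fix is to declare x volatile. A minimal, runnable sketch of the
example with that change; env and deep_in_call_chain are hypothetical
names filling in for the elided parts, not anything from real code:

```c
#include <assert.h>
#include <setjmp.h>

static jmp_buf env;   /* hypothetical: the buffer elided as setjmp(...) */

/* Stands in for "somewhere deep down the call chain". */
static void deep_in_call_chain(void)
{
    longjmp(env, 1);
}

/* Returns 1 if the else branch saw x == 1, 0 if it saw x == 0. */
static int foo(void)
{
    /* volatile forces x to be stored to and reloaded from memory, so its
     * value is well defined after longjmp returns control to setjmp. */
    volatile int x = 0;

    if (setjmp(env) == 0) {
        x = 1;
        deep_in_call_chain();
        return -1;            /* not reached */
    } else {
        if (x == 1)
            return 1;         /* must not be treated as dead code */
        return 0;
    }
}
```

This only covers what the C language promises at the source level; how an
optimizer working on a lower-level representation preserves that promise
is exactly the open question here.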
In "traditional" (one-C-file-at-a-time) compilers/optimizers, one can
work around this by taking the address of x and passing it as a
parameter to a null external function, which keeps the compiler from
doing unsafe optimizations even when setjmp/longjmp is not treated
specially. My concern with a whole-program optimizer infrastructure like
LLVM (or, for that matter, a post-link optimizer like Spike from
DEC/Compaq, which works off a fully linked binary and naked machine
instructions) is that it can easily (at least in theory) see through
this call-a-null-function trick... Yet one could argue that there are
plenty of legitimate optimization opportunities where memory references
can be reordered, squashed, or hoisted across basic blocks or even
function calls (IOW, turning off certain aggressive optimizations
altogether might be a sledgehammer approach). I'm getting this nagging
feeling that there may need to be a mechanism for placing special
annotations in the LLVM instruction stream to ensure safe
optimizations.... Someone please tell me my concerns are totally
unfounded, at least for LLVM :-)
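As an illustration of what such an annotation can look like at the
source level, here is a minimal sketch of the flag/relink example written
with C11 <stdatomic.h>, which postdates this message. The variables flag,
a_link and b_link are hypothetical stand-ins for *x, *a->next_link and
*b->next_link, and the first fence is an addition that the original
example did not have.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical stand-ins for the shared-memory objects in the example. */
static atomic_int flag;     /* plays the role of *x */
static atomic_int a_link;   /* plays the role of *a->next_link */
static atomic_int b_link;   /* plays the role of *b->next_link */

/* Performs the update and returns the previous link, like *old. */
static int relink(void)
{
    int old = atomic_load_explicit(&a_link, memory_order_relaxed);

    /* Set the flag: start of operation. */
    atomic_store_explicit(&flag, 1, memory_order_relaxed);
    /* Added in this sketch: keeps the flag store ordered before the
     * link update below. */
    atomic_thread_fence(memory_order_release);

    atomic_store_explicit(&a_link,
                          atomic_load_explicit(&b_link, memory_order_relaxed),
                          memory_order_relaxed);
    /* The "store fence" of the example: earlier loads and stores may not
     * be moved after the stores that follow it. */
    atomic_thread_fence(memory_order_release);

    /* Reset the flag: done. */
    atomic_store_explicit(&flag, 0, memory_order_relaxed);
    return old;
}
```

A watchdog thread would pair these with acquire loads or an acquire
fence on its side. If only compiler reordering matters (no CPU fence
wanted), atomic_signal_fence is the C11 call for that.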
- Vipin