[LLVMdev] About clock and wait instruction

Vipin Gokhale vipin.gokhale at comcast.net
Fri Dec 19 03:01:02 PST 2003


Chris Lattner wrote:

> On Fri, 19 Dec 2003, Yueqiang wrote:
> 
> 
>>High-level languages usually have time and sync instructions
>>to handle asynchronous and synchronous operations.
>>I want to know how LLVM accounts for these.
> 
> 
> I'm not sure exactly what 'time' and 'sync' operations you're talking
> about, or what languages support them.   However, LLVM is designed to make
> it trivial to interface to libraries and routines written in other
> languages.  This means that (even if LLVM should be extended to support
> these operations eventually) you can always write them in assembly or
> whatever, then call them from LLVM.
> 

Perhaps "clock" refers to things like reading the CPU cycle counter found 
on most modern processors (e.g., asm("rpcc %v0", foo) or the __RPCC() 
compiler builtin on Alpha); in the long term, I suspect these are 
candidates for builtins.

> Note however that LLVM is _not_ a high-level language, and in fact we do
> not explicitly support _many_ features of HLLs directly in LLVM.  We use a
> strategy of representing the key components of the high-level ideas using
> low-level primitives that can be used for a variety of purposes.  If you
> describe what the time/sync operations are, maybe I can sketch out a
> suggested mapping for you.
> 

While on the subject of builtins/asm etc., most modern CPUs also have 
instructions to do memory barriers/fences, i.e. stall the CPU until all 
in-flight memory loads and/or stores preceding the fence instruction 
have finished (maybe that's what the "wait" instruction in the subject 
line refers to?). These are typically implemented as compiler builtins 
or asm in C. I do realize that anyone working on running existing code 
through LLVM can easily work around the current asm/builtin 
implementation for now by calling an assembly function. However, a 
perhaps not so obvious implication/intent of a memory fence builtin is 
that the programmer also does not want the compiler to reorder 
load/store instructions across the barrier. I do not see any mechanism 
in the LLVM framework to express such a notion of barrier/fence, or a 
mechanism to indicate that loads/stores within what might otherwise 
look like a "normal" basic block must not be reordered. [ (a) Does 
LLVM understand the 'volatile' attribute in C? (b) My apologies in 
advance if I end up (or already have?) "hijacking" this thread into 
another unrelated topic... ]
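For what it's worth, here is roughly how such a fence is spelled at the C level today. This is a sketch assuming GCC-style support on the host compiler; __sync_synchronize() is a GCC builtin and an assumption on my part, not anything LLVM itself provides:

```c
/* Sketch, assuming GCC: publish a payload, then set a ready flag that
   another thread polls.  volatile keeps the compiler from deleting or
   reordering the flag store; __sync_synchronize() emits a full
   hardware memory fence so the payload is visible before the flag. */
static int data;
static volatile int ready;

void publish(int value)
{
    data = value;           /* the payload store ...          */
    __sync_synchronize();   /* ... must be visible before ... */
    ready = 1;              /* ... the flag that announces it */
}
```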

Maybe an example (grossly simplified, but otherwise "real life") will 
help:

     *old = a->next_link;
     *x = 1;       /* set a flag - indicates start of operation */
     a->next_link = b->next_link;

     asm("<store fence instruction>");

     *x = 0;       /* reset the flag - done */

Here, assume that (1) x, a, b and old are all (non-aliasing) addresses 
that map to a shared memory segment and/or the execution environment 
for this code is multi-threaded - i.e. there's another thread of 
execution (a watchdog) that the compiler may not be aware of, to which 
these memory writes are "visible", and (2) the watchdog thread is 
responsible for "recovering" the data structure manipulation should the 
thread doing it fail for some reason while in this "critical section" 
(assume that the "critical section" in the example above is a bit more 
"elaborate" than just one memory update of the linked-list pointer). It 
is important in this case that, despite what a simple dataflow analysis 
might otherwise indicate, the compiler/optimizer must not zap *x = 1 as 
a redundant store. Another item that falls in this general category is 
code that uses setjmp/longjmp:

foo() {
    int x;

    x = 0;

    .....

    if (setjmp(...) == 0) {

          ....

       x = 1;

          ....

       /* assume somewhere deep down the call chain from here,
          there's a longjmp  */

          .....

     } else {
       if (x == 1) {
           .....
       }
     }
}

In the example above, if the compiler doesn't understand the special 
semantics of setjmp, there's a potential for the if (x == 1) block to 
get optimized incorrectly: x being a local variable, and setjmp being 
just another "ordinary" function call that does not take the address of 
x as a parameter, if control reaches the else block of the outer if 
statement, dataflow analysis can easily "prove" that the value of x has 
to be 0, and the if block becomes dead code... I must admit I'm not 
fully up to speed on LLVM yet, and perhaps setjmp does get special 
treatment in LLVM (ISTM C++ try/catch blocks do get special treatment; 
not sure about setjmp/longjmp)....

In "traditional" (one-C-file-at-a-time) compiler/optimizers, one can 
work around this by taking the address of x and passing it as a 
parameter to a null external function, to keep the compiler from doing 
unsafe optimizations even when setjmp/longjmp is not treated specially. 
My concern when one is dealing with a whole-program optimizer 
infrastructure like LLVM (or for that matter a post-link optimizer like 
Spike from DEC/Compaq, which works off a fully linked binary and naked 
machine instructions) has been that it can easily (at least in theory) 
see through this call-a-null-function trick... Yet one could argue that 
there are plenty of legitimate optimization opportunities where memory 
references can be reordered, squashed, or hoisted across basic blocks 
or even function calls (IOW, turning off certain aggressive 
optimizations altogether might be a sledgehammer approach). I'm getting 
this nagging feeling that there may need to be a mechanism where 
special annotations are placed in the LLVM instruction stream to ensure 
safe optimizations.... Someone please tell me my concerns are totally 
unfounded, at least for LLVM :-)
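Incidentally, the call-a-null-function trick has a cheaper cousin under GCC-style extended asm: an empty asm statement with a "memory" clobber emits no instructions but tells the optimizer that any memory may be read or written at that point. This is one common idiom, not anything LLVM guarantees; compiler_barrier is a name I made up for the sketch:

```c
/* Compiler-level barrier: emits no instructions, but a GCC-style
   optimizer must assume any memory may be read or written here, so
   loads/stores cannot be moved across it or proven dead. */
#define compiler_barrier() __asm__ __volatile__("" ::: "memory")

int flag_demo(int *p)
{
    *p = 1;                 /* cannot be zapped as a dead store ... */
    compiler_barrier();     /* ... because this may observe it      */
    *p = 0;
    return *p;
}
```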

- Vipin




