Ping 3: Add a LOAD_SEQUENCE_POINT ISDOpcode

Evan Cheng evan.cheng at apple.com
Wed Dec 4 10:02:54 PST 2013


I'm not very comfortable with adding a generic opcode. Can't you use a target-specific intrinsic?

Evan

On Dec 4, 2013, at 2:11 AM, Richard Sandiford <rsandifo at linux.vnet.ibm.com> wrote:

> Third ping for:
> 
>  http://llvm-reviews.chandlerc.com/D2171
> 
> Thanks,
> Richard
> 
> 
> One unusual feature of z/Architecture is that the result of a
> previous load can be reused indefinitely for subsequent loads, even if a
> cache-coherent store to that location is performed by another CPU. A
> special serialising instruction must be used if you want to force a load
> to be reattempted. To quote the architecture manual (where MVI is MOVE
> IMMEDIATE and CLI is COMPARE LOGICAL IMMEDIATE):
> 
>  Following is an example showing the effects of serialization. Location
>  A initially contains FF hex.
> 
>  CPU 1                  CPU 2
>  MVI A,X'00'       G    CLI A,X'00'
>  BCR 15,0               BNE G
> 
>  The BCR 15,0 instruction executed by CPU 1 is a serializing
>  instruction that ensures that the store by CPU 1 at location A is
>  completed. However, CPU 2 may loop indefinitely, or until the next
>  interruption on CPU 2, because CPU 2 may already have fetched from
>  location A for every execution of the CLI instruction. A serializing
>  instruction must be in the CPU-2 loop to ensure that CPU 2 will again
>  fetch from location A.
> 
> Since volatile loads are not supposed to be omitted in this way, we
> should insert a serialising instruction before each such load. The same
> goes for atomic loads.
> 
> This patch adds a new ISDOpcode for this situation. It's emitted in a
> similar way to ATOMIC_FENCE, but in different circumstances and for
> different reasons:
> 
>  /// Marks a point before a volatile or atomic load, to ensure that
>  /// subsequent loads are attempted.  This exists for architectures
>  /// like SystemZ that allow results from previous loads to be reused
>  /// indefinitely.  For example, the architecture may treat a loop:
>  ///
>  ///   while (*i == 0);
>  ///
>  /// as:
>  ///
>  ///   while (*i == 0) spin-until-interrupt;
>  ///
>  /// omitting all but the first load in each time slice (even if a
>  /// cache-coherent store is performed by another CPU).  Inserting
>  /// this operation forces each iteration of the loop to attempt a load.
>  ///
>  /// Note that this is not an ordering fence per se. It simply prevents
>  /// the processor from collapsing a sequence of N loads into 1 load at
>  /// run time.
>  LOAD_SEQUENCE_POINT,
> 
> I'm certainly open to better names than LOAD_SEQUENCE_POINT though.
> 
> <D2171.diff>
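
For readers following the thread, the spin loop in the quoted doc comment can be sketched in portable C++ roughly as below. This is only an illustration of the source-level pattern under discussion, not code from D2171: the names wait_for_nonzero and run_example are hypothetical, and the sketch says nothing about what SystemZ code generation actually emits. It assumes the loop reads through an atomic load, which is one of the two cases (volatile and atomic loads) the patch description says should be preceded by a serialising instruction.

```cpp
#include <atomic>
#include <thread>

// Hypothetical helper (not part of the patch): spin until `flag` becomes
// nonzero. Every iteration performs an atomic load, so the source-level
// semantics require each load to be attempted rather than reusing the
// result of an earlier one.
int wait_for_nonzero(const std::atomic<int> &flag) {
  int value;
  while ((value = flag.load(std::memory_order_relaxed)) == 0) {
    // Empty body, mirroring `while (*i == 0);` from the quoted comment.
  }
  return value;
}

// Hypothetical driver: one thread plays the role of "CPU 1" and stores a
// value; the caller plays the role of "CPU 2" and spins until it sees it.
int run_example() {
  std::atomic<int> flag{0};
  std::thread writer([&flag] { flag.store(42, std::memory_order_relaxed); });
  int seen = wait_for_nonzero(flag);
  writer.join();
  return seen;
}
```

Relaxed ordering is used on purpose here: as the quoted comment notes, the concern is not ordering but making sure each iteration really attempts the load.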



