Ping 3: Add a LOAD_SEQUENCE_POINT ISDOpcode

Wed Dec 4 02:11:09 PST 2013

Third ping for:

  http://llvm-reviews.chandlerc.com/D2171

Thanks,
Richard

One unusual feature of z/Architecture is that the result of a
previous load can be reused indefinitely for subsequent loads, even if a
cache-coherent store to that location is performed by another CPU. A
special serialising instruction must be used if you want to force a load
to be reattempted. To quote the architecture manual (where MVI is MOVE
IMMEDIATE and CLI is COMPARE LOGICAL IMMEDIATE):

  Following is an example showing the effects of serialization. Location
  A initially contains FF hex.

  CPU 1                  CPU 2
  MVI A,X'00'       G    CLI A,X'00'
  BCR 15,0               BNE G

  The BCR 15,0 instruction executed by CPU 1 is a serializing
  instruction that ensures that the store by CPU 1 at location A is
  completed. However, CPU 2 may loop indefinitely, or until the next
  interruption on CPU 2, because CPU 2 may already have fetched from
  location A for every execution of the CLI instruction. A serializing
  instruction must be in the CPU-2 loop to ensure that CPU 2 will again
  fetch from location A.

Since volatile loads are not supposed to be omitted in this way, we
should insert a serialising instruction before each such load. The same
goes for atomic loads.
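
For illustration, the kind of source-level loops affected look roughly
like the sketch below. The helper names (poll_atomic, poll_volatile) and
the max_polls bound are hypothetical and not part of the patch; the bound
is only there so the example terminates when run on its own. Each
iteration is expected to perform a separate load, and those are the loads
that would need a serialising instruction in front of them on SystemZ.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical helper, for illustration only; not part of the patch.
// Polls an atomic flag up to max_polls times.  Each iteration performs a
// separate atomic load, which is the kind of load that should be
// reattempted rather than satisfied from an earlier fetch.
bool poll_atomic(const std::atomic<int> &flag, int max_polls) {
  assert(max_polls >= 0 && "poll count must be non-negative");
  for (int i = 0; i < max_polls; ++i)
    if (flag.load(std::memory_order_acquire) != 0)
      return true;
  return false;
}

// Same shape with a volatile location, matching "while (*i == 0);".
bool poll_volatile(const volatile int *flag, int max_polls) {
  assert(max_polls >= 0 && "poll count must be non-negative");
  for (int i = 0; i < max_polls; ++i)
    if (*flag != 0)
      return true;
  return false;
}
```

Nothing in this C++ source requests the serialisation explicitly; the
point of the patch is that the backend would have to add it.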

This patch adds a new ISDOpcode for this situation. It's emitted in a
similar way to ATOMIC_FENCE, but in different circumstances and for
different reasons:

  /// Marks a point before a volatile or atomic load, to ensure that
  /// subsequent loads are attempted.  This exists for architectures
  /// like SystemZ that allow results from previous loads to be reused
  /// indefinitely.  For example, the architecture may treat a loop:
  ///
  ///   while (*i == 0);
  ///
  /// as:
  ///
  ///   while (*i == 0) spin-until-interrupt;
  ///
  /// omitting all but the first load in each time slice (even if a
  /// cache-coherent store is performed by another CPU).  Inserting
  /// this operation forces each iteration of the loop to attempt a load.
  ///
  /// Note that this is not an ordering fence per se.  It simply prevents
  /// the processor from collapsing a sequence of N loads into 1 load at
  /// run time.
  LOAD_SEQUENCE_POINT,

I'm certainly open to better names than LOAD_SEQUENCE_POINT, though.
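
As a rough way to picture the intended placement, here is a toy model.
It is only a sketch: the Op enum and insertLoadSequencePoints are
hypothetical names invented for this example and are not LLVM's
SelectionDAG classes or the code in the patch. It assumes nothing more
than what is described above, namely that the marker goes immediately
before each volatile or atomic load and nowhere else.

```cpp
#include <vector>

// Toy model, for illustration only.  These types are hypothetical and
// are not LLVM's SelectionDAG classes.
enum class Op { Load, VolatileLoad, AtomicLoad, Store, LoadSequencePoint };

// Returns a copy of ops with a LoadSequencePoint placed immediately
// before every volatile or atomic load.  Ordinary loads and stores are
// copied through unchanged.
std::vector<Op> insertLoadSequencePoints(const std::vector<Op> &ops) {
  std::vector<Op> out;
  out.reserve(ops.size());
  for (Op op : ops) {
    if (op == Op::VolatileLoad || op == Op::AtomicLoad)
      out.push_back(Op::LoadSequencePoint);
    out.push_back(op);
  }
  return out;
}
```

In this model a sequence such as Load, VolatileLoad, Store, AtomicLoad
gains two markers, one in front of each of the volatile and atomic
loads, and the ordinary load and the store are left alone.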

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D2171.diff
Type: text/x-patch
Size: 26024 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20131204/0ce8180a/attachment.bin>