Ping 3: Add a LOAD_SEQUENCE_POINT ISDOpcode
Richard Sandiford
rsandifo at linux.vnet.ibm.com
Wed Dec 4 02:11:09 PST 2013
Third ping for:
http://llvm-reviews.chandlerc.com/D2171
Thanks,
Richard
One unusual feature of z/Architecture is that the result of a
previous load can be reused indefinitely for subsequent loads, even if a
cache-coherent store to that location is performed by another CPU. A
special serialising instruction must be used if you want to force a load
to be reattempted. To quote the architecture manual (where MVI is MOVE
IMMEDIATE and CLI is COMPARE LOGICAL IMMEDIATE):
Following is an example showing the effects of serialization. Location
A initially contains FF hex.
    CPU 1               CPU 2
    MVI A,X'00'      G  CLI A,X'00'
    BCR 15,0            BNE G
The BCR 15,0 instruction executed by CPU 1 is a serializing
instruction that ensures that the store by CPU 1 at location A is
completed. However, CPU 2 may loop indefinitely, or until the next
interruption on CPU 2, because CPU 2 may already have fetched from
location A for every execution of the CLI instruction. A serializing
instruction must be in the CPU-2 loop to ensure that CPU 2 will again
fetch from location A.
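
To make the quoted behaviour concrete, here is a toy single-threaded model of it. This is only a sketch: ToyCpu, load(), serialize() and cpu2_sees_store() are names made up for illustration, and the model simply assumes that a CPU keeps reusing an earlier fetch until it executes a serialising instruction. It is not a description of how real hardware is implemented.

```cpp
#include <cassert>
#include <cstdint>

// Toy model only (hypothetical names): a "CPU" that is allowed to keep
// reusing the result of an earlier fetch from location A.
struct ToyCpu {
  const std::uint8_t *memory;  // the shared location A
  bool has_reused;             // true once a fetch result is being reused
  std::uint8_t reused;         // the value obtained by that earlier fetch

  // Models a load that may be satisfied from an earlier fetch.
  std::uint8_t load() {
    if (!has_reused) {
      reused = *memory;
      has_reused = true;
    }
    return reused;
  }

  // Models a serialising instruction such as BCR 15,0: the next load
  // has to fetch from memory again.
  void serialize() { has_reused = false; }
};

// Runs the CPU 2 loop (CLI / BNE) for at most max_iterations iterations.
// Returns true if CPU 2 ever observes the X'00' stored by CPU 1.
inline bool cpu2_sees_store(bool serialize_each_iteration, int max_iterations) {
  assert(max_iterations > 0);
  std::uint8_t a = 0xFF;  // location A initially contains FF hex
  ToyCpu cpu2{&a, false, 0};
  for (int i = 0; i < max_iterations; ++i) {
    if (i == 1)
      a = 0x00;  // CPU 1's store completes after CPU 2's first fetch
    if (serialize_each_iteration)
      cpu2.serialize();
    if (cpu2.load() == 0x00)
      return true;
  }
  return false;
}
```

In this model, cpu2_sees_store(false, n) never observes the store for any n, while cpu2_sees_store(true, n) observes it on the second iteration.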
Since volatile loads are not supposed to be omitted in this way, we
should insert a serialising instruction before each such load. The same
goes for atomic loads.
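
As a source-level illustration, these are the two kinds of load in question. The helper names poll_until_zero and poll_atomic_until_zero are hypothetical, and the loops are bounded only so that the sketch terminates without a second thread. The comments mark where, under this proposal, a serialising instruction would be expected before the load on a target like SystemZ.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical helper: polls a volatile location up to max_polls times.
// Returns true as soon as a load observes zero.
inline bool poll_until_zero(const volatile std::uint8_t *location,
                            int max_polls) {
  assert(max_polls >= 0);
  for (int i = 0; i < max_polls; ++i) {
    // Volatile load: one load per iteration must be emitted. The proposal
    // is that a serialising instruction precedes each such load.
    if (*location == 0x00)
      return true;
  }
  return false;
}

// Hypothetical helper: the same loop using an atomic load.
inline bool poll_atomic_until_zero(const std::atomic<std::uint8_t> &location,
                                   int max_polls) {
  assert(max_polls >= 0);
  for (int i = 0; i < max_polls; ++i) {
    // Atomic load: treated the same way as the volatile load above.
    if (location.load(std::memory_order_relaxed) == 0x00)
      return true;
  }
  return false;
}
```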
This patch adds a new ISD opcode for this situation. It's emitted in a
similar way to ATOMIC_FENCE, but in different circumstances and for
different reasons:
/// Marks a point before a volatile or atomic load, to ensure that
/// subsequent loads are attempted. This exists for architectures
/// like SystemZ that allow results from previous loads to be reused
/// indefinitely. For example, the architecture may treat a loop:
///
///   while (*i == 0);
///
/// as:
///
///   while (*i == 0) spin-until-interrupt;
///
/// omitting all but the first load in each time slice (even if a
/// cache-coherent store is performed by another CPU). Inserting
/// this operation forces each iteration of the loop to attempt a load.
///
/// Note that this is not an ordering fence per se. It simply prevents
/// the processor from collapsing a sequence of N loads into 1 load at
/// run time.
LOAD_SEQUENCE_POINT,
I'm certainly open to better names than LOAD_SEQUENCE_POINT though.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: D2171.diff
Type: text/x-patch
Size: 26024 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20131204/0ce8180a/attachment.bin>