[PATCH] Add a LOAD_SEQUENCE_POINT ISDOpcode

Richard Sandiford rsandifo at linux.vnet.ibm.com
Thu Nov 14 01:54:57 PST 2013


Tom Stellard <tom at stellard.net> writes:
> On Wed, Nov 13, 2013 at 09:19:17AM -0800, Richard Sandiford wrote:
>> One unusual feature of the z architecture is that the result of a
>> previous load can be reused indefinitely for subsequent loads, even if a
>> cache-coherent store to that location is performed by another CPU.  A
>> special serialising instruction must be used if you want to force a load
>> to be reattempted.  To quote the architecture manual (where MVI is MOVE
>> IMMEDIATE and CLI is COMPARE LOGICAL IMMEDIATE):
>
> We have a very similar 'feature' for the VLIW targets in the R600 backend.
> The result of a load from LDS (Local memory in OpenCL) is stored in the
> 'output queue'.  When an ALU instruction wants to use the result of a
> load, it has two options: 1. It can read the value from the top of the
> queue and then pop it off.  2. It can read the value and leave it on
> the queue.  If instructions use option 2, then the result of the load
> can be used indefinitely.

Ah, sounds like that might make the assembly pretty difficult to read :-)

The added complication for z is that it isn't defined whether and when
the reuse occurs, a bit like memory ordering isn't defined on weakly-ordered
machines (which z isn't).

>>     /// Marks a point before a volatile or atomic load, to ensure that
>>     /// subsequent loads are attempted.  This exists for architectures
>>     /// like SystemZ that allow results from previous loads to be reused
>>     /// indefinitely.  For example, the architecture may treat a loop:
>>     ///
>>     ///   while (*i == 0);
>>     ///
>>     /// as:
>>     ///
>>     ///   while (*i == 0) spin-until-interrupt;
>>     ///
>>     /// omitting all but the first load in each time slice (even if a
>>     /// cache-coherent store is performed by another CPU).  Inserting
>>     /// this operation forces each iteration of the loop to attempt a load.
>>     ///
>>     /// Note that this is not an ordering fence per se.  It simply ensures
>>     /// that a sequence of N loads is not collapsed into 1 load.
>
> What would cause N loads to be collapsed into 1 load?  Is this something
> the Legalizer might do?

Sorry, bad use of the passive voice there.  I meant that the processor
can collapse N loads to 1 load at run time unless we use these sequence
points to stop it.  I've just changed the comment to:

    /// Marks a point before a volatile or atomic load, to ensure that
    /// subsequent loads are attempted.  This exists for architectures
    /// like SystemZ that allow results from previous loads to be reused
    /// indefinitely.  For example, the architecture may treat a loop:
    ///
    ///   while (*i == 0);
    ///
    /// as:
    ///
    ///   while (*i == 0) spin-until-interrupt;
    ///
    /// omitting all but the first load in each time slice (even if a
    /// cache-coherent store is performed by another CPU).  Inserting
    /// this operation forces each iteration of the loop to attempt a load.
    ///
    /// Note that this is not an ordering fence per se.  It simply prevents
    /// the processor from collapsing a sequence of N loads into 1 load at
    /// run time.
    LOAD_SEQUENCE_POINT,
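
For illustration only, the kind of source-level loop the comment describes
might look like the C++ sketch below.  The names wait_for_flag and
run_example are hypothetical and not part of the patch, and the sketch says
nothing about how the node is selected or lowered; it only shows a spin loop
whose atomic loads must each be attempted for the loop to notice a store made
by another thread.

```cpp
#include <atomic>
#include <thread>

// Hypothetical helper (not from the patch): spins until another thread
// stores a nonzero value into `flag`, then returns the value it saw.
int wait_for_flag(std::atomic<int> &flag) {
  int seen;
  // At the source level every iteration performs a separate atomic load.
  // This is the "while (*i == 0);" pattern from the comment above.
  while ((seen = flag.load(std::memory_order_relaxed)) == 0) {
  }
  return seen;
}

// Hypothetical driver: starts a writer thread and waits for its store.
int run_example() {
  std::atomic<int> flag{0};
  std::thread writer([&flag] { flag.store(42, std::memory_order_relaxed); });
  int seen = wait_for_flag(flag);
  writer.join();
  return seen;
}
```

Relaxed ordering is used deliberately here, since the point is only that each
load is attempted, not that the loads act as an ordering fence.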

Thanks,
Richard



