Ping 3: Add a LOAD_SEQUENCE_POINT ISDOpcode
Richard Sandiford
rsandifo at linux.vnet.ibm.com
Fri Dec 6 01:24:42 PST 2013
Evan Cheng <evan.cheng at apple.com> writes:
> On Dec 5, 2013, at 3:07 AM, Richard Sandiford
> <rsandifo at linux.vnet.ibm.com> wrote:
>
>> Hi Evan,
>>
>> Thanks for the review.
>>
>> Evan Cheng <evan.cheng at apple.com> writes:
>>> I'm not very comfortable with adding a generic opcode. Can't you use a
>>> target specific intrinsic?
>>
>> If the opcode's out, is it still OK to have a TargetLowering hook that
>> gets called before atomic and volatile loads, as in the patch below?
>> This replaces the getInsertLoadSequencePoints() function from the first
>> patch but should be a bit more general (and hopefully a bit more intuitive,
>> since it's named after the situations in which it's used).
>
> Ideally I'd like to see the code for dealing with these loads in the
> target specific isel lowering code. Until we have other targets needing
> similar features, it's a bit difficult to understand whether an opcode
> and / or target hooks are needed. Do you think that's possible?
I think doing it in LowerOperation() means that either (a) we just
insert a serialising instruction before the load or (b) we also replace
the load with a different representation, to indicate that it's properly
protected by a serialising instruction.
The problem with (a) is that the meaning of a volatile load node then
depends on whether we're looking at it before or after lowering. It seems
bad to have that kind of phase-dependent interpretation.
The problem with (b) is that, once protected, the load really is just
a normal load. The architecture has many register-memory instructions
and I think it would be quite difficult to go through and make sure
that everything that handles memory operands is also able to handle the
lowered form.
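For concreteness, here's a rough sketch of what (a) might look like in a
target's LowerOperation(), using a made-up MyTargetISD::SERIALIZE chain
node (the name is illustrative, not something from the patch):

  SDValue MyTargetLowering::LowerOperation(SDValue Op, SelectionDAG &DAG) const {
    if (Op.getOpcode() == ISD::LOAD) {
      LoadSDNode *Load = cast<LoadSDNode>(Op.getNode());
      if (Load->isVolatile()) {
        SDLoc DL(Op);
        // Serialise the chain before the load...
        SDValue Chain = DAG.getNode(MyTargetISD::SERIALIZE, DL, MVT::Other,
                                    Load->getChain());
        // ...and rebuild the load on the new chain.  The result is still a
        // plain volatile LOAD, so nothing marks it as already protected.
        return DAG.getLoad(Load->getValueType(0), DL, Chain,
                           Load->getBasePtr(), Load->getMemOperand());
      }
    }
    return SDValue();
  }

The lowered node is indistinguishable from an unlowered volatile load,
which is the phase-dependent interpretation problem above.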
> If not, then a target hook might be acceptable. But I haven't had the time
> to think this through or review your patch fully.
Thanks. FWIW, although the patch is quite long, only the first four hunks
are target-independent:
diff --git a/include/llvm/Target/TargetLowering.h b/include/llvm/Target/TargetLowering.h
index 6f643c3..7bc3a52 100644
--- a/include/llvm/Target/TargetLowering.h
+++ b/include/llvm/Target/TargetLowering.h
@@ -2082,6 +2082,18 @@ public:
return NULL;
}
+ /// This callback is used to prepare for a volatile or atomic load.
+ /// It takes a chain node as input and returns the chain for the load itself.
+ ///
+ /// Having a callback like this is necessary for targets like SystemZ,
+ /// which allows a CPU to reuse the result of a previous load indefinitely,
+ /// even if a cache-coherent store is performed by another CPU. The default
+ /// implementation does nothing.
+ virtual SDValue prepareVolatileOrAtomicLoad(SDValue Chain, SDLoc DL,
+ SelectionDAG &DAG) const {
+ return Chain;
+ }
+
/// This callback is invoked by the type legalizer to legalize nodes with an
/// illegal operand type but legal result types. It replaces the
/// LowerOperation callback in the type Legalizer. The reason we can not do
diff --git a/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp b/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
index b1d247e..071781a 100644
--- a/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
+++ b/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
@@ -3400,7 +3400,7 @@ void SelectionDAGBuilder::visitLoad(const LoadInst &I) {
SDValue Root;
bool ConstantMemory = false;
- if (I.isVolatile() || NumValues > MaxParallelChains)
+ if (isVolatile || NumValues > MaxParallelChains)
// Serialize volatile loads with other side effects.
Root = getRoot();
else if (AA->pointsToConstantMemory(
@@ -3413,6 +3413,10 @@ void SelectionDAGBuilder::visitLoad(const LoadInst &I) {
Root = DAG.getRoot();
}
+ const TargetLowering *TLI = TM.getTargetLowering();
+ if (isVolatile)
+ Root = TLI->prepareVolatileOrAtomicLoad(Root, getCurSDLoc(), DAG);
+
SmallVector<SDValue, 4> Values(NumValues);
SmallVector<SDValue, 4> Chains(std::min(unsigned(MaxParallelChains),
NumValues));
@@ -3637,6 +3641,7 @@ void SelectionDAGBuilder::visitAtomicLoad(const LoadInst &I) {
if (I.getAlignment() < VT.getSizeInBits() / 8)
report_fatal_error("Cannot generate unaligned atomic load");
+ InChain = TLI->prepareVolatileOrAtomicLoad(InChain, dl, DAG);
SDValue L =
DAG.getAtomic(ISD::ATOMIC_LOAD, dl, VT, VT, InChain,
getValue(I.getPointerOperand()),
The rest are all local to the target.
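For reference, the target-specific side of the hook can be a one-liner.
This is just a sketch using the same made-up MyTargetISD::SERIALIZE node,
not the actual SystemZ changes:

  SDValue
  MyTargetLowering::prepareVolatileOrAtomicLoad(SDValue Chain, SDLoc DL,
                                                SelectionDAG &DAG) const {
    // Emit a serialising node on the chain so that the load that follows
    // cannot reuse a stale value from an earlier load.
    return DAG.getNode(MyTargetISD::SERIALIZE, DL, MVT::Other, Chain);
  }

Because the hook only touches the chain, the load node itself stays a
normal (volatile or atomic) load, which avoids both problems (a) and (b)
described above.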
Thanks,
Richard