Ping 3: Add a LOAD_SEQUENCE_POINT ISDOpcode
Richard Sandiford
rsandifo at linux.vnet.ibm.com
Fri Dec 6 01:24:42 PST 2013
Evan Cheng <evan.cheng at apple.com> writes:
> On Dec 5, 2013, at 3:07 AM, Richard Sandiford
> <rsandifo at linux.vnet.ibm.com> wrote:
>
>> Hi Evan,
>>
>> Thanks for the review.
>>
>> Evan Cheng <evan.cheng at apple.com> writes:
>>> I'm not very comfortable with adding a generic opcode. Can't you use a
>>> target specific intrinsic?
>>
>> If the opcode's out, is it still OK to have a TargetLowering hook that
>> gets called before atomic and volatile loads, as in the patch below?
>> This replaces the getInsertLoadSequencePoints() function from the first
>> patch but should be a bit more general (and hopefully a bit more intuitive,
>> since it's named after the situations in which it's used).
>
> Ideally I'd like to see the code for dealing with these loads in the
> target specific isel lowering code. Until we have other targets needing
> similar features, it's a bit difficult to understand whether an opcode
> and / or target hooks are needed. Do you think that's possible?
I think doing it in LowerOperation() means that either (a) we just
insert a serialising instruction before the load or (b) we also replace
the load with a different representation, to indicate that it's properly
protected by a serialising instruction.
The problem with (a) is that the meaning of a volatile load node then
depends on whether we're looking at it before or after lowering. It seems
bad to have that kind of phase-dependent interpretation.
The problem with (b) is that, once protected, the load really is just
a normal load. The architecture has many register-memory instructions
and I think it would be quite difficult to go through and make sure
that everything that handles memory operands is also able to handle the
lowered form.
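For concreteness, here's a rough sketch of what (a) might look like in a
target's LowerOperation(), using a made-up MyTargetISD::SERIALIZE chain
node (the name is illustrative, not something from the patch):

  SDValue MyTargetLowering::LowerOperation(SDValue Op, SelectionDAG &DAG) const {
    if (Op.getOpcode() == ISD::LOAD) {
      LoadSDNode *Load = cast<LoadSDNode>(Op.getNode());
      if (Load->isVolatile()) {
        SDLoc DL(Op);
        // Serialise the chain before the load...
        SDValue Chain = DAG.getNode(MyTargetISD::SERIALIZE, DL, MVT::Other,
                                    Load->getChain());
        // ...and rebuild the load on the new chain.  The result is still a
        // plain volatile LOAD, so nothing marks it as already protected.
        return DAG.getLoad(Load->getValueType(0), DL, Chain,
                           Load->getBasePtr(), Load->getMemOperand());
      }
    }
    return SDValue();
  }

The lowered node is indistinguishable from an unlowered volatile load,
which is the phase-dependent interpretation problem above.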
> If not, then a target hook might be acceptable. But I haven't had the time
> to think this through or review your patch fully.
Thanks. FWIW, although the patch is quite long, only the first four hunks
are target-independent:
diff --git a/include/llvm/Target/TargetLowering.h b/include/llvm/Target/TargetLowering.h
index 6f643c3..7bc3a52 100644
--- a/include/llvm/Target/TargetLowering.h
+++ b/include/llvm/Target/TargetLowering.h
@@ -2082,6 +2082,18 @@ public:
return NULL;
}
+ /// This callback is used to prepare for a volatile or atomic load.
+ /// It takes a chain node as input and returns the chain for the load itself.
+ ///
+ /// Having a callback like this is necessary for targets like SystemZ,
+ /// which allows a CPU to reuse the result of a previous load indefinitely,
+ /// even if a cache-coherent store is performed by another CPU. The default
+ /// implementation does nothing.
+ virtual SDValue prepareVolatileOrAtomicLoad(SDValue Chain, SDLoc DL,
+ SelectionDAG &DAG) const {
+ return Chain;
+ }
+
/// This callback is invoked by the type legalizer to legalize nodes with an
/// illegal operand type but legal result types. It replaces the
/// LowerOperation callback in the type Legalizer. The reason we can not do
diff --git a/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp b/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
index b1d247e..071781a 100644
--- a/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
+++ b/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
@@ -3400,7 +3400,7 @@ void SelectionDAGBuilder::visitLoad(const LoadInst &I) {
SDValue Root;
bool ConstantMemory = false;
- if (I.isVolatile() || NumValues > MaxParallelChains)
+ if (isVolatile || NumValues > MaxParallelChains)
// Serialize volatile loads with other side effects.
Root = getRoot();
else if (AA->pointsToConstantMemory(
@@ -3413,6 +3413,10 @@ void SelectionDAGBuilder::visitLoad(const LoadInst &I) {
Root = DAG.getRoot();
}
+ const TargetLowering *TLI = TM.getTargetLowering();
+ if (isVolatile)
+ Root = TLI->prepareVolatileOrAtomicLoad(Root, getCurSDLoc(), DAG);
+
SmallVector<SDValue, 4> Values(NumValues);
SmallVector<SDValue, 4> Chains(std::min(unsigned(MaxParallelChains),
NumValues));
@@ -3637,6 +3641,7 @@ void SelectionDAGBuilder::visitAtomicLoad(const LoadInst &I) {
if (I.getAlignment() < VT.getSizeInBits() / 8)
report_fatal_error("Cannot generate unaligned atomic load");
+ InChain = TLI->prepareVolatileOrAtomicLoad(InChain, dl, DAG);
SDValue L =
DAG.getAtomic(ISD::ATOMIC_LOAD, dl, VT, VT, InChain,
getValue(I.getPointerOperand()),
The rest are all local to the target.
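For reference, the target-specific side of the hook can be a one-liner.
This is just a sketch using the same made-up MyTargetISD::SERIALIZE node,
not the actual SystemZ changes:

  SDValue
  MyTargetLowering::prepareVolatileOrAtomicLoad(SDValue Chain, SDLoc DL,
                                                SelectionDAG &DAG) const {
    // Emit a serialising node on the chain so that the load that follows
    // cannot reuse a stale value from an earlier load.
    return DAG.getNode(MyTargetISD::SERIALIZE, DL, MVT::Other, Chain);
  }

Because the hook only touches the chain, the load node itself stays a
normal (volatile or atomic) load, which avoids both problems (a) and (b)
described above.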
Thanks,
Richard