[llvm] [ExtendLifetimes] Implement llvm.fake.use to extend variable lifetimes (PR #86149)
Stephen Tozer via llvm-commits
llvm-commits at lists.llvm.org
Thu Aug 29 03:15:33 PDT 2024
https://github.com/SLTozer updated https://github.com/llvm/llvm-project/pull/86149
From 9d3de0e5b851fb104be7e7f19485e3b1518ce9d3 Mon Sep 17 00:00:00 2001
From: Stephen Tozer <stephen.tozer at sony.com>
Date: Thu, 21 Mar 2024 16:21:52 +0000
Subject: [PATCH 01/20] [ExtendLifetimes] Implement llvm.fake.use to extend
variable lifetimes
---
llvm/docs/LangRef.rst | 36 ++++++
llvm/include/llvm/Analysis/PtrUseVisitor.h | 1 +
llvm/include/llvm/CodeGen/ISDOpcodes.h | 5 +
llvm/include/llvm/CodeGen/MachineInstr.h | 2 +
llvm/include/llvm/CodeGen/SelectionDAGISel.h | 1 +
llvm/include/llvm/IR/Intrinsics.td | 3 +
llvm/include/llvm/Support/TargetOpcodes.def | 3 +
llvm/include/llvm/Target/Target.td | 8 ++
llvm/lib/CodeGen/Analysis.cpp | 3 +-
llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp | 43 +++++++
llvm/lib/CodeGen/CodeGenPrepare.cpp | 42 ++++++-
.../CodeGen/DeadMachineInstructionElim.cpp | 3 +-
llvm/lib/CodeGen/MachineCSE.cpp | 2 +-
llvm/lib/CodeGen/MachineScheduler.cpp | 3 +-
llvm/lib/CodeGen/SelectionDAG/FastISel.cpp | 3 +
.../SelectionDAG/LegalizeFloatTypes.cpp | 20 +++
.../SelectionDAG/LegalizeIntegerTypes.cpp | 19 +++
llvm/lib/CodeGen/SelectionDAG/LegalizeTypes.h | 7 ++
.../SelectionDAG/LegalizeTypesGeneric.cpp | 11 ++
.../SelectionDAG/LegalizeVectorTypes.cpp | 36 ++++++
.../SelectionDAG/SelectionDAGBuilder.cpp | 18 +++
.../SelectionDAG/SelectionDAGDumper.cpp | 2 +
.../CodeGen/SelectionDAG/SelectionDAGISel.cpp | 64 ++++++++++
llvm/lib/IR/Instruction.cpp | 5 +-
llvm/lib/IR/Verifier.cpp | 1 +
llvm/lib/Target/X86/X86FloatingPoint.cpp | 32 +++++
llvm/lib/Transforms/Scalar/SROA.cpp | 30 +++++
llvm/lib/Transforms/Utils/CloneFunction.cpp | 6 +
llvm/lib/Transforms/Utils/Local.cpp | 3 +
.../Utils/PromoteMemoryToRegister.cpp | 3 +-
llvm/test/CodeGen/MIR/X86/fake-use-phi.mir | 95 +++++++++++++++
.../CodeGen/MIR/X86/fake-use-scheduler.mir | 115 ++++++++++++++++++
.../CodeGen/MIR/X86/fake-use-tailcall.mir | 106 ++++++++++++++++
.../CodeGen/MIR/X86/fake-use-zero-length.ll | 40 ++++++
llvm/test/CodeGen/X86/fake-use-hpfloat.ll | 17 +++
llvm/test/CodeGen/X86/fake-use-ld.ll | 51 ++++++++
.../CodeGen/X86/fake-use-simple-tail-call.ll | 33 +++++
llvm/test/CodeGen/X86/fake-use-split-ret.ll | 53 ++++++++
llvm/test/CodeGen/X86/fake-use-sroa.ll | 54 ++++++++
.../CodeGen/X86/fake-use-suppress-load.ll | 23 ++++
llvm/test/CodeGen/X86/fake-use-tailcall.ll | 23 ++++
llvm/test/CodeGen/X86/fake-use-vector.ll | 45 +++++++
llvm/test/CodeGen/X86/fake-use-vector2.ll | 33 +++++
.../DebugInfo/X86/Inputs/check-fake-use.py | 100 +++++++++++++++
llvm/test/DebugInfo/X86/fake-use.ll | 98 +++++++++++++++
.../test/Transforms/GVN/fake-use-constprop.ll | 69 +++++++++++
46 files changed, 1362 insertions(+), 8 deletions(-)
create mode 100644 llvm/test/CodeGen/MIR/X86/fake-use-phi.mir
create mode 100644 llvm/test/CodeGen/MIR/X86/fake-use-scheduler.mir
create mode 100644 llvm/test/CodeGen/MIR/X86/fake-use-tailcall.mir
create mode 100644 llvm/test/CodeGen/MIR/X86/fake-use-zero-length.ll
create mode 100644 llvm/test/CodeGen/X86/fake-use-hpfloat.ll
create mode 100644 llvm/test/CodeGen/X86/fake-use-ld.ll
create mode 100644 llvm/test/CodeGen/X86/fake-use-simple-tail-call.ll
create mode 100644 llvm/test/CodeGen/X86/fake-use-split-ret.ll
create mode 100644 llvm/test/CodeGen/X86/fake-use-sroa.ll
create mode 100644 llvm/test/CodeGen/X86/fake-use-suppress-load.ll
create mode 100644 llvm/test/CodeGen/X86/fake-use-tailcall.ll
create mode 100644 llvm/test/CodeGen/X86/fake-use-vector.ll
create mode 100644 llvm/test/CodeGen/X86/fake-use-vector2.ll
create mode 100644 llvm/test/DebugInfo/X86/Inputs/check-fake-use.py
create mode 100644 llvm/test/DebugInfo/X86/fake-use.ll
create mode 100644 llvm/test/Transforms/GVN/fake-use-constprop.ll
diff --git a/llvm/docs/LangRef.rst b/llvm/docs/LangRef.rst
index 8c696cb16e77f8..cf0a6f96fb012e 100644
--- a/llvm/docs/LangRef.rst
+++ b/llvm/docs/LangRef.rst
@@ -29477,6 +29477,42 @@ execution, but is unknown at compile time.
If the result value does not fit in the result type, then the result is
a :ref:`poison value <poisonvalues>`.
+.. _llvm_fake_use:
+
+'``llvm.fake.use``' Intrinsic
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Syntax:
+"""""""
+
+::
+
+ declare void @llvm.fake.use(...)
+
+Overview:
+"""""""""
+
+The ``llvm.fake.use`` intrinsic is a no-op. It takes a single value as an
+operand and is treated as a use of that operand, forcing the optimizer to
+preserve the value up to the point of the fake use. It is used to extend
+the lifetimes of variables: placing this intrinsic at the end of a
+variable's scope helps prevent that variable from being optimized out.
+
+Arguments:
+""""""""""
+
+The ``llvm.fake.use`` intrinsic takes one argument, which may be any
+function-local SSA value. Note that the signature is variadic so that the
+intrinsic can take any type of argument, but passing more than one argument will
+result in an error.
+
+Semantics:
+""""""""""
+
+This intrinsic does nothing, but optimizers must consider it a use of its single
+operand and should try to preserve the intrinsic and its position in the
+function.
+
Stack Map Intrinsics
--------------------
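As a sketch of how the new intrinsic is expected to appear in IR (a hypothetical example, not taken from the patch's tests; the function `@foo` and value names are illustrative only):

```llvm
; Keep %x observable until the end of its scope, even though it is
; otherwise dead after the add.
define i32 @foo(i32 %x) {
entry:
  %sum = add i32 %x, 1
  call void (...) @llvm.fake.use(i32 %x)
  ret i32 %sum
}

declare void @llvm.fake.use(...)
```

Note the variadic call syntax `void (...)`: the intrinsic's signature is variadic so it can accept an operand of any type, though only one operand is permitted.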
diff --git a/llvm/include/llvm/Analysis/PtrUseVisitor.h b/llvm/include/llvm/Analysis/PtrUseVisitor.h
index b6cc14d2077af0..e49ec454404bb5 100644
--- a/llvm/include/llvm/Analysis/PtrUseVisitor.h
+++ b/llvm/include/llvm/Analysis/PtrUseVisitor.h
@@ -278,6 +278,7 @@ class PtrUseVisitor : protected InstVisitor<DerivedT>,
default:
return Base::visitIntrinsicInst(II);
+ case Intrinsic::fake_use:
case Intrinsic::lifetime_start:
case Intrinsic::lifetime_end:
return; // No-op intrinsics.
diff --git a/llvm/include/llvm/CodeGen/ISDOpcodes.h b/llvm/include/llvm/CodeGen/ISDOpcodes.h
index 86ff2628975942..187d624f0a73b9 100644
--- a/llvm/include/llvm/CodeGen/ISDOpcodes.h
+++ b/llvm/include/llvm/CodeGen/ISDOpcodes.h
@@ -1372,6 +1372,11 @@ enum NodeType {
LIFETIME_START,
LIFETIME_END,
+  /// FAKE_USE represents a use of the operand but generates no code.
+  /// Its purpose is to extend the operand's lifetime, mainly for
+  /// debugging purposes.
+ FAKE_USE,
+
/// GC_TRANSITION_START/GC_TRANSITION_END - These operators mark the
/// beginning and end of GC transition sequence, and carry arbitrary
/// information that target might need for lowering. The first operand is
diff --git a/llvm/include/llvm/CodeGen/MachineInstr.h b/llvm/include/llvm/CodeGen/MachineInstr.h
index 04c8144f2fe7af..62667cc8ef3800 100644
--- a/llvm/include/llvm/CodeGen/MachineInstr.h
+++ b/llvm/include/llvm/CodeGen/MachineInstr.h
@@ -1435,6 +1435,8 @@ class MachineInstr
return getOpcode() == TargetOpcode::EXTRACT_SUBREG;
}
+ bool isFakeUse() const { return getOpcode() == TargetOpcode::FAKE_USE; }
+
/// Return true if the instruction behaves like a copy.
/// This does not include native copy instructions.
bool isCopyLike() const {
diff --git a/llvm/include/llvm/CodeGen/SelectionDAGISel.h b/llvm/include/llvm/CodeGen/SelectionDAGISel.h
index fc0590b1a1b69e..f6191c6fdb7fe6 100644
--- a/llvm/include/llvm/CodeGen/SelectionDAGISel.h
+++ b/llvm/include/llvm/CodeGen/SelectionDAGISel.h
@@ -463,6 +463,7 @@ class SelectionDAGISel {
void Select_READ_REGISTER(SDNode *Op);
void Select_WRITE_REGISTER(SDNode *Op);
void Select_UNDEF(SDNode *N);
+ void Select_FAKE_USE(SDNode *N);
void CannotYetSelect(SDNode *N);
void Select_FREEZE(SDNode *N);
diff --git a/llvm/include/llvm/IR/Intrinsics.td b/llvm/include/llvm/IR/Intrinsics.td
index e3bf0446575ae5..232d6be1073f49 100644
--- a/llvm/include/llvm/IR/Intrinsics.td
+++ b/llvm/include/llvm/IR/Intrinsics.td
@@ -1835,6 +1835,9 @@ def int_is_constant : DefaultAttrsIntrinsic<[llvm_i1_ty], [llvm_any_ty],
[IntrNoMem, IntrWillReturn, IntrConvergent],
"llvm.is.constant">;
+// Introduce a use of the argument without generating any code.
+def int_fake_use : Intrinsic<[], [llvm_vararg_ty]>;
+
// Intrinsic to mask out bits of a pointer.
// First argument must be pointer or vector of pointer. This is checked by the
// verifier.
diff --git a/llvm/include/llvm/Support/TargetOpcodes.def b/llvm/include/llvm/Support/TargetOpcodes.def
index 9fb6de49fb2055..635c265a433631 100644
--- a/llvm/include/llvm/Support/TargetOpcodes.def
+++ b/llvm/include/llvm/Support/TargetOpcodes.def
@@ -217,6 +217,9 @@ HANDLE_TARGET_OPCODE(PATCHABLE_TYPED_EVENT_CALL)
HANDLE_TARGET_OPCODE(ICALL_BRANCH_FUNNEL)
+/// Represents a use of the operand but generates no code.
+HANDLE_TARGET_OPCODE(FAKE_USE)
+
// This is a fence with the singlethread scope. It represents a compiler memory
// barrier, but does not correspond to any generated instruction.
HANDLE_TARGET_OPCODE(MEMBARRIER)
diff --git a/llvm/include/llvm/Target/Target.td b/llvm/include/llvm/Target/Target.td
index 34332386085870..ee24e9b1ea7c98 100644
--- a/llvm/include/llvm/Target/Target.td
+++ b/llvm/include/llvm/Target/Target.td
@@ -1418,6 +1418,14 @@ def FAULTING_OP : StandardPseudoInstruction {
let isTerminator = true;
let isBranch = true;
}
+def FAKE_USE : StandardPseudoInstruction {
+ // An instruction that uses its operands but does nothing.
+ let OutOperandList = (outs);
+ let InOperandList = (ins variable_ops);
+ let AsmString = "FAKE_USE";
+ let hasSideEffects = 0;
+ let isMeta = true;
+}
def PATCHABLE_OP : StandardPseudoInstruction {
let OutOperandList = (outs);
let InOperandList = (ins variable_ops);
diff --git a/llvm/lib/CodeGen/Analysis.cpp b/llvm/lib/CodeGen/Analysis.cpp
index 128060ec912c76..f77b733c6c8f69 100644
--- a/llvm/lib/CodeGen/Analysis.cpp
+++ b/llvm/lib/CodeGen/Analysis.cpp
@@ -567,7 +567,8 @@ bool llvm::isInTailCallPosition(const CallBase &Call, const TargetMachine &TM,
if (const IntrinsicInst *II = dyn_cast<IntrinsicInst>(BBI))
if (II->getIntrinsicID() == Intrinsic::lifetime_end ||
II->getIntrinsicID() == Intrinsic::assume ||
- II->getIntrinsicID() == Intrinsic::experimental_noalias_scope_decl)
+ II->getIntrinsicID() == Intrinsic::experimental_noalias_scope_decl ||
+ II->getIntrinsicID() == Intrinsic::fake_use)
continue;
if (BBI->mayHaveSideEffects() || BBI->mayReadFromMemory() ||
!isSafeToSpeculativelyExecute(&*BBI))
diff --git a/llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp b/llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
index 60cb26973ead41..3d21f30ef4dc72 100644
--- a/llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
+++ b/llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
@@ -1073,11 +1073,46 @@ void AsmPrinter::emitFunctionEntryLabel() {
}
}
+// Recognize cases where a spilled register is reloaded solely to feed into a
+// FAKE_USE.
+static bool isLoadFeedingIntoFakeUse(const MachineInstr &MI) {
+ const MachineFunction *MF = MI.getMF();
+ const TargetInstrInfo *TII = MF->getSubtarget().getInstrInfo();
+
+ // If the restore size is std::nullopt then we are not dealing with a reload
+ // of a spilled register.
+ if (!MI.getRestoreSize(TII))
+ return false;
+
+  // Check whether this register is an operand of a FAKE_USE instruction
+  // and whether that operand has the kill flag set there.
+ auto NextI = std::next(MI.getIterator());
+ if (NextI == MI.getParent()->end() || !NextI->isFakeUse())
+ return false;
+
+ unsigned Reg = MI.getOperand(0).getReg();
+ for (const MachineOperand &MO : NextI->operands()) {
+ // Return true if we came across the register from the
+ // previous spill instruction that is killed in NextI.
+ if (MO.isReg() && MO.isUse() && MO.isKill() && MO.getReg() == Reg)
+ return true;
+ }
+
+ return false;
+}
+
/// emitComments - Pretty-print comments for instructions.
static void emitComments(const MachineInstr &MI, raw_ostream &CommentOS) {
const MachineFunction *MF = MI.getMF();
const TargetInstrInfo *TII = MF->getSubtarget().getInstrInfo();
+  // If this is a reload of a spilled register that only feeds into a FAKE_USE
+  // instruction, the loaded value has no effect on the program and has only
+  // been kept alive for debugging; since it is still available on the stack,
+  // we can skip the load itself.
+ if (isLoadFeedingIntoFakeUse(MI))
+ return;
+
// Check for spills and reloads
// We assume a single instruction only has a spill or reload, not
@@ -1799,6 +1834,8 @@ void AsmPrinter::emitFunctionBody() {
case TargetOpcode::KILL:
if (isVerbose()) emitKill(&MI, *this);
break;
+ case TargetOpcode::FAKE_USE:
+ break;
case TargetOpcode::PSEUDO_PROBE:
emitPseudoProbe(MI);
break;
@@ -1814,6 +1851,12 @@ void AsmPrinter::emitFunctionBody() {
// purely meta information.
break;
default:
+      // If this is a reload of a spilled register that only feeds into a
+      // FAKE_USE instruction, the loaded value has no effect on the program
+      // and has only been kept alive for debugging; since it is still
+      // available on the stack, we can skip the load itself.
+ if (isLoadFeedingIntoFakeUse(MI))
+ break;
emitInstruction(&MI);
if (CanDoExtraAnalysis) {
MCInst MCI;
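The check above can be summarized as: a reload may be skipped only if the very next instruction is a FAKE_USE that kills the reloaded register. A minimal standalone sketch of that condition, using a toy instruction model (`ToyInst`, `reloadOnlyFeedsFakeUse`, and the opcode strings are all hypothetical names, not LLVM API):

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy instruction: an opcode, the registers it reads (all treated as
// killed here for simplicity), and the register it writes, if any.
struct ToyInst {
  std::string Opcode;        // e.g. "RELOAD", "FAKE_USE", "ADD"
  std::vector<int> UsedRegs; // registers read
  int DefReg = -1;           // register written, -1 if none
};

// Mirrors the shape of isLoadFeedingIntoFakeUse: the instruction at index
// I must be a reload, the next instruction must be a FAKE_USE, and the
// fake use must consume the reloaded register.
static bool reloadOnlyFeedsFakeUse(const std::vector<ToyInst> &Block,
                                   std::size_t I) {
  if (Block[I].Opcode != "RELOAD")
    return false; // not a reload of a spilled register
  if (I + 1 >= Block.size() || Block[I + 1].Opcode != "FAKE_USE")
    return false; // next instruction is not a fake use
  for (int R : Block[I + 1].UsedRegs)
    if (R == Block[I].DefReg)
      return true; // the fake use kills the reloaded register
  return false;
}
```

In the real pass the "reload" test is `MI.getRestoreSize(TII)` and the kill test checks `MachineOperand` flags; the sketch only captures the control flow of the decision.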
diff --git a/llvm/lib/CodeGen/CodeGenPrepare.cpp b/llvm/lib/CodeGen/CodeGenPrepare.cpp
index da6c758d53d487..74169fb885c93c 100644
--- a/llvm/lib/CodeGen/CodeGenPrepare.cpp
+++ b/llvm/lib/CodeGen/CodeGenPrepare.cpp
@@ -2800,12 +2800,34 @@ bool CodeGenPrepare::dupRetToEnableTailCallOpts(BasicBlock *BB,
return false;
};
+ SmallVector<const IntrinsicInst *, 4> FakeUses;
+
+ auto isFakeUse = [&FakeUses](const Instruction *Inst) {
+ if (auto *II = dyn_cast<IntrinsicInst>(Inst);
+ II && II->getIntrinsicID() == Intrinsic::fake_use) {
+      // Record the instruction so it can be preserved when the exit block is
+      // removed. Do not record fake uses that use the result of a PHI node.
+ // FIXME: If we do want to copy the fake use into the return blocks, we
+ // have to figure out which of the PHI node operands to use for each
+ // copy.
+ if (!isa<PHINode>(II->getOperand(0))) {
+ FakeUses.push_back(II);
+ }
+ return true;
+ }
+
+ return false;
+ };
+
// Make sure there are no instructions between the first instruction
// and return.
const Instruction *BI = BB->getFirstNonPHI();
// Skip over debug and the bitcast.
while (isa<DbgInfoIntrinsic>(BI) || BI == BCI || BI == EVI ||
- isa<PseudoProbeInst>(BI) || isLifetimeEndOrBitCastFor(BI))
+ isa<PseudoProbeInst>(BI) || isLifetimeEndOrBitCastFor(BI) ||
+ isFakeUse(BI))
BI = BI->getNextNode();
if (BI != RetI)
return false;
@@ -2814,6 +2836,7 @@ bool CodeGenPrepare::dupRetToEnableTailCallOpts(BasicBlock *BB,
/// call.
const Function *F = BB->getParent();
SmallVector<BasicBlock *, 4> TailCallBBs;
+ SmallVector<CallInst *, 4> CallInsts;
if (PN) {
for (unsigned I = 0, E = PN->getNumIncomingValues(); I != E; ++I) {
// Look through bitcasts.
@@ -2847,6 +2870,9 @@ bool CodeGenPrepare::dupRetToEnableTailCallOpts(BasicBlock *BB,
attributesPermitTailCall(F, CI, RetI, *TLI))
TailCallBBs.push_back(PredBB);
}
+ // Record the call instruction so we can insert any fake uses
+ // that need to be preserved before it.
+ CallInsts.push_back(CI);
}
} else {
SmallPtrSet<BasicBlock *, 4> VisitedBBs;
@@ -2863,6 +2889,9 @@ bool CodeGenPrepare::dupRetToEnableTailCallOpts(BasicBlock *BB,
(isIntrinsicOrLFToBeTailCalled(TLInfo, CI) &&
V == CI->getArgOperand(0))) {
TailCallBBs.push_back(Pred);
+ // Record the call instruction so we can insert any fake uses
+ // that need to be preserved before it.
+ CallInsts.push_back(CI);
}
}
}
@@ -2889,8 +2918,17 @@ bool CodeGenPrepare::dupRetToEnableTailCallOpts(BasicBlock *BB,
}
// If we eliminated all predecessors of the block, delete the block now.
- if (Changed && !BB->hasAddressTaken() && pred_empty(BB))
+ if (Changed && !BB->hasAddressTaken() && pred_empty(BB)) {
+ // Copy the fake uses found in the original return block to all blocks
+ // that contain tail calls.
+ for (auto *CI : CallInsts) {
+ for (auto const *FakeUse : FakeUses) {
+ auto *ClonedInst = FakeUse->clone();
+ ClonedInst->insertBefore(CI);
+ }
+ }
BB->eraseFromParent();
+ }
return Changed;
}
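The effect of the change above is that fake uses collected from the soon-to-be-deleted return block are cloned in front of every recorded tail call, so each return path keeps a use of the values. A minimal sketch of that insertion step, modeling instructions as plain strings (`cloneFakeUsesBeforeCalls` and all names here are hypothetical, not LLVM API):

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

using Inst = std::string;
using Block = std::vector<Inst>;

// Clone the recorded fake uses immediately before each recorded call.
// Iterating the call indices back-to-front keeps earlier indices valid
// as insertions grow the block.
static void cloneFakeUsesBeforeCalls(const std::vector<Inst> &FakeUses,
                                     Block &BB,
                                     const std::vector<std::size_t> &CallIdxs) {
  for (auto It = CallIdxs.rbegin(); It != CallIdxs.rend(); ++It)
    BB.insert(BB.begin() + *It, FakeUses.begin(), FakeUses.end());
}
```

The real pass does this with `Instruction::clone()` and `insertBefore(CI)`; the sketch only shows the ordering concern.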
diff --git a/llvm/lib/CodeGen/DeadMachineInstructionElim.cpp b/llvm/lib/CodeGen/DeadMachineInstructionElim.cpp
index 7fc25cd889a0df..332ed37bd2b79f 100644
--- a/llvm/lib/CodeGen/DeadMachineInstructionElim.cpp
+++ b/llvm/lib/CodeGen/DeadMachineInstructionElim.cpp
@@ -87,7 +87,8 @@ bool DeadMachineInstructionElimImpl::isDead(const MachineInstr *MI) const {
return false;
// Don't delete frame allocation labels.
- if (MI->getOpcode() == TargetOpcode::LOCAL_ESCAPE)
+ if (MI->getOpcode() == TargetOpcode::LOCAL_ESCAPE ||
+ MI->getOpcode() == TargetOpcode::FAKE_USE)
return false;
// Don't delete instructions with side effects.
diff --git a/llvm/lib/CodeGen/MachineCSE.cpp b/llvm/lib/CodeGen/MachineCSE.cpp
index 27bbf5599b6046..5ce947e807cf9d 100644
--- a/llvm/lib/CodeGen/MachineCSE.cpp
+++ b/llvm/lib/CodeGen/MachineCSE.cpp
@@ -406,7 +406,7 @@ bool MachineCSE::PhysRegDefsReach(MachineInstr *CSMI, MachineInstr *MI,
bool MachineCSE::isCSECandidate(MachineInstr *MI) {
if (MI->isPosition() || MI->isPHI() || MI->isImplicitDef() || MI->isKill() ||
- MI->isInlineAsm() || MI->isDebugInstr() || MI->isJumpTableDebugInfo())
+ MI->isInlineAsm() || MI->isDebugInstr() || MI->isJumpTableDebugInfo() || MI->isFakeUse())
return false;
// Ignore copies.
diff --git a/llvm/lib/CodeGen/MachineScheduler.cpp b/llvm/lib/CodeGen/MachineScheduler.cpp
index 7a3cf96ccffe0a..4e6d34346b1d80 100644
--- a/llvm/lib/CodeGen/MachineScheduler.cpp
+++ b/llvm/lib/CodeGen/MachineScheduler.cpp
@@ -530,7 +530,8 @@ static bool isSchedBoundary(MachineBasicBlock::iterator MI,
MachineBasicBlock *MBB,
MachineFunction *MF,
const TargetInstrInfo *TII) {
- return MI->isCall() || TII->isSchedulingBoundary(*MI, MBB, *MF);
+ return MI->isCall() || TII->isSchedulingBoundary(*MI, MBB, *MF) ||
+ MI->isFakeUse();
}
/// A region of an MBB for scheduling.
diff --git a/llvm/lib/CodeGen/SelectionDAG/FastISel.cpp b/llvm/lib/CodeGen/SelectionDAG/FastISel.cpp
index 067f82c99adca1..162af2d9d708a9 100644
--- a/llvm/lib/CodeGen/SelectionDAG/FastISel.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/FastISel.cpp
@@ -1464,6 +1464,9 @@ bool FastISel::selectIntrinsicCall(const IntrinsicInst *II) {
updateValueMap(II, ResultReg);
return true;
}
+ case Intrinsic::fake_use:
+    // At -O0 we don't need fake uses, so just ignore the intrinsic.
+ return true;
case Intrinsic::experimental_stackmap:
return selectStackmap(II);
case Intrinsic::experimental_patchpoint_void:
diff --git a/llvm/lib/CodeGen/SelectionDAG/LegalizeFloatTypes.cpp b/llvm/lib/CodeGen/SelectionDAG/LegalizeFloatTypes.cpp
index 221dcfe145594f..b5c80005a0ecc1 100644
--- a/llvm/lib/CodeGen/SelectionDAG/LegalizeFloatTypes.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/LegalizeFloatTypes.cpp
@@ -2438,6 +2438,9 @@ bool DAGTypeLegalizer::PromoteFloatOperand(SDNode *N, unsigned OpNo) {
report_fatal_error("Do not know how to promote this operator's operand!");
case ISD::BITCAST: R = PromoteFloatOp_BITCAST(N, OpNo); break;
+ case ISD::FAKE_USE:
+ R = PromoteFloatOp_FAKE_USE(N, OpNo);
+ break;
case ISD::FCOPYSIGN: R = PromoteFloatOp_FCOPYSIGN(N, OpNo); break;
case ISD::FP_TO_SINT:
case ISD::FP_TO_UINT:
@@ -2480,6 +2483,13 @@ SDValue DAGTypeLegalizer::PromoteFloatOp_BITCAST(SDNode *N, unsigned OpNo) {
return DAG.getBitcast(N->getValueType(0), Convert);
}
+SDValue DAGTypeLegalizer::PromoteFloatOp_FAKE_USE(SDNode *N, unsigned OpNo) {
+ assert(OpNo == 1 && "Only Operand 1 must need promotion here");
+ SDValue Op = GetPromotedFloat(N->getOperand(OpNo));
+ return DAG.getNode(N->getOpcode(), SDLoc(N), MVT::Other, N->getOperand(0),
+ Op);
+}
+
// Promote Operand 1 of FCOPYSIGN. Operand 0 ought to be handled by
// PromoteFloatRes_FCOPYSIGN.
SDValue DAGTypeLegalizer::PromoteFloatOp_FCOPYSIGN(SDNode *N, unsigned OpNo) {
@@ -3433,6 +3443,9 @@ bool DAGTypeLegalizer::SoftPromoteHalfOperand(SDNode *N, unsigned OpNo) {
"operand!");
case ISD::BITCAST: Res = SoftPromoteHalfOp_BITCAST(N); break;
+ case ISD::FAKE_USE:
+ Res = SoftPromoteHalfOp_FAKE_USE(N, OpNo);
+ break;
case ISD::FCOPYSIGN: Res = SoftPromoteHalfOp_FCOPYSIGN(N, OpNo); break;
case ISD::FP_TO_SINT:
case ISD::FP_TO_UINT: Res = SoftPromoteHalfOp_FP_TO_XINT(N); break;
@@ -3473,6 +3486,13 @@ SDValue DAGTypeLegalizer::SoftPromoteHalfOp_BITCAST(SDNode *N) {
return DAG.getNode(ISD::BITCAST, SDLoc(N), N->getValueType(0), Op0);
}
+SDValue DAGTypeLegalizer::SoftPromoteHalfOp_FAKE_USE(SDNode *N, unsigned OpNo) {
+ assert(OpNo == 1 && "Only Operand 1 must need promotion here");
+ SDValue Op = GetSoftPromotedHalf(N->getOperand(OpNo));
+ return DAG.getNode(N->getOpcode(), SDLoc(N), MVT::Other, N->getOperand(0),
+ Op);
+}
+
SDValue DAGTypeLegalizer::SoftPromoteHalfOp_FCOPYSIGN(SDNode *N,
unsigned OpNo) {
assert(OpNo == 1 && "Only Operand 1 must need promotion here");
diff --git a/llvm/lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp b/llvm/lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp
index c19a5a4995627a..05971152d535c6 100644
--- a/llvm/lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp
@@ -1934,6 +1934,9 @@ bool DAGTypeLegalizer::PromoteIntegerOperand(SDNode *N, unsigned OpNo) {
case ISD::BUILD_VECTOR: Res = PromoteIntOp_BUILD_VECTOR(N); break;
case ISD::CONCAT_VECTORS: Res = PromoteIntOp_CONCAT_VECTORS(N); break;
case ISD::EXTRACT_VECTOR_ELT: Res = PromoteIntOp_EXTRACT_VECTOR_ELT(N); break;
+ case ISD::FAKE_USE:
+ Res = PromoteIntOp_FAKE_USE(N);
+ break;
case ISD::INSERT_VECTOR_ELT:
Res = PromoteIntOp_INSERT_VECTOR_ELT(N, OpNo);
break;
@@ -5280,6 +5283,9 @@ bool DAGTypeLegalizer::ExpandIntegerOperand(SDNode *N, unsigned OpNo) {
case ISD::BR_CC: Res = ExpandIntOp_BR_CC(N); break;
case ISD::BUILD_VECTOR: Res = ExpandOp_BUILD_VECTOR(N); break;
case ISD::EXTRACT_ELEMENT: Res = ExpandOp_EXTRACT_ELEMENT(N); break;
+ case ISD::FAKE_USE:
+ Res = ExpandOp_FAKE_USE(N);
+ break;
case ISD::INSERT_VECTOR_ELT: Res = ExpandOp_INSERT_VECTOR_ELT(N); break;
case ISD::SCALAR_TO_VECTOR: Res = ExpandOp_SCALAR_TO_VECTOR(N); break;
case ISD::EXPERIMENTAL_VP_SPLAT:
@@ -6115,6 +6121,19 @@ SDValue DAGTypeLegalizer::PromoteIntOp_INSERT_SUBVECTOR(SDNode *N) {
return DAG.getAnyExtOrTrunc(Ext, dl, N->getValueType(0));
}
+// FIXME: We wouldn't need this if clang could promote short integers
+// that are arguments to FAKE_USE.
+SDValue DAGTypeLegalizer::PromoteIntOp_FAKE_USE(SDNode *N) {
+ SDLoc dl(N);
+ SDValue V0 = N->getOperand(0);
+ SDValue V1 = N->getOperand(1);
+ EVT InVT1 = V1.getValueType();
+ SDValue VPromoted =
+ DAG.getNode(ISD::ANY_EXTEND, dl,
+ TLI.getTypeToTransformTo(*DAG.getContext(), InVT1), V1);
+ return DAG.getNode(N->getOpcode(), dl, N->getValueType(0), V0, VPromoted);
+}
+
SDValue DAGTypeLegalizer::PromoteIntOp_EXTRACT_SUBVECTOR(SDNode *N) {
SDLoc dl(N);
SDValue V0 = GetPromotedInteger(N->getOperand(0));
diff --git a/llvm/lib/CodeGen/SelectionDAG/LegalizeTypes.h b/llvm/lib/CodeGen/SelectionDAG/LegalizeTypes.h
index 1088db4bdbe0b3..4577346a02d605 100644
--- a/llvm/lib/CodeGen/SelectionDAG/LegalizeTypes.h
+++ b/llvm/lib/CodeGen/SelectionDAG/LegalizeTypes.h
@@ -391,6 +391,7 @@ class LLVM_LIBRARY_VISIBILITY DAGTypeLegalizer {
SDValue PromoteIntOp_EXTRACT_VECTOR_ELT(SDNode *N);
SDValue PromoteIntOp_EXTRACT_SUBVECTOR(SDNode *N);
SDValue PromoteIntOp_INSERT_SUBVECTOR(SDNode *N);
+ SDValue PromoteIntOp_FAKE_USE(SDNode *N);
SDValue PromoteIntOp_CONCAT_VECTORS(SDNode *N);
SDValue PromoteIntOp_ScalarOp(SDNode *N);
SDValue PromoteIntOp_SELECT(SDNode *N, unsigned OpNo);
@@ -755,6 +756,7 @@ class LLVM_LIBRARY_VISIBILITY DAGTypeLegalizer {
bool PromoteFloatOperand(SDNode *N, unsigned OpNo);
SDValue PromoteFloatOp_BITCAST(SDNode *N, unsigned OpNo);
+ SDValue PromoteFloatOp_FAKE_USE(SDNode *N, unsigned OpNo);
SDValue PromoteFloatOp_FCOPYSIGN(SDNode *N, unsigned OpNo);
SDValue PromoteFloatOp_FP_EXTEND(SDNode *N, unsigned OpNo);
SDValue PromoteFloatOp_STRICT_FP_EXTEND(SDNode *N, unsigned OpNo);
@@ -800,6 +802,7 @@ class LLVM_LIBRARY_VISIBILITY DAGTypeLegalizer {
bool SoftPromoteHalfOperand(SDNode *N, unsigned OpNo);
SDValue SoftPromoteHalfOp_BITCAST(SDNode *N);
+ SDValue SoftPromoteHalfOp_FAKE_USE(SDNode *N, unsigned OpNo);
SDValue SoftPromoteHalfOp_FCOPYSIGN(SDNode *N, unsigned OpNo);
SDValue SoftPromoteHalfOp_FP_EXTEND(SDNode *N);
SDValue SoftPromoteHalfOp_FP_TO_XINT(SDNode *N);
@@ -877,6 +880,7 @@ class LLVM_LIBRARY_VISIBILITY DAGTypeLegalizer {
SDValue ScalarizeVecOp_VECREDUCE(SDNode *N);
SDValue ScalarizeVecOp_VECREDUCE_SEQ(SDNode *N);
SDValue ScalarizeVecOp_CMP(SDNode *N);
+ SDValue ScalarizeVecOp_FAKE_USE(SDNode *N);
//===--------------------------------------------------------------------===//
// Vector Splitting Support: LegalizeVectorTypes.cpp
@@ -964,6 +968,7 @@ class LLVM_LIBRARY_VISIBILITY DAGTypeLegalizer {
SDValue SplitVecOp_EXTRACT_SUBVECTOR(SDNode *N);
SDValue SplitVecOp_EXTRACT_VECTOR_ELT(SDNode *N);
SDValue SplitVecOp_ExtVecInRegOp(SDNode *N);
+ SDValue SplitVecOp_FAKE_USE(SDNode *N);
SDValue SplitVecOp_STORE(StoreSDNode *N, unsigned OpNo);
SDValue SplitVecOp_VP_STORE(VPStoreSDNode *N, unsigned OpNo);
SDValue SplitVecOp_VP_STRIDED_STORE(VPStridedStoreSDNode *N, unsigned OpNo);
@@ -1069,6 +1074,7 @@ class LLVM_LIBRARY_VISIBILITY DAGTypeLegalizer {
SDValue WidenVecOp_INSERT_SUBVECTOR(SDNode *N);
SDValue WidenVecOp_EXTRACT_SUBVECTOR(SDNode *N);
SDValue WidenVecOp_EXTEND_VECTOR_INREG(SDNode *N);
+ SDValue WidenVecOp_FAKE_USE(SDNode *N);
SDValue WidenVecOp_STORE(SDNode* N);
SDValue WidenVecOp_VP_STORE(SDNode *N, unsigned OpNo);
SDValue WidenVecOp_VP_STRIDED_STORE(SDNode *N, unsigned OpNo);
@@ -1198,6 +1204,7 @@ class LLVM_LIBRARY_VISIBILITY DAGTypeLegalizer {
SDValue ExpandOp_BITCAST (SDNode *N);
SDValue ExpandOp_BUILD_VECTOR (SDNode *N);
SDValue ExpandOp_EXTRACT_ELEMENT (SDNode *N);
+ SDValue ExpandOp_FAKE_USE(SDNode *N);
SDValue ExpandOp_INSERT_VECTOR_ELT(SDNode *N);
SDValue ExpandOp_SCALAR_TO_VECTOR (SDNode *N);
SDValue ExpandOp_NormalStore (SDNode *N, unsigned OpNo);
diff --git a/llvm/lib/CodeGen/SelectionDAG/LegalizeTypesGeneric.cpp b/llvm/lib/CodeGen/SelectionDAG/LegalizeTypesGeneric.cpp
index a55364ea2c4e5b..b402e823762764 100644
--- a/llvm/lib/CodeGen/SelectionDAG/LegalizeTypesGeneric.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/LegalizeTypesGeneric.cpp
@@ -403,6 +403,17 @@ SDValue DAGTypeLegalizer::ExpandOp_EXTRACT_ELEMENT(SDNode *N) {
return N->getConstantOperandVal(1) ? Hi : Lo;
}
+// Split the integer operand in two and create a second FAKE_USE node for
+// the other half. The original SDNode is updated in place.
+SDValue DAGTypeLegalizer::ExpandOp_FAKE_USE(SDNode *N) {
+ SDValue Lo, Hi;
+ SDValue Chain = N->getOperand(0);
+ GetExpandedOp(N->getOperand(1), Lo, Hi);
+ SDValue LoUse = DAG.getNode(ISD::FAKE_USE, SDLoc(), MVT::Other, Chain, Lo);
+ DAG.UpdateNodeOperands(N, LoUse, Hi);
+ return SDValue(N, 0);
+}
+
SDValue DAGTypeLegalizer::ExpandOp_INSERT_VECTOR_ELT(SDNode *N) {
// The vector type is legal but the element type needs expansion.
EVT VecVT = N->getValueType(0);
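The expansion above turns one fake use of an illegal wide integer into a chain of two fake uses, one per half. The value splitting itself is the standard lo/hi decomposition that `GetExpandedOp` performs; a minimal numeric sketch for a 64-bit value on a hypothetical 32-bit target (`expandToHalves` is an illustrative name, not LLVM API):

```cpp
#include <cassert>
#include <cstdint>
#include <utility>

// Split a 64-bit value into its low and high 32-bit halves, as type
// expansion does for an integer that is illegal on a 32-bit target.
// Each half would then feed its own FAKE_USE node.
static std::pair<uint32_t, uint32_t> expandToHalves(uint64_t V) {
  uint32_t Lo = static_cast<uint32_t>(V & 0xffffffffu);
  uint32_t Hi = static_cast<uint32_t>(V >> 32);
  return {Lo, Hi};
}
```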
diff --git a/llvm/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp b/llvm/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp
index 475d5806467d98..4c6da7c5df6b40 100644
--- a/llvm/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp
@@ -746,6 +746,9 @@ bool DAGTypeLegalizer::ScalarizeVectorOperand(SDNode *N, unsigned OpNo) {
case ISD::BITCAST:
Res = ScalarizeVecOp_BITCAST(N);
break;
+ case ISD::FAKE_USE:
+ Res = ScalarizeVecOp_FAKE_USE(N);
+ break;
case ISD::ANY_EXTEND:
case ISD::ZERO_EXTEND:
case ISD::SIGN_EXTEND:
@@ -846,6 +849,14 @@ SDValue DAGTypeLegalizer::ScalarizeVecOp_BITCAST(SDNode *N) {
N->getValueType(0), Elt);
}
+// Need to legalize vector operands of fake uses. Must be <1 x ty>.
+SDValue DAGTypeLegalizer::ScalarizeVecOp_FAKE_USE(SDNode *N) {
+ assert(N->getOperand(1).getValueType().getVectorNumElements() == 1 &&
+ "Fake Use: Unexpected vector type!");
+ SDValue Elt = GetScalarizedVector(N->getOperand(1));
+ return DAG.getNode(ISD::FAKE_USE, SDLoc(), MVT::Other, N->getOperand(0), Elt);
+}
+
/// If the input is a vector that needs to be scalarized, it must be <1 x ty>.
/// Do the operation on the element instead.
SDValue DAGTypeLegalizer::ScalarizeVecOp_UnaryOp(SDNode *N) {
@@ -3291,6 +3302,9 @@ bool DAGTypeLegalizer::SplitVectorOperand(SDNode *N, unsigned OpNo) {
Res = SplitVecOp_CMP(N);
break;
+ case ISD::FAKE_USE:
+ Res = SplitVecOp_FAKE_USE(N);
+ break;
case ISD::ANY_EXTEND_VECTOR_INREG:
case ISD::SIGN_EXTEND_VECTOR_INREG:
case ISD::ZERO_EXTEND_VECTOR_INREG:
@@ -3505,6 +3519,15 @@ SDValue DAGTypeLegalizer::SplitVecOp_UnaryOp(SDNode *N) {
return DAG.getNode(ISD::CONCAT_VECTORS, dl, ResVT, Lo, Hi);
}
+// Split a FAKE_USE of a vector into FAKE_USEs of the lo and hi parts.
+SDValue DAGTypeLegalizer::SplitVecOp_FAKE_USE(SDNode *N) {
+ SDValue Lo, Hi;
+ GetSplitVector(N->getOperand(1), Lo, Hi);
+ SDValue Chain =
+ DAG.getNode(ISD::FAKE_USE, SDLoc(), MVT::Other, N->getOperand(0), Lo);
+ return DAG.getNode(ISD::FAKE_USE, SDLoc(), MVT::Other, Chain, Hi);
+}
+
SDValue DAGTypeLegalizer::SplitVecOp_BITCAST(SDNode *N) {
// For example, i64 = BITCAST v4i16 on alpha. Typically the vector will
// end up being split all the way down to individual components. Convert the
@@ -6466,6 +6489,9 @@ bool DAGTypeLegalizer::WidenVectorOperand(SDNode *N, unsigned OpNo) {
report_fatal_error("Do not know how to widen this operator's operand!");
case ISD::BITCAST: Res = WidenVecOp_BITCAST(N); break;
+ case ISD::FAKE_USE:
+ Res = WidenVecOp_FAKE_USE(N);
+ break;
case ISD::CONCAT_VECTORS: Res = WidenVecOp_CONCAT_VECTORS(N); break;
case ISD::INSERT_SUBVECTOR: Res = WidenVecOp_INSERT_SUBVECTOR(N); break;
case ISD::EXTRACT_SUBVECTOR: Res = WidenVecOp_EXTRACT_SUBVECTOR(N); break;
@@ -6851,6 +6877,16 @@ SDValue DAGTypeLegalizer::WidenVecOp_BITCAST(SDNode *N) {
return CreateStackStoreLoad(InOp, VT);
}
+// Vectors with sizes that are not powers of 2 need to be widened to the
+// next largest power of 2. For example, we may get a vector of 3 32-bit
+// integers or of 6 16-bit integers, both of which have to be widened to a
+// 128-bit vector.
+SDValue DAGTypeLegalizer::WidenVecOp_FAKE_USE(SDNode *N) {
+ SDValue WidenedOp = GetWidenedVector(N->getOperand(1));
+ return DAG.getNode(ISD::FAKE_USE, SDLoc(), MVT::Other, N->getOperand(0),
+ WidenedOp);
+}
+
SDValue DAGTypeLegalizer::WidenVecOp_CONCAT_VECTORS(SDNode *N) {
EVT VT = N->getValueType(0);
EVT EltVT = VT.getVectorElementType();
diff --git a/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp b/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
index d103308cce566a..24f4fdfca21941 100644
--- a/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
@@ -7702,6 +7702,24 @@ void SelectionDAGBuilder::visitIntrinsicCall(const CallInst &I,
return;
}
+ case Intrinsic::fake_use: {
+ Value *V = I.getArgOperand(0);
+ SDValue Ops[2];
+ // If this fake use uses an argument that has an empty SDValue, it is a
+ // zero-length array or some other type that does not produce a register,
+ // so do not translate a fake use for it.
+ if (isa<Argument>(V) && !NodeMap[V])
+ return;
+ Ops[0] = getRoot();
+ Ops[1] = getValue(V);
+    // Also, do not translate a fake use with an undef operand or any other
+    // empty SDValue.
+ if (!Ops[1] || Ops[1].isUndef())
+ return;
+ DAG.setRoot(DAG.getNode(ISD::FAKE_USE, sdl, MVT::Other, Ops));
+ return;
+ }
+
case Intrinsic::eh_exceptionpointer:
case Intrinsic::eh_exceptioncode: {
// Get the exception pointer vreg, copy from it, and resize it to fit.
diff --git a/llvm/lib/CodeGen/SelectionDAG/SelectionDAGDumper.cpp b/llvm/lib/CodeGen/SelectionDAG/SelectionDAGDumper.cpp
index 001f782f209fdb..a253d1a0e20170 100644
--- a/llvm/lib/CodeGen/SelectionDAG/SelectionDAGDumper.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/SelectionDAGDumper.cpp
@@ -454,6 +454,8 @@ std::string SDNode::getOperationName(const SelectionDAG *G) const {
case ISD::UBSANTRAP: return "ubsantrap";
case ISD::LIFETIME_START: return "lifetime.start";
case ISD::LIFETIME_END: return "lifetime.end";
+ case ISD::FAKE_USE:
+ return "fake_use";
case ISD::PSEUDO_PROBE:
return "pseudoprobe";
case ISD::GC_TRANSITION_START: return "gc_transition.start";
diff --git a/llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp b/llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp
index 09bde54b9aaa5d..8e268d4f4968ea 100644
--- a/llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp
@@ -804,6 +804,50 @@ static void reportFastISelFailure(MachineFunction &MF,
LLVM_DEBUG(dbgs() << R.getMsg() << "\n");
}
+// Detect any fake uses that follow a tail call and move them before the tail
+// call. Ignore fake uses that use values that are def'd by or after the tail
+// call.
+static void preserveFakeUses(BasicBlock::iterator Begin,
+ BasicBlock::iterator End) {
+ BasicBlock::iterator I = End;
+ if (--I == Begin || !isa<ReturnInst>(*I))
+ return;
+ // Detect whether there are any fake uses trailing a (potential) tail call.
+ bool HaveFakeUse = false;
+ bool HaveTailCall = false;
+ do {
+ if (const CallInst *CI = dyn_cast<CallInst>(--I))
+ if (CI->isTailCall()) {
+ HaveTailCall = true;
+ break;
+ }
+ if (const IntrinsicInst *II = dyn_cast<IntrinsicInst>(I))
+ if (II->getIntrinsicID() == Intrinsic::fake_use)
+ HaveFakeUse = true;
+ } while (I != Begin);
+
+ // If we didn't find any tail calls followed by fake uses, we are done.
+ if (!HaveTailCall || !HaveFakeUse)
+ return;
+
+ SmallVector<IntrinsicInst *> FakeUses;
+ // Record the fake uses we found so we can move them to the front of the
+ // tail call. Ignore them if they use a value that is def'd by or after
+ // the tail call.
+ for (BasicBlock::iterator Inst = I; Inst != End; Inst++) {
+ if (IntrinsicInst *FakeUse = dyn_cast<IntrinsicInst>(Inst);
+ FakeUse && FakeUse->getIntrinsicID() == Intrinsic::fake_use) {
+ if (auto UsedDef = dyn_cast<Instruction>(FakeUse->getOperand(0));
+ !UsedDef || UsedDef->getParent() != I->getParent() ||
+ UsedDef->comesBefore(&*I))
+ FakeUses.push_back(FakeUse);
+ }
+ }
+
+ for (auto *Inst : FakeUses)
+ Inst->moveBefore(*Inst->getParent(), I);
+}
+
void SelectionDAGISel::SelectBasicBlock(BasicBlock::const_iterator Begin,
BasicBlock::const_iterator End,
bool &HadTailCall) {
@@ -1665,6 +1709,16 @@ void SelectionDAGISel::SelectAllBasicBlocks(const Function &Fn) {
FuncInfo->VisitedBBs[LLVMBB->getNumber()] = true;
}
+ // Fake uses that follow tail calls are dropped. To avoid this, move
+ // such fake uses in front of the tail call, provided they don't
+ // use anything def'd by or after the tail call.
+ {
+ BasicBlock::iterator BBStart =
+ const_cast<BasicBlock *>(LLVMBB)->getFirstNonPHI()->getIterator();
+ BasicBlock::iterator BBEnd = const_cast<BasicBlock *>(LLVMBB)->end();
+ preserveFakeUses(BBStart, BBEnd);
+ }
+
BasicBlock::const_iterator const Begin =
LLVMBB->getFirstNonPHI()->getIterator();
BasicBlock::const_iterator const End = LLVMBB->end();
@@ -2448,6 +2502,13 @@ void SelectionDAGISel::Select_UNDEF(SDNode *N) {
CurDAG->SelectNodeTo(N, TargetOpcode::IMPLICIT_DEF, N->getValueType(0));
}
+// Use the generic target opcode FAKE_USE. The chain operand
+// must come last, because InstrEmitter::AddOperand() requires it.
+void SelectionDAGISel::Select_FAKE_USE(SDNode *N) {
+ CurDAG->SelectNodeTo(N, TargetOpcode::FAKE_USE, N->getValueType(0),
+ N->getOperand(1), N->getOperand(0));
+}
+
void SelectionDAGISel::Select_FREEZE(SDNode *N) {
// TODO: We don't have FREEZE pseudo-instruction in MachineInstr-level now.
// If FREEZE instruction is added later, the code below must be changed as
@@ -3219,6 +3280,9 @@ void SelectionDAGISel::SelectCodeCommon(SDNode *NodeToMatch,
case ISD::UNDEF:
Select_UNDEF(NodeToMatch);
return;
+ case ISD::FAKE_USE:
+ Select_FAKE_USE(NodeToMatch);
+ return;
case ISD::FREEZE:
Select_FREEZE(NodeToMatch);
return;
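
The effect of `preserveFakeUses` above can be sketched with a hypothetical IR fragment (illustrative names, not taken from this patch): fake uses that trail a tail call and do not use its result are hoisted in front of the call, while a fake use of the call's own result is left alone (and later dropped).

```llvm
; Sketch only; @g, @f and the value names are illustrative.
declare i32 @g(i32)
declare void @llvm.fake.use(...)

define i32 @f(i32 %v) {
entry:
  %r = tail call i32 @g(i32 %v)
  ; Uses the tail call's result: def'd by the call, so not moved.
  notail call void (...) @llvm.fake.use(i32 %r)
  ; Uses only %v: moved in front of the tail call by preserveFakeUses.
  notail call void (...) @llvm.fake.use(i32 %v)
  ret i32 %r
}
```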
diff --git a/llvm/lib/IR/Instruction.cpp b/llvm/lib/IR/Instruction.cpp
index 6f0f3f244c050c..62d88ce21657b2 100644
--- a/llvm/lib/IR/Instruction.cpp
+++ b/llvm/lib/IR/Instruction.cpp
@@ -1171,7 +1171,10 @@ Instruction::getNextNonDebugInstruction(bool SkipPseudoOp) const {
const Instruction *
Instruction::getPrevNonDebugInstruction(bool SkipPseudoOp) const {
for (const Instruction *I = getPrevNode(); I; I = I->getPrevNode())
- if (!isa<DbgInfoIntrinsic>(I) && !(SkipPseudoOp && isa<PseudoProbeInst>(I)))
+ if (!isa<DbgInfoIntrinsic>(I) &&
+ !(SkipPseudoOp && isa<PseudoProbeInst>(I)) &&
+ !(isa<IntrinsicInst>(I) &&
+ cast<IntrinsicInst>(I)->getIntrinsicID() == Intrinsic::fake_use))
return I;
return nullptr;
}
diff --git a/llvm/lib/IR/Verifier.cpp b/llvm/lib/IR/Verifier.cpp
index 2c0f10a34f919d..79b3ca3b6a5a7e 100644
--- a/llvm/lib/IR/Verifier.cpp
+++ b/llvm/lib/IR/Verifier.cpp
@@ -5128,6 +5128,7 @@ void Verifier::visitInstruction(Instruction &I) {
F->getIntrinsicID() ==
Intrinsic::experimental_patchpoint_void ||
F->getIntrinsicID() == Intrinsic::experimental_patchpoint ||
+ F->getIntrinsicID() == Intrinsic::fake_use ||
F->getIntrinsicID() == Intrinsic::experimental_gc_statepoint ||
F->getIntrinsicID() == Intrinsic::wasm_rethrow ||
IsAttachedCallOperand(F, CBI, i),
diff --git a/llvm/lib/Target/X86/X86FloatingPoint.cpp b/llvm/lib/Target/X86/X86FloatingPoint.cpp
index 02c3ca9839fc2d..ea94a4be32b2fa 100644
--- a/llvm/lib/Target/X86/X86FloatingPoint.cpp
+++ b/llvm/lib/Target/X86/X86FloatingPoint.cpp
@@ -432,6 +432,24 @@ bool FPS::processBasicBlock(MachineFunction &MF, MachineBasicBlock &BB) {
if (MI.isCall())
FPInstClass = X86II::SpecialFP;
+ // A fake_use with a floating point pseudo register argument that is
+ // killed must behave like any other floating point operation and pop
+ // the floating point stack (this is done in handleSpecialFP()).
+ // Fake_use is, however, unusual in that sometimes its operand is not
+ // killed because a later instruction (probably a return) will use it.
+ // It is this instruction that will pop the stack.
+ // In this scenario we can safely remove the fake_use's operand
+ // (it is live anyway).
+ if (MI.isFakeUse()) {
+ const MachineOperand &MO = MI.getOperand(0);
+ if (MO.isReg() && X86::RFP80RegClass.contains(MO.getReg())) {
+ if (MO.isKill())
+ FPInstClass = X86II::SpecialFP;
+ else
+ MI.removeOperand(0);
+ }
+ }
+
if (FPInstClass == X86II::NotFP)
continue; // Efficiently ignore non-fp insts!
@@ -1737,6 +1755,20 @@ void FPS::handleSpecialFP(MachineBasicBlock::iterator &Inst) {
// Don't delete the inline asm!
return;
}
+
+ // FAKE_USE must pop its register operand off the stack if it is killed,
+ // because this constitutes the register's last use. If the operand
+ // is not killed, it will have its last use later, so we leave it alone.
+ // In either case we remove the operand so later passes don't see it.
+ case TargetOpcode::FAKE_USE: {
+ assert(MI.getNumExplicitOperands() == 1 &&
+ "FAKE_USE must have exactly one operand");
+ if (MI.getOperand(0).isKill()) {
+ freeStackSlotBefore(Inst, getFPReg(MI.getOperand(0)));
+ }
+ MI.removeOperand(0);
+ return;
+ }
}
Inst = MBB->erase(Inst); // Remove the pseudo instruction
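
The two cases distinguished above can be sketched with a hypothetical MIR fragment (register numbers and surrounding code are illustrative, not from this patch):

```
  %0:rfp80 = ...          # value live on the x87 stack
  FAKE_USE killed %0      # last use: classified SpecialFP; FPS frees the
                          # stack slot before erasing the pseudo
  ...
  FAKE_USE %1:rfp80       # not killed: the operand is simply removed; a
                          # later user (e.g. the return) pops the stack
```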
diff --git a/llvm/lib/Transforms/Scalar/SROA.cpp b/llvm/lib/Transforms/Scalar/SROA.cpp
index 26b62cb79cdedf..2310cb3a7decbb 100644
--- a/llvm/lib/Transforms/Scalar/SROA.cpp
+++ b/llvm/lib/Transforms/Scalar/SROA.cpp
@@ -3802,6 +3802,12 @@ class AggLoadStoreRewriter : public InstVisitor<AggLoadStoreRewriter, bool> {
struct LoadOpSplitter : public OpSplitter<LoadOpSplitter> {
AAMDNodes AATags;
+ // A vector to hold the split components that we want to emit
+ // separate fake uses for.
+ SmallVector<Value *, 4> Components;
+ // A vector to hold all the fake uses of the struct that we are splitting.
+ // Usually there should only be one, but we are handling the general case.
+ SmallVector<Instruction *, 1> FakeUses;
LoadOpSplitter(Instruction *InsertionPoint, Value *Ptr, Type *BaseTy,
AAMDNodes AATags, Align BaseAlign, const DataLayout &DL,
@@ -3826,10 +3832,32 @@ class AggLoadStoreRewriter : public InstVisitor<AggLoadStoreRewriter, bool> {
GEPOperator::accumulateConstantOffset(BaseTy, GEPIndices, DL, Offset))
Load->setAAMetadata(
AATags.adjustForAccess(Offset.getZExtValue(), Load->getType(), DL));
+ // Record the load so we can generate a fake use for this aggregate
+ // component.
+ Components.push_back(Load);
Agg = IRB.CreateInsertValue(Agg, Load, Indices, Name + ".insert");
LLVM_DEBUG(dbgs() << " to: " << *Load << "\n");
}
+
+ // Stash the fake uses that use the value generated by this instruction.
+ void recordFakeUses(LoadInst &LI) {
+ for (Use &U : LI.uses())
+ if (auto *II = dyn_cast<IntrinsicInst>(U.getUser()))
+ if (II->getIntrinsicID() == Intrinsic::fake_use)
+ FakeUses.push_back(II);
+ }
+
+ // Replace all fake uses of the aggregate with a series of fake uses, one
+ // for each split component.
+ void emitFakeUses() {
+ for (Instruction *I : FakeUses) {
+ IRB.SetInsertPoint(I);
+ for (auto *V : Components)
+ IRB.CreateIntrinsic(Intrinsic::fake_use, {}, {V});
+ I->eraseFromParent();
+ }
+ }
};
bool visitLoadInst(LoadInst &LI) {
@@ -3841,8 +3869,10 @@ class AggLoadStoreRewriter : public InstVisitor<AggLoadStoreRewriter, bool> {
LLVM_DEBUG(dbgs() << " original: " << LI << "\n");
LoadOpSplitter Splitter(&LI, *U, LI.getType(), LI.getAAMetadata(),
getAdjustedAlignment(&LI, 0), DL, IRB);
+ Splitter.recordFakeUses(LI);
Value *V = PoisonValue::get(LI.getType());
Splitter.emitSplitOps(LI.getType(), V, LI.getName() + ".fca");
+ Splitter.emitFakeUses();
Visited.erase(&LI);
LI.replaceAllUsesWith(V);
LI.eraseFromParent();
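
The rewrite performed by `recordFakeUses`/`emitFakeUses` can be sketched as a hypothetical before/after IR fragment (the struct type and value names are illustrative, not from this patch):

```llvm
; Before SROA: one fake use of the whole aggregate load.
;   %agg = load %struct.s, ptr %p
;   notail call void (...) @llvm.fake.use(%struct.s %agg)
;
; After SROA (sketch): one fake use per split component.
;   %agg.0 = load i32, ptr %p
;   %p.1   = getelementptr inbounds %struct.s, ptr %p, i32 0, i32 1
;   %agg.1 = load i32, ptr %p.1
;   notail call void (...) @llvm.fake.use(i32 %agg.0)
;   notail call void (...) @llvm.fake.use(i32 %agg.1)
```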
diff --git a/llvm/lib/Transforms/Utils/CloneFunction.cpp b/llvm/lib/Transforms/Utils/CloneFunction.cpp
index 47e3c03288d979..dc9ca1423f3e79 100644
--- a/llvm/lib/Transforms/Utils/CloneFunction.cpp
+++ b/llvm/lib/Transforms/Utils/CloneFunction.cpp
@@ -513,6 +513,12 @@ void PruningFunctionCloner::CloneBlock(
for (BasicBlock::const_iterator II = StartingInst, IE = --BB->end(); II != IE;
++II) {
+ // Don't clone fake_use as it may suppress many optimizations
+ // due to inlining, especially SROA.
+ if (auto *IntrInst = dyn_cast<IntrinsicInst>(II))
+ if (IntrInst->getIntrinsicID() == Intrinsic::fake_use)
+ continue;
+
Instruction *NewInst = cloneInstruction(II);
NewInst->insertInto(NewBB, NewBB->end());
diff --git a/llvm/lib/Transforms/Utils/Local.cpp b/llvm/lib/Transforms/Utils/Local.cpp
index e4809cd4bb44db..d0669e44f821b3 100644
--- a/llvm/lib/Transforms/Utils/Local.cpp
+++ b/llvm/lib/Transforms/Utils/Local.cpp
@@ -3491,6 +3491,9 @@ static unsigned replaceDominatedUsesWith(Value *From, Value *To,
unsigned Count = 0;
for (Use &U : llvm::make_early_inc_range(From->uses())) {
+ auto *II = dyn_cast<IntrinsicInst>(U.getUser());
+ if (II && II->getIntrinsicID() == Intrinsic::fake_use)
+ continue;
if (!ShouldReplace(Root, U))
continue;
LLVM_DEBUG(dbgs() << "Replace dominated use of '";
diff --git a/llvm/lib/Transforms/Utils/PromoteMemoryToRegister.cpp b/llvm/lib/Transforms/Utils/PromoteMemoryToRegister.cpp
index 5251eb86bca926..1b7912fdf5e304 100644
--- a/llvm/lib/Transforms/Utils/PromoteMemoryToRegister.cpp
+++ b/llvm/lib/Transforms/Utils/PromoteMemoryToRegister.cpp
@@ -80,7 +80,8 @@ bool llvm::isAllocaPromotable(const AllocaInst *AI) {
if (SI->isVolatile())
return false;
} else if (const IntrinsicInst *II = dyn_cast<IntrinsicInst>(U)) {
- if (!II->isLifetimeStartOrEnd() && !II->isDroppable())
+ if (!II->isLifetimeStartOrEnd() && !II->isDroppable() &&
+ II->getIntrinsicID() != Intrinsic::fake_use)
return false;
} else if (const BitCastInst *BCI = dyn_cast<BitCastInst>(U)) {
if (!onlyUsedByLifetimeMarkersOrDroppableInsts(BCI))
diff --git a/llvm/test/CodeGen/MIR/X86/fake-use-phi.mir b/llvm/test/CodeGen/MIR/X86/fake-use-phi.mir
new file mode 100644
index 00000000000000..b8571304d870d8
--- /dev/null
+++ b/llvm/test/CodeGen/MIR/X86/fake-use-phi.mir
@@ -0,0 +1,95 @@
+# RUN: llc < %s -x mir -run-pass=codegenprepare | FileCheck %s --implicit-check-not="llvm.fake.use"
+#
+# When performing return duplication to enable tail call optimization, we
+# clone fake uses that exist in the to-be-eliminated return block into the
+# predecessor blocks. Fake uses of PHI nodes cannot simply be copied: each
+# clone would need the operand incoming from its predecessor. We are
+# currently unable to do this correctly, so we suppress the cloning of
+# such fake uses for now.
+#
+# There should be no fake use of a call result in any of the resulting return
+# blocks.
+
+
+# CHECK: declare void @llvm.fake.use
+
+# Fake uses of `this` should be duplicated into both return blocks.
+# CHECK: if.then:
+# CHECK: @llvm.fake.use({{.*}}this
+# CHECK: if.else:
+# CHECK: @llvm.fake.use({{.*}}this
+
+--- |
+ source_filename = "test.ll"
+
+ %class.a = type { i8 }
+
+ declare void @llvm.fake.use(...)
+ declare i32 @foo(ptr nonnull dereferenceable(1)) local_unnamed_addr
+ declare i32 @bar(ptr nonnull dereferenceable(1)) local_unnamed_addr
+
+ define hidden void @func(ptr nonnull dereferenceable(1) %this) local_unnamed_addr align 2 {
+ entry:
+ %b = getelementptr inbounds %class.a, ptr %this, i64 0, i32 0
+ %0 = load i8, i8* %b, align 1
+ %tobool.not = icmp eq i8 %0, 0
+ br i1 %tobool.not, label %if.else, label %if.then
+
+ if.then: ; preds = %entry
+ %call = tail call i32 @foo(ptr nonnull dereferenceable(1) %this)
+ %call2 = tail call i32 @bar(ptr nonnull dereferenceable(1) %this)
+ br label %if.end
+
+ if.else: ; preds = %entry
+ %call4 = tail call i32 @bar(ptr nonnull dereferenceable(1) %this)
+ %call5 = tail call i32 @foo(ptr nonnull dereferenceable(1) %this)
+ br label %if.end
+
+ if.end: ; preds = %if.else, %if.then
+ %call4.sink = phi i32 [ %call4, %if.else ], [ %call, %if.then ]
+ notail call void (...) @llvm.fake.use(i32 %call4.sink)
+ notail call void (...) @llvm.fake.use(ptr nonnull %this)
+ ret void
+ }
+
+...
+---
+name: func
+alignment: 16
+exposesReturnsTwice: false
+legalized: false
+regBankSelected: false
+selected: false
+failedISel: false
+tracksRegLiveness: true
+hasWinCFI: false
+registers: []
+liveins: []
+frameInfo:
+ isFrameAddressTaken: false
+ isReturnAddressTaken: false
+ hasStackMap: false
+ hasPatchPoint: false
+ stackSize: 0
+ offsetAdjustment: 0
+ maxAlignment: 1
+ adjustsStack: false
+ hasCalls: false
+ stackProtector: ''
+ maxCallFrameSize: 4294967295
+ cvBytesOfCalleeSavedRegisters: 0
+ hasOpaqueSPAdjustment: false
+ hasVAStart: false
+ hasMustTailInVarArgFunc: false
+ localFrameSize: 0
+ savePoint: ''
+ restorePoint: ''
+fixedStack: []
+stack: []
+callSites: []
+debugValueSubstitutions: []
+constants: []
+machineFunctionInfo: {}
+body: |
+
+...
diff --git a/llvm/test/CodeGen/MIR/X86/fake-use-scheduler.mir b/llvm/test/CodeGen/MIR/X86/fake-use-scheduler.mir
new file mode 100644
index 00000000000000..bb2136e50d5b7e
--- /dev/null
+++ b/llvm/test/CodeGen/MIR/X86/fake-use-scheduler.mir
@@ -0,0 +1,115 @@
+# Prevent the machine scheduler from moving instructions past FAKE_USE.
+# RUN: llc -run-pass machine-scheduler -o - %s | FileCheck %s
+#
+# We make sure that, beginning with the first FAKE_USE instruction,
+# the scheduler makes no changes to the sequence of instructions. We
+# don't bother to check that the relative order of the FAKE_USEs
+# remains the same; it should, but it is irrelevant here.
+#
+# CHECK: bb.{{.*}}:
+# CHECK: FAKE_USE
+# CHECK-NEXT: FAKE_USE
+# CHECK-NEXT: FAKE_USE
+# CHECK-NEXT: FAKE_USE
+# CHECK-NEXT: COPY
+# CHECK-NEXT: RET
+#
+--- |
+ @glb = common dso_local local_unnamed_addr global [100 x i32] zeroinitializer, align 16
+
+ ; Function Attrs: nounwind uwtable
+ define dso_local i64 @foo(i32* %p) local_unnamed_addr {
+ entry:
+ %0 = load i32, i32* getelementptr inbounds ([100 x i32], [100 x i32]* @glb, i64 0, i64 0), align 16, !tbaa !2
+ store i32 %0, i32* %p, align 4, !tbaa !2
+ %conv = sext i32 %0 to i64
+ %1 = load i32, i32* getelementptr inbounds ([100 x i32], [100 x i32]* @glb, i64 0, i64 1), align 4, !tbaa !2
+ %arrayidx1 = getelementptr inbounds i32, i32* %p, i64 1
+ store i32 %1, i32* %arrayidx1, align 4, !tbaa !2
+ %conv2 = sext i32 %1 to i64
+ %add3 = add nsw i64 %conv2, %conv
+ notail call void (...) @llvm.fake.use(i64 %add3)
+ notail call void (...) @llvm.fake.use(i32 %1)
+ notail call void (...) @llvm.fake.use(i32 %0)
+ notail call void (...) @llvm.fake.use(i32* %p)
+ ret i64 %add3
+ }
+
+ ; Function Attrs: nounwind
+ declare void @llvm.fake.use(...) #1
+
+ ; Function Attrs: nounwind
+ declare void @llvm.stackprotector(i8*, i8**) #1
+
+
+ !llvm.module.flags = !{!0}
+ !llvm.ident = !{!1}
+
+ !0 = !{i32 1, !"wchar_size", i32 4}
+ !1 = !{!"clang version 9.0.0"}
+ !2 = !{!3, !3, i64 0}
+ !3 = !{!"int", !4, i64 0}
+ !4 = !{!"omnipotent char", !5, i64 0}
+ !5 = !{!"Simple C/C++ TBAA"}
+
+...
+---
+name: foo
+alignment: 4
+exposesReturnsTwice: false
+legalized: false
+regBankSelected: false
+selected: false
+failedISel: false
+tracksRegLiveness: true
+hasWinCFI: false
+registers:
+ - { id: 0, class: gr64, preferred-register: '' }
+ - { id: 1, class: gr64_with_sub_8bit, preferred-register: '' }
+ - { id: 2, class: gr32, preferred-register: '' }
+ - { id: 3, class: gr64_with_sub_8bit, preferred-register: '' }
+ - { id: 4, class: gr32, preferred-register: '' }
+ - { id: 5, class: gr64, preferred-register: '' }
+liveins:
+ - { reg: '$rdi', virtual-reg: '%0' }
+frameInfo:
+ isFrameAddressTaken: false
+ isReturnAddressTaken: false
+ hasStackMap: false
+ hasPatchPoint: false
+ stackSize: 0
+ offsetAdjustment: 0
+ maxAlignment: 0
+ adjustsStack: false
+ hasCalls: false
+ stackProtector: ''
+ maxCallFrameSize: 4294967295
+ cvBytesOfCalleeSavedRegisters: 0
+ hasOpaqueSPAdjustment: false
+ hasVAStart: false
+ hasMustTailInVarArgFunc: false
+ localFrameSize: 0
+ savePoint: ''
+ restorePoint: ''
+fixedStack: []
+stack: []
+constants: []
+body: |
+ bb.0.entry:
+ liveins: $rdi
+
+ %0:gr64 = COPY $rdi
+ %1:gr64_with_sub_8bit = MOVSX64rm32 $rip, 1, $noreg, @glb, $noreg :: (dereferenceable load 4 from `i32* getelementptr inbounds ([100 x i32], [100 x i32]* @glb, i64 0, i64 0)`, align 16, !tbaa !2)
+ MOV32mr %0, 1, $noreg, 0, $noreg, %1.sub_32bit :: (store 4 into %ir.p, !tbaa !2)
+ %3:gr64_with_sub_8bit = MOVSX64rm32 $rip, 1, $noreg, @glb + 4, $noreg :: (dereferenceable load 4 from `i32* getelementptr inbounds ([100 x i32], [100 x i32]* @glb, i64 0, i64 1)`, !tbaa !2)
+ MOV32mr %0, 1, $noreg, 4, $noreg, %3.sub_32bit :: (store 4 into %ir.arrayidx1, !tbaa !2)
+ %5:gr64 = COPY %3
+ %5:gr64 = nsw ADD64rr %5, %1, implicit-def dead $eflags
+ FAKE_USE %5
+ FAKE_USE %3.sub_32bit
+ FAKE_USE %1.sub_32bit
+ FAKE_USE %0
+ $rax = COPY %5
+ RET 0, killed $rax
+
+...
diff --git a/llvm/test/CodeGen/MIR/X86/fake-use-tailcall.mir b/llvm/test/CodeGen/MIR/X86/fake-use-tailcall.mir
new file mode 100644
index 00000000000000..89d3854ac95f88
--- /dev/null
+++ b/llvm/test/CodeGen/MIR/X86/fake-use-tailcall.mir
@@ -0,0 +1,106 @@
+# In certain cases CodeGenPrepare folds a return instruction into
+# the return block's predecessor blocks and subsequently deletes the return block.
+# The purpose of this is to enable tail call optimization in the predecessor blocks.
+# Removal of the return block also removes fake use instructions that were present
+# in the return block, potentially causing debug information to be lost.
+#
+# The fix is to clone any fake use instructions that are not dominated by definitions
+# in the return block itself into the predecessor blocks. This test ensures that we do so.
+#
+# Generated from the following source with
+# clang -fextend-lifetimes -S -emit-llvm -O2 -mllvm -stop-before=codegenprepare -o test.mir test.c
+#
+# extern int f0();
+# extern int f1();
+#
+# int foo(int i) {
+# int temp = i;
+# if (temp == 0)
+# temp = f0();
+# else
+# temp = f1();
+# return temp;
+# }
+#
+# RUN: llc -run-pass=codegenprepare -o - %s | FileCheck %s
+#
+# CHECK: define{{.*}}foo
+# CHECK: if.then:
+# CHECK-NEXT: call{{.*}}fake.use(i32 %i)
+# CHECK-NEXT: tail call i32{{.*}}@f0
+# CHECK-NEXT: ret
+# CHECK: if.else:
+# CHECK-NEXT: call{{.*}}fake.use(i32 %i)
+# CHECK-NEXT: tail call i32{{.*}}@f1
+# CHECK-NEXT: ret
+
+--- |
+ define hidden i32 @foo(i32 %i) local_unnamed_addr {
+ entry:
+ %cmp = icmp eq i32 %i, 0
+ br i1 %cmp, label %if.then, label %if.else
+
+ if.then:
+ %call = tail call i32 (...) @f0()
+ br label %if.end
+
+ if.else:
+ %call1 = tail call i32 (...) @f1()
+ br label %if.end
+
+ if.end:
+ %temp.0 = phi i32 [ %call, %if.then ], [ %call1, %if.else ]
+ notail call void (...) @llvm.fake.use(i32 %temp.0)
+ notail call void (...) @llvm.fake.use(i32 %i)
+ ret i32 %temp.0
+ }
+ declare i32 @f0(...) local_unnamed_addr
+ declare i32 @f1(...) local_unnamed_addr
+ declare void @llvm.fake.use(...)
+
+ !llvm.module.flags = !{!0}
+ !llvm.ident = !{!1}
+
+ !0 = !{i32 1, !"wchar_size", i32 2}
+ !1 = !{!"clang version 10.0.0"}
+
+...
+---
+name: foo
+alignment: 16
+exposesReturnsTwice: false
+legalized: false
+regBankSelected: false
+selected: false
+failedISel: false
+tracksRegLiveness: true
+hasWinCFI: false
+registers: []
+liveins: []
+frameInfo:
+ isFrameAddressTaken: false
+ isReturnAddressTaken: false
+ hasStackMap: false
+ hasPatchPoint: false
+ stackSize: 0
+ offsetAdjustment: 0
+ maxAlignment: 1
+ adjustsStack: false
+ hasCalls: false
+ stackProtector: ''
+ maxCallFrameSize: 4294967295
+ cvBytesOfCalleeSavedRegisters: 0
+ hasOpaqueSPAdjustment: false
+ hasVAStart: false
+ hasMustTailInVarArgFunc: false
+ localFrameSize: 0
+ savePoint: ''
+ restorePoint: ''
+fixedStack: []
+stack: []
+callSites: []
+constants: []
+machineFunctionInfo: {}
+body: |
+
+...
diff --git a/llvm/test/CodeGen/MIR/X86/fake-use-zero-length.ll b/llvm/test/CodeGen/MIR/X86/fake-use-zero-length.ll
new file mode 100644
index 00000000000000..b23d74814882ab
--- /dev/null
+++ b/llvm/test/CodeGen/MIR/X86/fake-use-zero-length.ll
@@ -0,0 +1,40 @@
+; RUN: llc < %s -stop-after=finalize-isel | FileCheck %s --implicit-check-not=FAKE_USE
+;
+; Make sure SelectionDAG does not crash handling fake uses of zero-length arrays
+; and structs. Check also that they are not propagated.
+;
+; Generated from the following source with
+; clang -fextend-lifetimes -S -emit-llvm -O2 -mllvm -stop-after=safe-stack -o test.mir test.cpp
+;
+; int main ()
+; { int array[0]; }
+;
+;
+; CHECK: liveins: $[[IN_REG:[a-zA-Z0-9]+]]
+; CHECK: %[[IN_VREG:[a-zA-Z0-9]+]]:gr32 = COPY $[[IN_REG]]
+; CHECK: FAKE_USE %[[IN_VREG]]
+
+source_filename = "test.ll"
+target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
+
+define hidden i32 @main([0 x i32] %zero, [1 x i32] %one) local_unnamed_addr {
+entry:
+ notail call void (...) @bar([0 x i32] %zero)
+ notail call void (...) @baz([1 x i32] %one)
+ notail call void (...) @llvm.fake.use([0 x i32] %zero)
+ notail call void (...) @llvm.fake.use([1 x i32] %one)
+ ret i32 0
+}
+
+declare void @bar([0 x i32] %a)
+declare void @baz([1 x i32] %a)
+
+; Function Attrs: nounwind
+declare void @llvm.fake.use(...)
+
+!llvm.module.flags = !{!0, !1}
+!llvm.ident = !{!2}
+
+!0 = !{i32 1, !"wchar_size", i32 2}
+!1 = !{i32 7, !"PIC Level", i32 2}
+!2 = !{!"clang version 10.0.0"}
\ No newline at end of file
diff --git a/llvm/test/CodeGen/X86/fake-use-hpfloat.ll b/llvm/test/CodeGen/X86/fake-use-hpfloat.ll
new file mode 100644
index 00000000000000..9ec53fb9558c55
--- /dev/null
+++ b/llvm/test/CodeGen/X86/fake-use-hpfloat.ll
@@ -0,0 +1,17 @@
+; Check that a fake use of a half-precision float does not trigger an
+; assertion in the DAG legalizer; this exercises the changes to half-float
+; promotion.
+; RUN: llc -O2 -stop-after=finalize-isel -filetype=asm -o - %s | FileCheck %s
+;
+; CHECK: bb.0.entry:
+; CHECK-NEXT: %0:fr16 = FsFLD0SH
+; CHECK-NEXT: FAKE_USE killed %0
+;
+target triple = "x86_64-unknown-unknown"
+
+define void @_Z6doTestv() local_unnamed_addr {
+entry:
+ tail call void (...) @llvm.fake.use(half 0xH0000)
+ ret void
+}
+
+declare void @llvm.fake.use(...)
diff --git a/llvm/test/CodeGen/X86/fake-use-ld.ll b/llvm/test/CodeGen/X86/fake-use-ld.ll
new file mode 100644
index 00000000000000..90ecff6dd59680
--- /dev/null
+++ b/llvm/test/CodeGen/X86/fake-use-ld.ll
@@ -0,0 +1,51 @@
+; RUN: llc -O0 -mtriple=x86_64-unknown-unknown < %s | FileCheck %s
+
+; Checks that fake uses of the FP stack do not cause a crash.
+;
+; /*******************************************************************/
+; extern long double foo(long double, long double, long double);
+;
+; long double actual(long double p1, long double p2, long double p3) {
+; return fmal(p1, p2, p3);
+; }
+; /*******************************************************************/
+
+define x86_fp80 @actual(x86_fp80 %p1, x86_fp80 %p2, x86_fp80 %p3) {
+;
+; CHECK: actual
+;
+entry:
+ %p1.addr = alloca x86_fp80, align 16
+ %p2.addr = alloca x86_fp80, align 16
+ %p3.addr = alloca x86_fp80, align 16
+ store x86_fp80 %p1, ptr %p1.addr, align 16
+ store x86_fp80 %p2, ptr %p2.addr, align 16
+ store x86_fp80 %p3, ptr %p3.addr, align 16
+ %0 = load x86_fp80, ptr %p1.addr, align 16
+ %1 = load x86_fp80, ptr %p2.addr, align 16
+ %2 = load x86_fp80, ptr %p3.addr, align 16
+;
+; CHECK: callq{{.*}}foo
+;
+ %3 = call x86_fp80 @foo(x86_fp80 %0, x86_fp80 %1, x86_fp80 %2)
+ %4 = load x86_fp80, ptr %p1.addr, align 16
+ call void (...) @llvm.fake.use(x86_fp80 %4)
+ %5 = load x86_fp80, ptr %p2.addr, align 16
+ call void (...) @llvm.fake.use(x86_fp80 %5)
+ %6 = load x86_fp80, ptr %p3.addr, align 16
+ call void (...) @llvm.fake.use(x86_fp80 %6)
+;
+; CHECK: ret
+;
+ ret x86_fp80 %3
+}
+
+declare x86_fp80 @foo(x86_fp80, x86_fp80, x86_fp80)
+
+declare void @llvm.fake.use(...)
+
+!llvm.module.flags = !{!0}
+!llvm.ident = !{!1}
+
+!0 = !{i32 1, !"PIC Level", i32 2}
+!1 = !{!"clang version 3.9.0"}
diff --git a/llvm/test/CodeGen/X86/fake-use-simple-tail-call.ll b/llvm/test/CodeGen/X86/fake-use-simple-tail-call.ll
new file mode 100644
index 00000000000000..06b4aae48447cc
--- /dev/null
+++ b/llvm/test/CodeGen/X86/fake-use-simple-tail-call.ll
@@ -0,0 +1,33 @@
+; RUN: llc < %s -mtriple=x86_64-unknown-unknown -O2 -o - \
+; RUN: | FileCheck %s --implicit-check-not=TAILCALL
+; Generated with: clang -emit-llvm -O2 -S -fextend-lifetimes test.cpp -o -
+; =========== test.cpp ===============
+; extern int bar(int);
+; int foo1(int i)
+; {
+; return bar(i);
+; }
+; =========== test.cpp ===============
+
+; CHECK: TAILCALL
+
+; ModuleID = 'test.cpp'
+source_filename = "test.cpp"
+
+define i32 @_Z4foo1i(i32 %i) local_unnamed_addr {
+entry:
+ %call = tail call i32 @_Z3bari(i32 %i)
+ tail call void (...) @llvm.fake.use(i32 %i)
+ ret i32 %call
+}
+
+declare i32 @_Z3bari(i32) local_unnamed_addr
+
+declare void @llvm.fake.use(...)
+
+!llvm.module.flags = !{!0, !1}
+!llvm.ident = !{!2}
+
+!0 = !{i32 1, !"wchar_size", i32 2}
+!1 = !{i32 7, !"PIC Level", i32 2}
+!2 = !{!"clang version 5.0.1"}
diff --git a/llvm/test/CodeGen/X86/fake-use-split-ret.ll b/llvm/test/CodeGen/X86/fake-use-split-ret.ll
new file mode 100644
index 00000000000000..eff11105723142
--- /dev/null
+++ b/llvm/test/CodeGen/X86/fake-use-split-ret.ll
@@ -0,0 +1,53 @@
+; RUN: opt -mtriple=x86_64-unknown-unknown -S -codegenprepare <%s -o - | FileCheck %s
+;
+; Ensure return instruction splitting ignores fake uses.
+;
+; IR Generated with clang -O2 -S -emit-llvm -fextend-lifetimes test.cpp
+;
+;// test.cpp
+;extern int bar(int);
+;
+;int foo2(int i)
+;{
+; --i;
+; if (i <= 0)
+; return -1;
+; return bar(i);
+;}
+
+; ModuleID = 'test.cpp'
+source_filename = "test.cpp"
+target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
+target triple = "x86_64-unknown-unknown"
+
+declare i32 @_Z3bari(i32) local_unnamed_addr
+
+; Function Attrs: nounwind
+declare void @llvm.fake.use(...)
+
+; Function Attrs: nounwind sspstrong uwtable
+define i32 @_Z4foo2i(i32 %i) local_unnamed_addr {
+entry:
+ %dec = add nsw i32 %i, -1
+ %cmp = icmp slt i32 %i, 2
+ br i1 %cmp, label %cleanup, label %if.end
+
+if.end: ; preds = %entry
+ %call = tail call i32 @_Z3bari(i32 %dec)
+; CHECK: ret i32 %call
+ br label %cleanup
+
+cleanup: ; preds = %entry, %if.end
+; CHECK: cleanup:
+ %retval.0 = phi i32 [ %call, %if.end ], [ -1, %entry ]
+ tail call void (...) @llvm.fake.use(i32 %dec)
+; CHECK: ret i32 -1
+ ret i32 %retval.0
+}
+
+!llvm.module.flags = !{!0, !1}
+!llvm.ident = !{!2}
+
+!0 = !{i32 1, !"wchar_size", i32 2}
+!1 = !{i32 7, !"PIC Level", i32 2}
+!2 = !{!"clang version 7.0.0"}
diff --git a/llvm/test/CodeGen/X86/fake-use-sroa.ll b/llvm/test/CodeGen/X86/fake-use-sroa.ll
new file mode 100644
index 00000000000000..883a48fdd66954
--- /dev/null
+++ b/llvm/test/CodeGen/X86/fake-use-sroa.ll
@@ -0,0 +1,54 @@
+; RUN: opt -S -passes=sroa %s | FileCheck %s
+; With fake use intrinsics generated for small aggregates, check that when
+; SROA slices the aggregate, we generate individual fake use intrinsics for
+; the individual values.
+
+; Generated from the following source:
+; struct s {
+; int i;
+; int j;
+; };
+;
+; void foo(struct s S) {
+; }
+;
+; void bar() {
+; int arr[2] = {5, 6};
+; }
+;
+%struct.s = type { i32, i32 }
+@__const.bar.arr = private unnamed_addr constant [2 x i32] [i32 5, i32 6], align 4
+
+; A small struct passed as parameter
+; CHECK-LABEL: define{{.*}}foo
+; CHECK: %[[SLICE1:[^ ]+]] = trunc i64
+; CHECK: %[[SLICE2:[^ ]+]] = trunc i64
+; CHECK-DAG: call{{.*}} @llvm.fake.use(i32 %[[SLICE1]])
+; CHECK-DAG: call{{.*}} @llvm.fake.use(i32 %[[SLICE2]])
+define dso_local void @foo(i64 %S.coerce) {
+entry:
+ %S = alloca %struct.s, align 4
+ store i64 %S.coerce, ptr %S, align 4
+ %fake.use = load %struct.s, ptr %S, align 4
+ notail call void (...) @llvm.fake.use(%struct.s %fake.use)
+ ret void
+}
+
+declare void @llvm.fake.use(...)
+
+; A local variable with a small array type.
+; CHECK-LABEL: define{{.*}}bar
+; CHECK: %[[ARRAYSLICE1:[^ ]+]] = load
+; CHECK: %[[ARRAYSLICE2:[^ ]+]] = load
+; CHECK-DAG: call{{.*}} @llvm.fake.use(i32 %[[ARRAYSLICE1]])
+; CHECK-DAG: call{{.*}} @llvm.fake.use(i32 %[[ARRAYSLICE2]])
+define dso_local void @bar() {
+entry:
+ %arr = alloca [2 x i32], align 4
+ call void @llvm.memcpy.p0i8.p0i8.i64(ptr align 4 %arr, ptr align 4 bitcast (ptr @__const.bar.arr to ptr), i64 8, i1 false)
+ %fake.use = load [2 x i32], ptr %arr, align 4
+ notail call void (...) @llvm.fake.use([2 x i32] %fake.use)
+ ret void
+}
+
+declare void @llvm.memcpy.p0i8.p0i8.i64(ptr nocapture writeonly, ptr nocapture readonly, i64, i1 immarg)
diff --git a/llvm/test/CodeGen/X86/fake-use-suppress-load.ll b/llvm/test/CodeGen/X86/fake-use-suppress-load.ll
new file mode 100644
index 00000000000000..b49091333b0174
--- /dev/null
+++ b/llvm/test/CodeGen/X86/fake-use-suppress-load.ll
@@ -0,0 +1,23 @@
+; Check that redundant loads feeding into fake uses are suppressed.
+; RUN: llc -filetype=asm -o - %s --mtriple=x86_64-unknown-unknown | FileCheck %s
+; The Windows ABI works differently: there is no offset.
+;
+; Look for the spill
+; CHECK: movq %r{{[a-z]+,}} -{{[0-9]+\(%rsp\)}}
+; CHECK-NOT: movq -{{[0-9]+\(%rsp\)}}, %r{{[a-z]+}}
+
+define dso_local i32 @f(ptr %p) local_unnamed_addr {
+entry:
+ call void asm sideeffect "", "~{rax},~{rbx},~{rcx},~{rdx},~{rsi},~{rdi},~{rbp},~{r8},~{r9},~{r10},~{r11},~{r12},~{r13},~{r14},~{r15},~{dirflag},~{fpsr},~{flags}"() #1, !srcloc !2
+ notail call void (...) @llvm.fake.use(ptr %p)
+ ret i32 4
+}
+
+declare void @llvm.fake.use(...) #1
+
+!llvm.module.flags = !{!0}
+!llvm.ident = !{!1}
+
+!0 = !{i32 1, !"wchar_size", i32 4}
+!1 = !{!"clang version 9.0.0"}
+!2 = !{i32 -2147471544}
diff --git a/llvm/test/CodeGen/X86/fake-use-tailcall.ll b/llvm/test/CodeGen/X86/fake-use-tailcall.ll
new file mode 100644
index 00000000000000..4aaea6c5eeb24d
--- /dev/null
+++ b/llvm/test/CodeGen/X86/fake-use-tailcall.ll
@@ -0,0 +1,23 @@
+; RUN: llc < %s -stop-after=finalize-isel -O2 - | FileCheck %s --implicit-check-not FAKE_USE
+; Fake uses following tail calls should be pulled in front
+; of the TCRETURN instruction. Fake uses using something defined by
+; the tail call or after it should be suppressed.
+
+; CHECK: body:
+; CHECK: bb.0.{{.*}}:
+; CHECK: %0:{{.*}}= COPY
+; CHECK: FAKE_USE %0
+; CHECK: TCRETURN
+
+define void @bar(i32 %v) {
+entry:
+ %call = tail call i32 @_Z3fooi(i32 %v)
+ %mul = mul nsw i32 %call, 3
+ notail call void (...) @llvm.fake.use(i32 %mul)
+ notail call void (...) @llvm.fake.use(i32 %call)
+ notail call void (...) @llvm.fake.use(i32 %v)
+ ret void
+}
+
+declare i32 @_Z3fooi(i32) local_unnamed_addr
+declare void @llvm.fake.use(...)
diff --git a/llvm/test/CodeGen/X86/fake-use-vector.ll b/llvm/test/CodeGen/X86/fake-use-vector.ll
new file mode 100644
index 00000000000000..699f3607da0af6
--- /dev/null
+++ b/llvm/test/CodeGen/X86/fake-use-vector.ll
@@ -0,0 +1,45 @@
+; Prevent an assert in the DAG legalizer on fake uses of 1-element vectors.
+; RUN: llc -stop-after=finalize-isel -filetype=asm -o - %s | FileCheck %s
+;
+; ModuleID = 't2.cpp'
+; source_filename = "t2.cpp"
+; target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
+;
+; Check that we get past ISel and generate FAKE_USE machine instructions for
+; one-element vectors.
+;
+; CHECK: bb.0.entry:
+; CHECK-DAG: %1:gr64 = COPY $rdi
+; CHECK-DAG: %0:vr128 = COPY $xmm0
+; CHECK: %2:vr64 =
+; CHECK-DAG: FAKE_USE %1
+; CHECK-DAG: FAKE_USE %0
+; CHECK: RET
+
+
+target triple = "x86_64-unknown-unknown"
+
+; Function Attrs: nounwind sspstrong uwtable
+define <4 x float> @_Z3runDv4_fDv1_x(<4 x float> %r, i64 %b.coerce) local_unnamed_addr #0 {
+entry:
+ %0 = insertelement <1 x i64> undef, i64 %b.coerce, i32 0
+ %1 = bitcast i64 %b.coerce to x86_mmx
+ %2 = tail call <4 x float> @llvm.x86.sse.cvtpi2ps(<4 x float> %r, x86_mmx %1)
+ tail call void (...) @llvm.fake.use(<1 x i64> %0)
+ tail call void (...) @llvm.fake.use(<4 x float> %r)
+ ret <4 x float> %2
+}
+
+; Function Attrs: nounwind readnone
+declare <4 x float> @llvm.x86.sse.cvtpi2ps(<4 x float>, x86_mmx)
+
+; Function Attrs: nounwind
+declare void @llvm.fake.use(...)
+
+attributes #0 = { "target-cpu"="btver2" }
+
+!llvm.module.flags = !{!0}
+!llvm.ident = !{!1}
+
+!0 = !{i32 1, !"PIC Level", i32 2}
+!1 = !{!"clang version 5.0.0"}
diff --git a/llvm/test/CodeGen/X86/fake-use-vector2.ll b/llvm/test/CodeGen/X86/fake-use-vector2.ll
new file mode 100644
index 00000000000000..08d50504a81b8d
--- /dev/null
+++ b/llvm/test/CodeGen/X86/fake-use-vector2.ll
@@ -0,0 +1,33 @@
+; RUN: llc -stop-after=finalize-isel -filetype=asm -o - %s | FileCheck %s
+;
+; Make sure we can split vectors that are used as operands of FAKE_USE.
+
+; Generated from:
+;
+; typedef long __attribute__((ext_vector_type(8))) long8;
+; void test0() { long8 id208 {0, 1, 2, 3, 4, 5, 6, 7}; }
+
+; ModuleID = 't5.cpp'
+source_filename = "t5.cpp"
+
+
+; CHECK: %0:vr256 = VMOV
+; CHECK: %1:vr256 = VMOV
+; CHECK-DAG: FAKE_USE killed %1
+; CHECK-DAG: FAKE_USE killed %0
+; CHECK: RET
+define void @_Z5test0v() local_unnamed_addr #0 {
+entry:
+ tail call void (...) @llvm.fake.use(<8 x i64> <i64 0, i64 1, i64 2, i64 3, i64 4, i64 5, i64 6, i64 7>) #1
+ ret void
+}
+
+declare void @llvm.fake.use(...)
+
+attributes #0 = { "target-cpu"="btver2" }
+
+!llvm.module.flags = !{!0}
+!llvm.ident = !{!1}
+
+!0 = !{i32 1, !"PIC Level", i32 2}
+!1 = !{!"clang version 5.0.0"}
diff --git a/llvm/test/DebugInfo/X86/Inputs/check-fake-use.py b/llvm/test/DebugInfo/X86/Inputs/check-fake-use.py
new file mode 100644
index 00000000000000..3ac6f7eb828cce
--- /dev/null
+++ b/llvm/test/DebugInfo/X86/Inputs/check-fake-use.py
@@ -0,0 +1,100 @@
+# Parse dwarfdump's output to determine whether the location list for the
+# parameter "b" covers all of the function. The script is written in the form
+# of a state machine and expects dwarfdump output to adhere to a certain order:
+# 1) The .debug_info section must appear before the .debug_loc section.
+# 2) The DW_AT_location attribute must appear before the parameter's name in the
+# formal parameter DIE.
+#
+import re
+import sys
+
+DebugInfoPattern = r"\.debug_info contents:"
+SubprogramPattern = r"^0x[0-9a-f]+:\s+DW_TAG_subprogram"
+HighPCPattern = r"DW_AT_high_pc.*0x([0-9a-f]+)"
+FormalPattern = r"^0x[0-9a-f]+:\s+DW_TAG_formal_parameter"
+LocationPattern = r"DW_AT_location\s+\[DW_FORM_sec_offset\].*0x([a-f0-9]+)"
+DebugLocPattern = r'\[0x([a-f0-9]+),\s+0x([a-f0-9]+)\) ".text":'
+
+# States
+LookingForDebugInfo = 0
+LookingForSubProgram = LookingForDebugInfo + 1 # 1
+LookingForHighPC = LookingForSubProgram + 1 # 2
+LookingForFormal = LookingForHighPC + 1 # 3
+LookingForLocation = LookingForFormal + 1 # 4
+DebugLocations = LookingForLocation + 1 # 5
+AllDone = DebugLocations + 1 # 6
+
+# For each state, the state table contains 3-item sublists with the following
+# entries:
+# 1) The regex pattern we use in each state.
+# 2) The state we enter when we have a successful match for the current pattern.
+# 3) The state we enter when we do not have a successful match for the
+# current pattern.
+StateTable = [
+ # LookingForDebugInfo
+ [DebugInfoPattern, LookingForSubProgram, LookingForDebugInfo],
+ # LookingForSubProgram
+ [SubprogramPattern, LookingForHighPC, LookingForSubProgram],
+ # LookingForHighPC
+ [HighPCPattern, LookingForFormal, LookingForHighPC],
+ # LookingForFormal
+ [FormalPattern, LookingForLocation, LookingForFormal],
+ # LookingForLocation
+ [LocationPattern, DebugLocations, LookingForFormal],
+ # DebugLocations
+ [DebugLocPattern, DebugLocations, AllDone],
+ # AllDone
+ [None, AllDone, AllDone],
+]
+
+# Symbolic indices
+StatePattern = 0
+NextState = 1
+FailState = 2
+
+State = LookingForDebugInfo
+FirstBeginOffset = -1
+
+# Read output from file provided as command arg
+with open(sys.argv[1], "r") as dwarf_dump_file:
+ for line in dwarf_dump_file:
+ if State == AllDone:
+ break
+ Pattern = StateTable[State][StatePattern]
+ # print "State: %d - Searching '%s' for '%s'" % (State, line, Pattern)
+ m = re.search(Pattern, line)
+ if m:
+ # Match. Depending on the state, we extract various values.
+ if State == LookingForHighPC:
+ HighPC = int(m.group(1), 16)
+ elif State == DebugLocations:
+ # Extract the range values
+ if FirstBeginOffset == -1:
+ FirstBeginOffset = int(m.group(1), 16)
+ # print "FirstBeginOffset set to %d" % FirstBeginOffset
+ EndOffset = int(m.group(2), 16)
+ # print "EndOffset set to %d" % EndOffset
+ State = StateTable[State][NextState]
+ else:
+ State = StateTable[State][FailState]
+
+Success = True
+
+# Check that the first entry starts at 0 and that the last ending address
+# in our location list is close to the high PC of the subprogram.
+if State != AllDone:
+ print("Error in expected sequence of DWARF information:")
+ print(" State = %d\n" % State)
+ Success = False
+elif FirstBeginOffset == -1:
+ print("Location list for 'b' not found, did the debug info format change?")
+ Success = False
+elif FirstBeginOffset != 0 or abs(EndOffset - HighPC) > 16:
+ print("Location list for 'b' does not cover the whole function:")
+ print(
+ "Location starts at 0x%x, ends at 0x%x, HighPC = 0x%x"
+ % (FirstBeginOffset, EndOffset, HighPC)
+ )
+ Success = False
+
+sys.exit(not Success)
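The table-driven loop in the script above can be exercised in isolation. The sketch below uses made-up states, patterns, and input lines (none of these names come from the patch); it only demonstrates the dispatch idiom: each state holds a pattern, a next-state on match, and a fail-state on miss.

```python
import re

# Each state maps to (pattern, state-on-match, state-on-miss), mirroring
# the StateTable layout in check-fake-use.py.
WANT_HEADER, WANT_VALUE, DONE = 0, 1, 2
table = {
    WANT_HEADER: (r"^header$", WANT_VALUE, WANT_HEADER),
    WANT_VALUE: (r"^value (\d+)$", DONE, WANT_VALUE),
    DONE: (None, DONE, DONE),
}

def run(lines):
    state, captured = WANT_HEADER, None
    for line in lines:
        if state == DONE:
            break
        pattern, on_match, on_miss = table[state]
        m = re.search(pattern, line)
        if m:
            # Extract a value only in the state that captures one.
            if state == WANT_VALUE:
                captured = int(m.group(1))
            state = on_match
        else:
            state = on_miss
    return state, captured

print(run(["junk", "header", "junk", "value 42"]))
```

As in the real script, reaching the terminal state signals that every expected piece of output appeared in order; stopping short signals a format mismatch.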
diff --git a/llvm/test/DebugInfo/X86/fake-use.ll b/llvm/test/DebugInfo/X86/fake-use.ll
new file mode 100644
index 00000000000000..3c4bef1cd7e75c
--- /dev/null
+++ b/llvm/test/DebugInfo/X86/fake-use.ll
@@ -0,0 +1,98 @@
+; REQUIRES: object-emission
+
+; Make sure the fake use of 'b' at the end of 'foo' causes location information for 'b'
+; to extend all the way to the end of the function.
+
+; RUN: %llc_dwarf -O2 -filetype=obj -dwarf-linkage-names=Abstract < %s | llvm-dwarfdump -v - -o %t
+; RUN: %python %p/Inputs/check-fake-use.py %t
+
+; Generated with:
+; clang -O2 -g -S -emit-llvm -fextend-this-ptr fake-use.c
+;
+; int glob[10];
+; extern void bar();
+;
+; int foo(int b, int i)
+; {
+; int loc = glob[i] * 2;
+; if (b) {
+; glob[2] = loc;
+; bar();
+; }
+; return loc;
+; }
+;
+; ModuleID = 't2.c'
+source_filename = "t2.c"
+
+ at glob = common local_unnamed_addr global [10 x i32] zeroinitializer, align 16, !dbg !0
+
+; Function Attrs: nounwind sspstrong uwtable
+define i32 @foo(i32 %b, i32 %i) local_unnamed_addr !dbg !13 {
+entry:
+ tail call void @llvm.dbg.value(metadata i32 %b, i64 0, metadata !17, metadata !20), !dbg !21
+ %idxprom = sext i32 %i to i64, !dbg !22
+ %arrayidx = getelementptr inbounds [10 x i32], [10 x i32]* @glob, i64 0, i64 %idxprom, !dbg !22
+ %0 = load i32, i32* %arrayidx, align 4, !dbg !22, !tbaa !23
+ %mul = shl nsw i32 %0, 1, !dbg !22
+ %tobool = icmp eq i32 %b, 0, !dbg !27
+ br i1 %tobool, label %if.end, label %if.then, !dbg !29
+
+if.then: ; preds = %entry
+ store i32 %mul, i32* getelementptr inbounds ([10 x i32], [10 x i32]* @glob, i64 0, i64 2), align 8, !dbg !30, !tbaa !23
+ tail call void (...) @bar() #2, !dbg !32
+ br label %if.end, !dbg !33
+
+if.end: ; preds = %entry, %if.then
+ tail call void (...) @llvm.fake.use(i32 %b), !dbg !34
+ ret i32 %mul, !dbg !35
+}
+
+declare void @bar(...) local_unnamed_addr
+
+; Function Attrs: nounwind
+declare void @llvm.fake.use(...)
+
+; Function Attrs: nounwind readnone
+declare void @llvm.dbg.value(metadata, i64, metadata, metadata)
+
+!llvm.dbg.cu = !{!1}
+!llvm.module.flags = !{!9, !10, !11}
+!llvm.ident = !{!12}
+
+!0 = distinct !DIGlobalVariableExpression(var: !DIGlobalVariable(name: "glob", scope: !1, file: !2, line: 1, type: !5, isLocal: false, isDefinition: true), expr: !DIExpression())
+!1 = distinct !DICompileUnit(language: DW_LANG_C99, file: !2, producer: "clang version 4.0.0", isOptimized: true, runtimeVersion: 0, emissionKind: FullDebug, enums: !3, globals: !4)
+!2 = !DIFile(filename: "t2.c", directory: "/")
+!3 = !{}
+!4 = !{!0}
+!5 = !DICompositeType(tag: DW_TAG_array_type, baseType: !6, size: 320, align: 32, elements: !7)
+!6 = !DIBasicType(name: "int", size: 32, align: 32, encoding: DW_ATE_signed)
+!7 = !{!8}
+!8 = !DISubrange(count: 10)
+!9 = !{i32 2, !"Dwarf Version", i32 4}
+!10 = !{i32 2, !"Debug Info Version", i32 3}
+!11 = !{i32 1, !"PIC Level", i32 2}
+!12 = !{!"clang version 4.0.0"}
+!13 = distinct !DISubprogram(name: "foo", scope: !2, file: !2, line: 4, type: !14, isLocal: false, isDefinition: true, scopeLine: 5, flags: DIFlagPrototyped, isOptimized: true, unit: !1, retainedNodes: !16)
+!14 = !DISubroutineType(types: !15)
+!15 = !{!6, !6, !6}
+!16 = !{!17, !18, !19}
+!17 = !DILocalVariable(name: "b", arg: 1, scope: !13, file: !2, line: 4, type: !6)
+!18 = !DILocalVariable(name: "i", arg: 2, scope: !13, file: !2, line: 4, type: !6)
+!19 = !DILocalVariable(name: "loc", scope: !13, file: !2, line: 6, type: !6)
+!20 = !DIExpression()
+!21 = !DILocation(line: 4, scope: !13)
+!22 = !DILocation(line: 6, scope: !13)
+!23 = !{!24, !24, i64 0}
+!24 = !{!"int", !25, i64 0}
+!25 = !{!"omnipotent char", !26, i64 0}
+!26 = !{!"Simple C/C++ TBAA"}
+!27 = !DILocation(line: 7, scope: !28)
+!28 = distinct !DILexicalBlock(scope: !13, file: !2, line: 7)
+!29 = !DILocation(line: 7, scope: !13)
+!30 = !DILocation(line: 8, scope: !31)
+!31 = distinct !DILexicalBlock(scope: !28, file: !2, line: 7)
+!32 = !DILocation(line: 9, scope: !31)
+!33 = !DILocation(line: 10, scope: !31)
+!34 = !DILocation(line: 12, scope: !13)
+!35 = !DILocation(line: 11, scope: !13)
diff --git a/llvm/test/Transforms/GVN/fake-use-constprop.ll b/llvm/test/Transforms/GVN/fake-use-constprop.ll
new file mode 100644
index 00000000000000..11ed310083db17
--- /dev/null
+++ b/llvm/test/Transforms/GVN/fake-use-constprop.ll
@@ -0,0 +1,69 @@
+; RUN: opt -passes=gvn -S < %s | FileCheck %s
+;
+; The Global Value Numbering pass (GVN) propagates boolean values
+; that are constant in dominated basic blocks to all the uses
+; in these basic blocks. However, we don't want the constant propagated
+; into fake.use intrinsics since this would render the intrinsic useless
+; with respect to keeping the variable live up until the fake.use.
+; This test checks that we don't generate any fake.uses with constant 0.
+;
+; Reduced from the following test case, generated with clang -O2 -S -emit-llvm -fextend-lifetimes test.c
+;
+; extern void func1();
+; extern int bar();
+; extern void baz(int);
+;
+; int foo(int i, float f, int *punused)
+; {
+; int j = 3*i;
+; if (j > 0) {
+; int m = bar(i);
+; if (m) {
+; char b = f;
+; baz(b);
+; if (b)
+; goto lab;
+; func1();
+; }
+; lab:
+; func1();
+; }
+; return 1;
+; }
+
+;; GVN should propagate a constant value through to a regular call, but not to
+;; a fake use, which should continue to track the original value.
+; CHECK: %[[CONV_VAR:[a-zA-Z0-9]+]] = fptosi
+; CHECK: call {{.+}} @bees(i8 0)
+; CHECK: call {{.+}} @llvm.fake.use(i8 %[[CONV_VAR]])
+
+define i32 @foo(float %f) {
+ %conv = fptosi float %f to i8
+ %tobool3 = icmp eq i8 %conv, 0
+ br i1 %tobool3, label %if.end, label %lab
+
+if.end:
+ tail call void (...) @bees(i8 %conv)
+ tail call void (...) @llvm.fake.use(i8 %conv)
+ br label %lab
+
+lab:
+ ret i32 1
+}
+
+declare i32 @bar(...)
+
+declare void @baz(i32)
+
+declare void @bees(i32)
+
+declare void @func1(...)
+
+; Function Attrs: nounwind
+declare void @llvm.fake.use(...)
+
+!llvm.module.flags = !{!0}
+!llvm.ident = !{!1}
+
+!0 = !{i32 1, !"PIC Level", i32 2}
+!1 = !{!"clang version 3.9.0"}
>From 8452159600cc8bd8e0d734fbe0644ae708679850 Mon Sep 17 00:00:00 2001
From: Stephen Tozer <stephen.tozer at sony.com>
Date: Fri, 22 Mar 2024 12:48:56 +0000
Subject: [PATCH 02/20] Expand FAKE_USE tablegen comment to note special
handling
---
llvm/include/llvm/Target/Target.td | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/llvm/include/llvm/Target/Target.td b/llvm/include/llvm/Target/Target.td
index ee24e9b1ea7c98..b2eb250ae60b60 100644
--- a/llvm/include/llvm/Target/Target.td
+++ b/llvm/include/llvm/Target/Target.td
@@ -1419,7 +1419,9 @@ def FAULTING_OP : StandardPseudoInstruction {
let isBranch = true;
}
def FAKE_USE : StandardPseudoInstruction {
- // An instruction that uses its operands but does nothing.
+ // An instruction that uses its operands but does nothing; this instruction
+ // will be treated specially by CodeGen passes, distinguishing it from any
+ // otherwise equivalent instructions.
let OutOperandList = (outs);
let InOperandList = (ins variable_ops);
let AsmString = "FAKE_USE";
>From 6ef3630fa80d2ad3e8ca31763201eec397140a5b Mon Sep 17 00:00:00 2001
From: Stephen Tozer <stephen.tozer at sony.com>
Date: Tue, 4 Jun 2024 10:29:17 +0100
Subject: [PATCH 03/20] Fix globalisel stuff
---
llvm/lib/CodeGen/GlobalISel/IRTranslator.cpp | 9 +
llvm/lib/CodeGen/GlobalISel/Utils.cpp | 3 +
llvm/lib/CodeGen/MachineSink.cpp | 2 +-
.../SelectionDAG/SelectionDAGBuilder.cpp | 25 ++-
.../CodeGen/MIR/X86/fake-use-scheduler.mir | 172 ++++++++++++++----
.../DebugInfo/X86/Inputs/check-fake-use.py | 39 ++--
llvm/test/DebugInfo/X86/fake-use.ll | 20 +-
.../GlobalISelCombinerEmitter/match-table.td | 54 +++---
8 files changed, 238 insertions(+), 86 deletions(-)
diff --git a/llvm/lib/CodeGen/GlobalISel/IRTranslator.cpp b/llvm/lib/CodeGen/GlobalISel/IRTranslator.cpp
index f44af78cded46d..6ff8aa3bac9991 100644
--- a/llvm/lib/CodeGen/GlobalISel/IRTranslator.cpp
+++ b/llvm/lib/CodeGen/GlobalISel/IRTranslator.cpp
@@ -2193,6 +2193,15 @@ bool IRTranslator::translateKnownIntrinsic(const CallInst &CI, Intrinsic::ID ID,
}
return true;
}
+ case Intrinsic::fake_use: {
+ SmallVector<llvm::SrcOp, 4> VRegs;
+ for (const auto &Arg : CI.args())
+ for (auto VReg : getOrCreateVRegs(*Arg))
+ VRegs.push_back(VReg);
+ MIRBuilder.buildInstr(TargetOpcode::FAKE_USE, std::nullopt, VRegs,
+ MachineInstr::copyFlagsFromInstruction(CI));
+ return true;
+ }
case Intrinsic::dbg_declare: {
const DbgDeclareInst &DI = cast<DbgDeclareInst>(CI);
assert(DI.getVariable() && "Missing variable");
diff --git a/llvm/lib/CodeGen/GlobalISel/Utils.cpp b/llvm/lib/CodeGen/GlobalISel/Utils.cpp
index cfdd9905c16fa6..b1270e7aeb875c 100644
--- a/llvm/lib/CodeGen/GlobalISel/Utils.cpp
+++ b/llvm/lib/CodeGen/GlobalISel/Utils.cpp
@@ -228,6 +228,9 @@ bool llvm::isTriviallyDead(const MachineInstr &MI,
// Don't delete frame allocation labels.
if (MI.getOpcode() == TargetOpcode::LOCAL_ESCAPE)
return false;
+ // Don't delete fake uses.
+ if (MI.getOpcode() == TargetOpcode::FAKE_USE)
+ return false;
// LIFETIME markers should be preserved even if they seem dead.
if (MI.getOpcode() == TargetOpcode::LIFETIME_START ||
MI.getOpcode() == TargetOpcode::LIFETIME_END)
diff --git a/llvm/lib/CodeGen/MachineSink.cpp b/llvm/lib/CodeGen/MachineSink.cpp
index fe515ef5be541f..609f9af9767f5d 100644
--- a/llvm/lib/CodeGen/MachineSink.cpp
+++ b/llvm/lib/CodeGen/MachineSink.cpp
@@ -833,7 +833,7 @@ bool MachineSinking::ProcessBlock(MachineBasicBlock &MBB) {
if (!ProcessedBegin)
--I;
- if (MI.isDebugOrPseudoInstr()) {
+ if (MI.isDebugOrPseudoInstr() || MI.isFakeUse()) {
if (MI.isDebugValue())
ProcessDbgInst(MI);
continue;
diff --git a/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp b/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
index 24f4fdfca21941..20d0453c0992cb 100644
--- a/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
@@ -1622,6 +1622,7 @@ bool SelectionDAGBuilder::handleDebugValue(ArrayRef<const Value *> Values,
SDValue N = NodeMap[V];
if (!N.getNode() && isa<Argument>(V)) // Check unused arguments map.
N = UnusedArgNodeMap[V];
+
if (N.getNode()) {
// Only emit func arg dbg value for non-variadic dbg.values for now.
if (!IsVariadic &&
@@ -7705,13 +7706,27 @@ void SelectionDAGBuilder::visitIntrinsicCall(const CallInst &I,
case Intrinsic::fake_use: {
Value *V = I.getArgOperand(0);
SDValue Ops[2];
- // If this fake use uses an argument that has an empty SDValue, it is a
- // zero-length array or some other type that does not produce a register,
- // so do not translate a fake use for it.
- if (isa<Argument>(V) && !NodeMap[V])
+ // For Values not declared or previously used in this basic block, the
+ // NodeMap will not have an entry, and `getValue` will assert if V has no
+ // valid register value.
+ auto FakeUseValue = [&]() -> SDValue {
+ SDValue &N = NodeMap[V];
+ if (N.getNode())
+ return N;
+
+ // If there's a virtual register allocated and initialized for this
+ // value, use it.
+ if (SDValue copyFromReg = getCopyFromRegs(V, V->getType()))
+ return copyFromReg;
+ // FIXME: Do we want to preserve constants? It seems pointless.
+ if (isa<Constant>(V))
+ return getValue(V);
+ return SDValue();
+ }();
+ if (!FakeUseValue || FakeUseValue.isUndef())
return;
Ops[0] = getRoot();
- Ops[1] = getValue(V);
+ Ops[1] = FakeUseValue;
// Also, do not translate a fake use with an undef operand, or any other
// empty SDValues.
if (!Ops[1] || Ops[1].isUndef())
diff --git a/llvm/test/CodeGen/MIR/X86/fake-use-scheduler.mir b/llvm/test/CodeGen/MIR/X86/fake-use-scheduler.mir
index bb2136e50d5b7e..7c134f2c127dd4 100644
--- a/llvm/test/CodeGen/MIR/X86/fake-use-scheduler.mir
+++ b/llvm/test/CodeGen/MIR/X86/fake-use-scheduler.mir
@@ -1,50 +1,73 @@
# Prevent the machine scheduler from moving instructions past FAKE_USE.
-# RUN: llc -run-pass machine-scheduler -o - %s | FileCheck %s
+# RUN: llc -run-pass machine-scheduler -debug-only=machine-scheduler 2>&1 -o - %s | FileCheck %s
+# REQUIRES: asserts
#
# We make sure that, beginning with the first FAKE_USE instruction,
# no changes to the sequence of instructions are undertaken by the
# scheduler. We don't bother to check that the order of the FAKE_USEs
# remains the same. They should, but it is irrelevant.
#
-# CHECK: bb.{{.*}}:
-# CHECK: FAKE_USE
-# CHECK-NEXT: FAKE_USE
-# CHECK-NEXT: FAKE_USE
-# CHECK-NEXT: FAKE_USE
-# CHECK-NEXT: COPY
-# CHECK-NEXT: RET
+# CHECK: ********** MI Scheduling **********
+# CHECK-NEXT: foo:%bb.0 entry
+# CHECK-NEXT: From: %0:gr64 = COPY $rdi
+# CHECK-NEXT: To: FAKE_USE %5:gr64
+# CHECK-NEXT: RegionInstrs: 7
+#
+# CHECK: ********** MI Scheduling **********
+# CHECK-NEXT: bar:%bb.0 entry
+# CHECK-NEXT: From: %0:gr64 = COPY $rdi
+# CHECK-NEXT: To: RET 0, killed $rax
+# CHECK-NEXT: RegionInstrs: 7
#
--- |
+ ; ModuleID = 'test.ll'
+ source_filename = "test.ll"
+ target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-i128:128-f80:128-n8:16:32:64-S128"
+
@glb = common dso_local local_unnamed_addr global [100 x i32] zeroinitializer, align 16
-
- ; Function Attrs: nounwind uwtable
- define dso_local i64 @foo(i32* %p) local_unnamed_addr {
+
+ define dso_local i64 @foo(ptr %p) local_unnamed_addr {
entry:
- %0 = load i32, i32* getelementptr inbounds ([100 x i32], [100 x i32]* @glb, i64 0, i64 0), align 16, !tbaa !2
- store i32 %0, i32* %p, align 4, !tbaa !2
+ %0 = load i32, ptr @glb, align 16, !tbaa !2
+ store i32 %0, ptr %p, align 4, !tbaa !2
%conv = sext i32 %0 to i64
- %1 = load i32, i32* getelementptr inbounds ([100 x i32], [100 x i32]* @glb, i64 0, i64 1), align 4, !tbaa !2
- %arrayidx1 = getelementptr inbounds i32, i32* %p, i64 1
- store i32 %1, i32* %arrayidx1, align 4, !tbaa !2
+ %1 = load i32, ptr getelementptr inbounds ([100 x i32], ptr @glb, i64 0, i64 1), align 4, !tbaa !2
+ %arrayidx1 = getelementptr inbounds i32, ptr %p, i64 1
+ store i32 %1, ptr %arrayidx1, align 4, !tbaa !2
%conv2 = sext i32 %1 to i64
%add3 = add nsw i64 %conv2, %conv
notail call void (...) @llvm.fake.use(i64 %add3)
notail call void (...) @llvm.fake.use(i32 %1)
notail call void (...) @llvm.fake.use(i32 %0)
- notail call void (...) @llvm.fake.use(i32* %p)
+ notail call void (...) @llvm.fake.use(ptr %p)
ret i64 %add3
}
-
- ; Function Attrs: nounwind
- declare void @llvm.fake.use(...) #1
-
+
+ define dso_local i64 @bar(ptr %p) local_unnamed_addr {
+ entry:
+ %0 = load i32, ptr @glb, align 16, !tbaa !2
+ store i32 %0, ptr %p, align 4, !tbaa !2
+ %conv = sext i32 %0 to i64
+ %1 = load i32, ptr getelementptr inbounds ([100 x i32], ptr @glb, i64 0, i64 1), align 4, !tbaa !2
+ %arrayidx1 = getelementptr inbounds i32, ptr %p, i64 1
+ store i32 %1, ptr %arrayidx1, align 4, !tbaa !2
+ %conv2 = sext i32 %1 to i64
+ %add3 = add nsw i64 %conv2, %conv
+ ret i64 %add3
+ }
+
; Function Attrs: nounwind
- declare void @llvm.stackprotector(i8*, i8**) #1
-
-
+ declare void @llvm.fake.use(...) #0
+
+ ; Function Attrs: nocallback nofree nosync nounwind willreturn
+ declare void @llvm.stackprotector(ptr, ptr) #1
+
+ attributes #0 = { nounwind }
+ attributes #1 = { nocallback nofree nosync nounwind willreturn }
+
!llvm.module.flags = !{!0}
!llvm.ident = !{!1}
-
+
!0 = !{i32 1, !"wchar_size", i32 4}
!1 = !{!"clang version 9.0.0"}
!2 = !{!3, !3, i64 0}
@@ -55,7 +78,7 @@
...
---
name: foo
-alignment: 4
+alignment: 16
exposesReturnsTwice: false
legalized: false
regBankSelected: false
@@ -63,6 +86,15 @@ selected: false
failedISel: false
tracksRegLiveness: true
hasWinCFI: false
+callsEHReturn: false
+callsUnwindInit: false
+hasEHCatchret: false
+hasEHScopes: false
+hasEHFunclets: false
+isOutlined: false
+debugInstrRef: true
+failsVerification: false
+tracksDebugUserValues: false
registers:
- { id: 0, class: gr64, preferred-register: '' }
- { id: 1, class: gr64_with_sub_8bit, preferred-register: '' }
@@ -79,30 +111,36 @@ frameInfo:
hasPatchPoint: false
stackSize: 0
offsetAdjustment: 0
- maxAlignment: 0
+ maxAlignment: 1
adjustsStack: false
hasCalls: false
stackProtector: ''
+ functionContext: ''
maxCallFrameSize: 4294967295
cvBytesOfCalleeSavedRegisters: 0
hasOpaqueSPAdjustment: false
hasVAStart: false
hasMustTailInVarArgFunc: false
+ hasTailCall: false
localFrameSize: 0
savePoint: ''
restorePoint: ''
fixedStack: []
stack: []
+entry_values: []
+callSites: []
+debugValueSubstitutions: []
constants: []
+machineFunctionInfo: {}
body: |
bb.0.entry:
liveins: $rdi
-
+
%0:gr64 = COPY $rdi
- %1:gr64_with_sub_8bit = MOVSX64rm32 $rip, 1, $noreg, @glb, $noreg :: (dereferenceable load 4 from `i32* getelementptr inbounds ([100 x i32], [100 x i32]* @glb, i64 0, i64 0)`, align 16, !tbaa !2)
- MOV32mr %0, 1, $noreg, 0, $noreg, %1.sub_32bit :: (store 4 into %ir.p, !tbaa !2)
- %3:gr64_with_sub_8bit = MOVSX64rm32 $rip, 1, $noreg, @glb + 4, $noreg :: (dereferenceable load 4 from `i32* getelementptr inbounds ([100 x i32], [100 x i32]* @glb, i64 0, i64 1)`, !tbaa !2)
- MOV32mr %0, 1, $noreg, 4, $noreg, %3.sub_32bit :: (store 4 into %ir.arrayidx1, !tbaa !2)
+ %1:gr64_with_sub_8bit = MOVSX64rm32 $rip, 1, $noreg, @glb, $noreg :: (dereferenceable load (s32) from @glb, align 16, !tbaa !2)
+ MOV32mr %0, 1, $noreg, 0, $noreg, %1.sub_32bit :: (store (s32) into %ir.p, !tbaa !2)
+ %3:gr64_with_sub_8bit = MOVSX64rm32 $rip, 1, $noreg, @glb + 4, $noreg :: (dereferenceable load (s32) from `ptr getelementptr inbounds ([100 x i32], ptr @glb, i64 0, i64 1)`, !tbaa !2)
+ MOV32mr %0, 1, $noreg, 4, $noreg, %3.sub_32bit :: (store (s32) into %ir.arrayidx1, !tbaa !2)
%5:gr64 = COPY %3
%5:gr64 = nsw ADD64rr %5, %1, implicit-def dead $eflags
FAKE_USE %5
@@ -113,3 +151,73 @@ body: |
RET 0, killed $rax
...
+---
+name: bar
+alignment: 16
+exposesReturnsTwice: false
+legalized: false
+regBankSelected: false
+selected: false
+failedISel: false
+tracksRegLiveness: true
+hasWinCFI: false
+callsEHReturn: false
+callsUnwindInit: false
+hasEHCatchret: false
+hasEHScopes: false
+hasEHFunclets: false
+isOutlined: false
+debugInstrRef: true
+failsVerification: false
+tracksDebugUserValues: false
+registers:
+ - { id: 0, class: gr64, preferred-register: '' }
+ - { id: 1, class: gr64_with_sub_8bit, preferred-register: '' }
+ - { id: 2, class: gr32, preferred-register: '' }
+ - { id: 3, class: gr64_with_sub_8bit, preferred-register: '' }
+ - { id: 4, class: gr32, preferred-register: '' }
+ - { id: 5, class: gr64_with_sub_8bit, preferred-register: '' }
+liveins:
+ - { reg: '$rdi', virtual-reg: '%0' }
+frameInfo:
+ isFrameAddressTaken: false
+ isReturnAddressTaken: false
+ hasStackMap: false
+ hasPatchPoint: false
+ stackSize: 0
+ offsetAdjustment: 0
+ maxAlignment: 1
+ adjustsStack: false
+ hasCalls: false
+ stackProtector: ''
+ functionContext: ''
+ maxCallFrameSize: 4294967295
+ cvBytesOfCalleeSavedRegisters: 0
+ hasOpaqueSPAdjustment: false
+ hasVAStart: false
+ hasMustTailInVarArgFunc: false
+ hasTailCall: false
+ localFrameSize: 0
+ savePoint: ''
+ restorePoint: ''
+fixedStack: []
+stack: []
+entry_values: []
+callSites: []
+debugValueSubstitutions: []
+constants: []
+machineFunctionInfo: {}
+body: |
+ bb.0.entry:
+ liveins: $rdi
+
+ %0:gr64 = COPY $rdi
+ %1:gr64_with_sub_8bit = MOVSX64rm32 $rip, 1, $noreg, @glb, $noreg :: (dereferenceable load (s32) from @glb, align 16, !tbaa !2)
+ MOV32mr %0, 1, $noreg, 0, $noreg, %1.sub_32bit :: (store (s32) into %ir.p, !tbaa !2)
+ %5:gr64_with_sub_8bit = MOVSX64rm32 $rip, 1, $noreg, @glb + 4, $noreg :: (dereferenceable load (s32) from `ptr getelementptr inbounds ([100 x i32], ptr @glb, i64 0, i64 1)`, !tbaa !2)
+ MOV32mr %0, 1, $noreg, 4, $noreg, %5.sub_32bit :: (store (s32) into %ir.arrayidx1, !tbaa !2)
+ %5:gr64_with_sub_8bit = nsw ADD64rr %5, %1, implicit-def dead $eflags
+ $rax = COPY %5
+ RET 0, killed $rax
+
+...
diff --git a/llvm/test/DebugInfo/X86/Inputs/check-fake-use.py b/llvm/test/DebugInfo/X86/Inputs/check-fake-use.py
index 3ac6f7eb828cce..531d1d5deb1e19 100644
--- a/llvm/test/DebugInfo/X86/Inputs/check-fake-use.py
+++ b/llvm/test/DebugInfo/X86/Inputs/check-fake-use.py
@@ -10,7 +10,8 @@
DebugInfoPattern = r"\.debug_info contents:"
SubprogramPattern = r"^0x[0-9a-f]+:\s+DW_TAG_subprogram"
-HighPCPattern = r"DW_AT_high_pc.*0x([0-9a-f]+)"
+ProloguePattern = r"^\s*0x([0-9a-f]+)\s.+prologue_end"
+EpiloguePattern = r"^\s*0x([0-9a-f]+)\s.+epilogue_begin"
FormalPattern = r"^0x[0-9a-f]+:\s+DW_TAG_formal_parameter"
LocationPattern = r"DW_AT_location\s+\[DW_FORM_sec_offset\].*0x([a-f0-9]+)"
DebugLocPattern = r'\[0x([a-f0-9]+),\s+0x([a-f0-9]+)\) ".text":'
@@ -18,11 +19,12 @@
# States
LookingForDebugInfo = 0
LookingForSubProgram = LookingForDebugInfo + 1 # 1
-LookingForHighPC = LookingForSubProgram + 1 # 2
-LookingForFormal = LookingForHighPC + 1 # 3
-LookingForLocation = LookingForFormal + 1 # 4
-DebugLocations = LookingForLocation + 1 # 5
-AllDone = DebugLocations + 1 # 6
+LookingForFormal = LookingForSubProgram + 1 # 2
+LookingForLocation = LookingForFormal + 1 # 3
+DebugLocations = LookingForLocation + 1 # 4
+LookingForPrologue = DebugLocations + 1 # 5
+LookingForEpilogue = LookingForPrologue + 1 # 6
+AllDone = LookingForEpilogue + 1 # 7
# For each state, the state table contains 3-item sublists with the following
# entries:
@@ -34,15 +36,17 @@
# LookingForDebugInfo
[DebugInfoPattern, LookingForSubProgram, LookingForDebugInfo],
# LookingForSubProgram
- [SubprogramPattern, LookingForHighPC, LookingForSubProgram],
- # LookingForHighPC
- [HighPCPattern, LookingForFormal, LookingForHighPC],
+ [SubprogramPattern, LookingForFormal, LookingForSubProgram],
# LookingForFormal
[FormalPattern, LookingForLocation, LookingForFormal],
# LookingForLocation
[LocationPattern, DebugLocations, LookingForFormal],
# DebugLocations
- [DebugLocPattern, DebugLocations, AllDone],
+ [DebugLocPattern, DebugLocations, LookingForPrologue],
+ # LookingForPrologue
+ [ProloguePattern, LookingForEpilogue, LookingForPrologue],
+ # LookingForEpilogue
+ [EpiloguePattern, AllDone, LookingForEpilogue],
# AllDone
[None, AllDone, AllDone],
]
@@ -61,19 +65,18 @@
if State == AllDone:
break
Pattern = StateTable[State][StatePattern]
- # print "State: %d - Searching '%s' for '%s'" % (State, line, Pattern)
m = re.search(Pattern, line)
if m:
# Match. Depending on the state, we extract various values.
- if State == LookingForHighPC:
- HighPC = int(m.group(1), 16)
+ if State == LookingForPrologue:
+ PrologueEnd = int(m.group(1), 16)
+ elif State == LookingForEpilogue:
+ EpilogueBegin = int(m.group(1), 16)
elif State == DebugLocations:
# Extract the range values
if FirstBeginOffset == -1:
FirstBeginOffset = int(m.group(1), 16)
- # print "FirstBeginOffset set to %d" % FirstBeginOffset
EndOffset = int(m.group(2), 16)
- # print "EndOffset set to %d" % EndOffset
State = StateTable[State][NextState]
else:
State = StateTable[State][FailState]
@@ -89,11 +92,11 @@
elif FirstBeginOffset == -1:
print("Location list for 'b' not found, did the debug info format change?")
Success = False
-elif FirstBeginOffset != 0 or abs(EndOffset - HighPC) > 16:
+elif FirstBeginOffset > PrologueEnd or EndOffset < EpilogueBegin:
print("Location list for 'b' does not cover the whole function:")
print(
- "Location starts at 0x%x, ends at 0x%x, HighPC = 0x%x"
- % (FirstBeginOffset, EndOffset, HighPC)
+ "Prologue to Epilogue = [0x%x, 0x%x), Location range = [0x%x, 0x%x)"
+ % (PrologueEnd, EpilogueBegin, FirstBeginOffset, EndOffset)
)
Success = False
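The revised success condition in the script boils down to an interval check: the location range must begin no later than `prologue_end` and extend at least to `epilogue_begin`. A minimal sketch, with synthetic offsets (all values here are illustrative):

```python
def covers_function(first_begin, end, prologue_end, epilogue_begin):
    # The location range [first_begin, end) must start no later than the
    # end of the prologue and reach at least the start of the epilogue,
    # mirroring the final check in check-fake-use.py.
    return first_begin <= prologue_end and end >= epilogue_begin

# A range spanning prologue to epilogue passes...
print(covers_function(0x0, 0x40, 0x4, 0x3c))  # True
# ...one that stops short of the epilogue does not.
print(covers_function(0x0, 0x30, 0x4, 0x3c))  # False
```

This is why the test no longer needs `DW_AT_high_pc`: bracketing the range between the `prologue_end` and `epilogue_begin` line-table markers avoids hard-coding a tolerance against the function's end address.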
diff --git a/llvm/test/DebugInfo/X86/fake-use.ll b/llvm/test/DebugInfo/X86/fake-use.ll
index 3c4bef1cd7e75c..dab7c92d428049 100644
--- a/llvm/test/DebugInfo/X86/fake-use.ll
+++ b/llvm/test/DebugInfo/X86/fake-use.ll
@@ -2,9 +2,20 @@
; Make sure the fake use of 'b' at the end of 'foo' causes location information for 'b'
; to extend all the way to the end of the function.
+; FIXME: We need to use `-debugger-tune=sce` to disable entry values, since otherwise
+; %b will be preserved as one; there doesn't seem to be another way to disable entry
+; values (--debug-entry-values is only used to force-enable them, not disable them).
-; RUN: %llc_dwarf -O2 -filetype=obj -dwarf-linkage-names=Abstract < %s | llvm-dwarfdump -v - -o %t
+; RUN: %llc_dwarf -O2 -debugger-tune=sce -filetype=obj -dwarf-linkage-names=Abstract < %s | llvm-dwarfdump --debug-info --debug-line -v - -o %t
; RUN: %python %p/Inputs/check-fake-use.py %t
+; RUN: %llc_dwarf -O2 -debugger-tune=sce --global-isel=1 -mtriple=aarch64--linux-gnu -filetype=obj -dwarf-linkage-names=Abstract < %s | llvm-dwarfdump --debug-info --debug-line -v - -o %t
+; RUN: %python %p/Inputs/check-fake-use.py %t
+; RUN: sed -e 's,call void (...) @llvm.fake.use,;,' %s | %llc_dwarf - -O2 -debugger-tune=sce -filetype=obj -dwarf-linkage-names=Abstract | llvm-dwarfdump --debug-info --debug-line -v - -o %t
+; RUN: not %python %p/Inputs/check-fake-use.py %t
+; RUN: sed -e 's,call void (...) @llvm.fake.use,;,' %s \
+; RUN: | %llc_dwarf - -O2 -debugger-tune=sce --global-isel=1 -mtriple=aarch64--linux-gnu -filetype=obj -dwarf-linkage-names=Abstract \
+; RUN: | llvm-dwarfdump --debug-info --debug-line -v - -o %t
+; RUN: not %python %p/Inputs/check-fake-use.py %t
; Generated with:
; clang -O2 -g -S -emit-llvm -fextend-this-ptr fake-use.c
@@ -31,11 +42,14 @@ source_filename = "t2.c"
define i32 @foo(i32 %b, i32 %i) local_unnamed_addr !dbg !13 {
entry:
tail call void @llvm.dbg.value(metadata i32 %b, i64 0, metadata !17, metadata !20), !dbg !21
+ %c = add i32 %b, 42
+ %tobool = icmp sgt i32 %c, 2, !dbg !27
+ tail call void asm sideeffect "", "~{rax},~{rbx},~{rcx},~{rdx},~{rsi},~{rdi},~{rbp},~{r8},~{r9},~{r10},~{r11},~{r12},~{r13},~{r14},~{r15},~{dirflag},~{fpsr},~{flags}"()
+ tail call void (...) @bar() #2, !dbg !32
%idxprom = sext i32 %i to i64, !dbg !22
%arrayidx = getelementptr inbounds [10 x i32], [10 x i32]* @glob, i64 0, i64 %idxprom, !dbg !22
%0 = load i32, i32* %arrayidx, align 4, !dbg !22, !tbaa !23
%mul = shl nsw i32 %0, 1, !dbg !22
- %tobool = icmp eq i32 %b, 0, !dbg !27
br i1 %tobool, label %if.end, label %if.then, !dbg !29
if.then: ; preds = %entry
@@ -44,7 +58,7 @@ if.then: ; preds = %entry
br label %if.end, !dbg !33
if.end: ; preds = %entry, %if.then
- tail call void (...) @llvm.fake.use(i32 %b), !dbg !34
+ call void (...) @llvm.fake.use(i32 %b), !dbg !34
ret i32 %mul, !dbg !35
}
diff --git a/llvm/test/TableGen/GlobalISelCombinerEmitter/match-table.td b/llvm/test/TableGen/GlobalISelCombinerEmitter/match-table.td
index b52849b6bc931d..00601b7ae6e0d2 100644
--- a/llvm/test/TableGen/GlobalISelCombinerEmitter/match-table.td
+++ b/llvm/test/TableGen/GlobalISelCombinerEmitter/match-table.td
@@ -136,14 +136,14 @@ def MyCombiner: GICombiner<"GenMyCombiner", [
// CHECK: const uint8_t *GenMyCombiner::getMatchTable() const {
// CHECK-NEXT: constexpr static uint8_t MatchTable0[] = {
// CHECK-NEXT: GIM_SwitchOpcode, /*MI*/0, /*[*/GIMT_Encode2([[#LOWER:]]), GIMT_Encode2([[#UPPER:]]), /*)*//*default:*//*Label 6*/ GIMT_Encode4([[#DEFAULT:]]),
-// CHECK-NEXT: /*TargetOpcode::COPY*//*Label 0*/ GIMT_Encode4(470), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0),
-// CHECK-NEXT: /*TargetOpcode::G_AND*//*Label 1*/ GIMT_Encode4(506), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0),
-// CHECK-NEXT: /*TargetOpcode::G_STORE*//*Label 2*/ GIMT_Encode4(553), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0),
-// CHECK-NEXT: /*TargetOpcode::G_TRUNC*//*Label 3*/ GIMT_Encode4(587), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0),
-// CHECK-NEXT: /*TargetOpcode::G_SEXT*//*Label 4*/ GIMT_Encode4(610), GIMT_Encode4(0),
-// CHECK-NEXT: /*TargetOpcode::G_ZEXT*//*Label 5*/ GIMT_Encode4(622),
+// CHECK-NEXT: /*TargetOpcode::COPY*//*Label 0*/ GIMT_Encode4(474), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0),
+// CHECK-NEXT: /*TargetOpcode::G_AND*//*Label 1*/ GIMT_Encode4(510), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0),
+// CHECK-NEXT: /*TargetOpcode::G_STORE*//*Label 2*/ GIMT_Encode4(557), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0),
+// CHECK-NEXT: /*TargetOpcode::G_TRUNC*//*Label 3*/ GIMT_Encode4(591), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0), GIMT_Encode4(0),
+// CHECK-NEXT: /*TargetOpcode::G_SEXT*//*Label 4*/ GIMT_Encode4(614), GIMT_Encode4(0),
+// CHECK-NEXT: /*TargetOpcode::G_ZEXT*//*Label 5*/ GIMT_Encode4(626),
// CHECK-NEXT: // Label 0: @[[#%u, mul(UPPER-LOWER, 4) + 10]]
-// CHECK-NEXT: GIM_Try, /*On fail goto*//*Label 7*/ GIMT_Encode4(494), // Rule ID 4 //
+// CHECK-NEXT: GIM_Try, /*On fail goto*//*Label 7*/ GIMT_Encode4(498), // Rule ID 4 //
// CHECK-NEXT: GIM_CheckFeatures, GIMT_Encode2(GIFBS_HasAnswerToEverything),
// CHECK-NEXT: GIM_CheckSimplePredicate, GIMT_Encode2(GICXXPred_Simple_IsRule3Enabled),
// CHECK-NEXT: // MIs[0] a
@@ -156,8 +156,8 @@ def MyCombiner: GICombiner<"GenMyCombiner", [
// CHECK-NEXT: GIM_CheckIsSafeToFold, /*NumInsns*/1,
// CHECK-NEXT: // Combiner Rule #3: InstTest1
// CHECK-NEXT: GIR_DoneWithCustomAction, /*Fn*/GIMT_Encode2(GICXXCustomAction_GICombiner2),
-// CHECK-NEXT: // Label 7: @494
-// CHECK-NEXT: GIM_Try, /*On fail goto*//*Label 8*/ GIMT_Encode4(505), // Rule ID 3 //
+// CHECK-NEXT: // Label 7: @498
+// CHECK-NEXT: GIM_Try, /*On fail goto*//*Label 8*/ GIMT_Encode4(509), // Rule ID 3 //
// CHECK-NEXT: GIM_CheckSimplePredicate, GIMT_Encode2(GICXXPred_Simple_IsRule2Enabled),
// CHECK-NEXT: // MIs[0] a
// CHECK-NEXT: // No operand predicates
@@ -165,10 +165,10 @@ def MyCombiner: GICombiner<"GenMyCombiner", [
// CHECK-NEXT: // No operand predicates
// CHECK-NEXT: // Combiner Rule #2: InstTest0
// CHECK-NEXT: GIR_DoneWithCustomAction, /*Fn*/GIMT_Encode2(GICXXCustomAction_GICombiner1),
-// CHECK-NEXT: // Label 8: @505
+// CHECK-NEXT: // Label 8: @509
// CHECK-NEXT: GIM_Reject,
-// CHECK-NEXT: // Label 1: @506
-// CHECK-NEXT: GIM_Try, /*On fail goto*//*Label 9*/ GIMT_Encode4(552), // Rule ID 6 //
+// CHECK-NEXT: // Label 1: @510
+// CHECK-NEXT: GIM_Try, /*On fail goto*//*Label 9*/ GIMT_Encode4(556), // Rule ID 6 //
// CHECK-NEXT: GIM_CheckSimplePredicate, GIMT_Encode2(GICXXPred_Simple_IsRule5Enabled),
// CHECK-NEXT: GIM_RootCheckType, /*Op*/2, /*Type*/GILLT_s32,
// CHECK-NEXT: // MIs[0] dst
@@ -185,10 +185,10 @@ def MyCombiner: GICombiner<"GenMyCombiner", [
// CHECK-NEXT: GIR_RootToRootCopy, /*OpIdx*/0, // dst
// CHECK-NEXT: GIR_Copy, /*NewInsnID*/0, /*OldInsnID*/1, /*OpIdx*/1, // z
// CHECK-NEXT: GIR_EraseRootFromParent_Done,
-// CHECK-NEXT: // Label 9: @552
+// CHECK-NEXT: // Label 9: @556
// CHECK-NEXT: GIM_Reject,
-// CHECK-NEXT: // Label 2: @553
-// CHECK-NEXT: GIM_Try, /*On fail goto*//*Label 10*/ GIMT_Encode4(586), // Rule ID 5 //
+// CHECK-NEXT: // Label 2: @557
+// CHECK-NEXT: GIM_Try, /*On fail goto*//*Label 10*/ GIMT_Encode4(590), // Rule ID 5 //
// CHECK-NEXT: GIM_CheckSimplePredicate, GIMT_Encode2(GICXXPred_Simple_IsRule4Enabled),
// CHECK-NEXT: // MIs[0] tmp
// CHECK-NEXT: GIM_RecordInsnIgnoreCopies, /*DefineMI*/1, /*MI*/0, /*OpIdx*/0, // MIs[1]
@@ -204,29 +204,29 @@ def MyCombiner: GICombiner<"GenMyCombiner", [
// CHECK-NEXT: GIR_RootToRootCopy, /*OpIdx*/1, // ptr
// CHECK-NEXT: GIR_MergeMemOperands, /*InsnID*/0, /*NumInsns*/2, /*MergeInsnID's*/0, 1,
// CHECK-NEXT: GIR_EraseRootFromParent_Done,
-// CHECK-NEXT: // Label 10: @586
+// CHECK-NEXT: // Label 10: @590
// CHECK-NEXT: GIM_Reject,
-// CHECK-NEXT: // Label 3: @587
-// CHECK-NEXT: GIM_Try, /*On fail goto*//*Label 11*/ GIMT_Encode4(598), // Rule ID 0 //
+// CHECK-NEXT: // Label 3: @591
+// CHECK-NEXT: GIM_Try, /*On fail goto*//*Label 11*/ GIMT_Encode4(602), // Rule ID 0 //
// CHECK-NEXT: GIM_CheckSimplePredicate, GIMT_Encode2(GICXXPred_Simple_IsRule0Enabled),
// CHECK-NEXT: // Combiner Rule #0: WipOpcodeTest0; wip_match_opcode 'G_TRUNC'
// CHECK-NEXT: GIR_DoneWithCustomAction, /*Fn*/GIMT_Encode2(GICXXCustomAction_GICombiner0),
-// CHECK-NEXT: // Label 11: @598
-// CHECK-NEXT: GIM_Try, /*On fail goto*//*Label 12*/ GIMT_Encode4(609), // Rule ID 1 //
+// CHECK-NEXT: // Label 11: @602
+// CHECK-NEXT: GIM_Try, /*On fail goto*//*Label 12*/ GIMT_Encode4(613), // Rule ID 1 //
// CHECK-NEXT: GIM_CheckSimplePredicate, GIMT_Encode2(GICXXPred_Simple_IsRule1Enabled),
// CHECK-NEXT: // Combiner Rule #1: WipOpcodeTest1; wip_match_opcode 'G_TRUNC'
// CHECK-NEXT: GIR_DoneWithCustomAction, /*Fn*/GIMT_Encode2(GICXXCustomAction_GICombiner0),
-// CHECK-NEXT: // Label 12: @609
+// CHECK-NEXT: // Label 12: @613
// CHECK-NEXT: GIM_Reject,
-// CHECK-NEXT: // Label 4: @610
-// CHECK-NEXT: GIM_Try, /*On fail goto*//*Label 13*/ GIMT_Encode4(621), // Rule ID 2 //
+// CHECK-NEXT: // Label 4: @614
+// CHECK-NEXT: GIM_Try, /*On fail goto*//*Label 13*/ GIMT_Encode4(625), // Rule ID 2 //
// CHECK-NEXT: GIM_CheckSimplePredicate, GIMT_Encode2(GICXXPred_Simple_IsRule1Enabled),
// CHECK-NEXT: // Combiner Rule #1: WipOpcodeTest1; wip_match_opcode 'G_SEXT'
// CHECK-NEXT: GIR_DoneWithCustomAction, /*Fn*/GIMT_Encode2(GICXXCustomAction_GICombiner0),
-// CHECK-NEXT: // Label 13: @621
+// CHECK-NEXT: // Label 13: @625
// CHECK-NEXT: GIM_Reject,
-// CHECK-NEXT: // Label 5: @622
-// CHECK-NEXT: GIM_Try, /*On fail goto*//*Label 14*/ GIMT_Encode4(656), // Rule ID 7 //
+// CHECK-NEXT: // Label 5: @626
+// CHECK-NEXT: GIM_Try, /*On fail goto*//*Label 14*/ GIMT_Encode4(660), // Rule ID 7 //
// CHECK-NEXT: GIM_CheckSimplePredicate, GIMT_Encode2(GICXXPred_Simple_IsRule6Enabled),
// CHECK-NEXT: // MIs[0] dst
// CHECK-NEXT: // No operand predicates
@@ -240,7 +240,7 @@ def MyCombiner: GICombiner<"GenMyCombiner", [
// CHECK-NEXT: GIR_RootToRootCopy, /*OpIdx*/0, // dst
// CHECK-NEXT: GIR_AddSimpleTempRegister, /*InsnID*/0, /*TempRegID*/0,
// CHECK-NEXT: GIR_EraseRootFromParent_Done,
-// CHECK-NEXT: // Label 14: @656
+// CHECK-NEXT: // Label 14: @660
// CHECK-NEXT: GIM_Reject,
// CHECK-NEXT: // Label 6: @[[#%u, DEFAULT]]
// CHECK-NEXT: GIM_Reject,
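An aside on the offset churn in the match-table diff above (470→474, 506→510, and so on): every absolute label appears to have moved by exactly 4 bytes, consistent with the opcode switch table at the head of `MatchTable0` gaining one `GIMT_Encode4` (4-byte) jump entry when a new target opcode such as FAKE_USE lands inside the switch range. A minimal sketch of that relocation, with illustrative names (not the real TableGen emitter):

```python
# Hedged sketch: why every label in the match-table diff moved by exactly 4.
# The table begins with an opcode switch whose per-opcode entries are 4-byte
# encoded jump targets (GIMT_Encode4). Growing the switch range by one opcode
# adds one entry, so every absolute offset past the table grows by 4 bytes.
ENTRY_SIZE = 4  # width of one GIMT_Encode4 entry, in bytes


def relocate_labels(labels, entries_added):
    """Shift absolute label offsets after the switch table grows."""
    delta = entries_added * ENTRY_SIZE
    return [offset + delta for offset in labels]


# The six per-opcode labels from the CHECK lines above, before and after.
old_labels = [470, 506, 553, 587, 610, 622]
new_labels = relocate_labels(old_labels, 1)
assert new_labels == [474, 510, 557, 591, 614, 626]
```

The uniform +4 shift across every label and fail-target in the hunk is what makes this a mechanical test update rather than a behavioral change.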
>From 8e5d34248a2adf507e9d08cf04855f91481ad84e Mon Sep 17 00:00:00 2001
From: Stephen Tozer <stephen.tozer at sony.com>
Date: Tue, 4 Jun 2024 10:29:42 +0100
Subject: [PATCH 04/20] Python3-ify the check-fake-use script
---
.../DebugInfo/X86/Inputs/check-fake-use.py | 60 ++++++++++++-------
1 file changed, 38 insertions(+), 22 deletions(-)
diff --git a/llvm/test/DebugInfo/X86/Inputs/check-fake-use.py b/llvm/test/DebugInfo/X86/Inputs/check-fake-use.py
index 531d1d5deb1e19..600671ad088bac 100644
--- a/llvm/test/DebugInfo/X86/Inputs/check-fake-use.py
+++ b/llvm/test/DebugInfo/X86/Inputs/check-fake-use.py
@@ -1,3 +1,5 @@
+#!/usr/bin/python3
+
# Parsing dwarfdump's output to determine whether the location list for the
# parameter "b" covers all of the function. The script is written in form of a
# state machine and expects that dwarfdump output adheres to a certain order:
@@ -8,6 +10,8 @@
import re
import sys
+from enum import IntEnum, auto
+
DebugInfoPattern = r"\.debug_info contents:"
SubprogramPattern = r"^0x[0-9a-f]+:\s+DW_TAG_subprogram"
ProloguePattern = r"^\s*0x([0-9a-f]+)\s.+prologue_end"
@@ -17,14 +21,15 @@
DebugLocPattern = r'\[0x([a-f0-9]+),\s+0x([a-f0-9]+)\) ".text":'
# States
-LookingForDebugInfo = 0
-LookingForSubProgram = LookingForDebugInfo + 1 # 1
-LookingForFormal = LookingForSubProgram + 1 # 2
-LookingForLocation = LookingForFormal + 1 # 3
-DebugLocations = LookingForLocation + 1 # 4
-LookingForPrologue = DebugLocations + 1 # 5
-LookingForEpilogue = LookingForPrologue + 1 # 6
-AllDone = LookingForEpilogue + 1 # 7
+class States(IntEnum):
+ LookingForDebugInfo = 0
+ LookingForSubProgram = auto()
+ LookingForFormal = auto()
+ LookingForLocation = auto()
+ DebugLocations = auto()
+ LookingForPrologue = auto()
+ LookingForEpilogue = auto()
+ AllDone = auto()
# For each state, the state table contains 3-item sublists with the following
# entries:
@@ -34,21 +39,21 @@
# current pattern.
StateTable = [
# LookingForDebugInfo
- [DebugInfoPattern, LookingForSubProgram, LookingForDebugInfo],
+ [DebugInfoPattern, States.LookingForSubProgram, States.LookingForDebugInfo],
# LookingForSubProgram
- [SubprogramPattern, LookingForFormal, LookingForSubProgram],
+ [SubprogramPattern, States.LookingForFormal, States.LookingForSubProgram],
# LookingForFormal
- [FormalPattern, LookingForLocation, LookingForFormal],
+ [FormalPattern, States.LookingForLocation, States.LookingForFormal],
# LookingForLocation
- [LocationPattern, DebugLocations, LookingForFormal],
+ [LocationPattern, States.DebugLocations, States.LookingForFormal],
# DebugLocations
- [DebugLocPattern, DebugLocations, LookingForPrologue],
+ [DebugLocPattern, States.DebugLocations, States.LookingForPrologue],
# LookingForPrologue
- [ProloguePattern, LookingForEpilogue, LookingForPrologue],
+ [ProloguePattern, States.LookingForEpilogue, States.LookingForPrologue],
# LookingForEpilogue
- [EpiloguePattern, AllDone, LookingForEpilogue],
+ [EpiloguePattern, States.AllDone, States.LookingForEpilogue],
# AllDone
- [None, AllDone, AllDone],
+ [None, States.AllDone, States.AllDone],
]
# Symbolic indices
@@ -56,26 +61,34 @@
NextState = 1
FailState = 2
-State = LookingForDebugInfo
+State = States.LookingForDebugInfo
FirstBeginOffset = -1
+ContinuousLocation = True
+LocationBreak = ()
+LocationRanges = []
# Read output from file provided as command arg
with open(sys.argv[1], "r") as dwarf_dump_file:
for line in dwarf_dump_file:
- if State == AllDone:
+ if State == States.AllDone:
break
Pattern = StateTable[State][StatePattern]
m = re.search(Pattern, line)
if m:
# Match. Depending on the state, we extract various values.
- if State == LookingForPrologue:
+ if State == States.LookingForPrologue:
PrologueEnd = int(m.group(1), 16)
- elif State == LookingForEpilogue:
+ elif State == States.LookingForEpilogue:
EpilogueBegin = int(m.group(1), 16)
- elif State == DebugLocations:
+ elif State == States.DebugLocations:
# Extract the range values
if FirstBeginOffset == -1:
FirstBeginOffset = int(m.group(1), 16)
+ else:
+ NewBeginOffset = int(m.group(1), 16)
+ if NewBeginOffset != EndOffset:
+ ContinuousLocation = False
+ LocationBreak = (EndOffset, NewBeginOffset)
EndOffset = int(m.group(2), 16)
State = StateTable[State][NextState]
else:
@@ -85,13 +98,16 @@
# Check that the first entry start with 0 and that the last ending address
# in our location list is close to the high pc of the subprogram.
-if State != AllDone:
+if State != States.AllDone:
print("Error in expected sequence of DWARF information:")
print(" State = %d\n" % State)
Success = False
elif FirstBeginOffset == -1:
print("Location list for 'b' not found, did the debug info format change?")
Success = False
+elif not ContinuousLocation:
+ print("Location list for 'b' is discontinuous from [0x%x, 0x%x)" % LocationBreak)
+ Success = False
elif FirstBeginOffset > PrologueEnd or EndOffset < EpilogueBegin:
print("Location list for 'b' does not cover the whole function:")
print(
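The structure of the checker refactored above can be sketched compactly: each state maps to a `(pattern, next_state_on_match, state_on_no_match)` triple, and the `IntEnum` change in the patch gives those states readable names while keeping them usable as table indices. A toy, self-contained version of the same table-driven state machine (illustrative patterns, not the real DWARF ones):

```python
# Minimal sketch of the table-driven state machine used by check-fake-use.py.
# Each state maps to (pattern, state on match, state on no match); IntEnum
# keeps the states printable while still ordering and indexing like ints.
import re
from enum import IntEnum, auto


class S(IntEnum):
    LOOKING_FOR_A = 0
    LOOKING_FOR_B = auto()
    ALL_DONE = auto()


TABLE = {
    S.LOOKING_FOR_A: (r"^A", S.LOOKING_FOR_B, S.LOOKING_FOR_A),
    S.LOOKING_FOR_B: (r"^B", S.ALL_DONE, S.LOOKING_FOR_B),
    S.ALL_DONE: (None, S.ALL_DONE, S.ALL_DONE),
}


def run(lines):
    state = S.LOOKING_FOR_A
    for line in lines:
        pattern, on_match, on_fail = TABLE[state]
        if pattern is None:  # terminal state, mirrors the AllDone early break
            break
        state = on_match if re.search(pattern, line) else on_fail
    return state


assert run(["noise", "A1", "noise", "B2"]) == S.ALL_DONE
assert run(["B2", "A1"]) != S.ALL_DONE  # order matters, as in the real checker
```

The real script additionally extracts capture groups per state (prologue/epilogue addresses, location ranges) and, with this patch, records any gap between consecutive ranges so a discontinuous location list for `b` is reported explicitly.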
>From f46db66fec6de831f9f13150c7dc3bee4f94c3bd Mon Sep 17 00:00:00 2001
From: Stephen Tozer <stephen.tozer at sony.com>
Date: Tue, 4 Jun 2024 13:10:10 +0100
Subject: [PATCH 05/20] Pass TII into isLoadFeedingIntoFakeUse
---
llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp b/llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
index 3d21f30ef4dc72..6e0bbbc76b1258 100644
--- a/llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
+++ b/llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
@@ -1075,9 +1075,8 @@ void AsmPrinter::emitFunctionEntryLabel() {
// Recognize cases where a spilled register is reloaded solely to feed into a
// FAKE_USE.
-static bool isLoadFeedingIntoFakeUse(const MachineInstr &MI) {
+static bool isLoadFeedingIntoFakeUse(const MachineInstr &MI, const TargetInstrInfo *TII) {
const MachineFunction *MF = MI.getMF();
- const TargetInstrInfo *TII = MF->getSubtarget().getInstrInfo();
// If the restore size is std::nullopt then we are not dealing with a reload
// of a spilled register.
@@ -1110,7 +1109,7 @@ static void emitComments(const MachineInstr &MI, raw_ostream &CommentOS) {
// instruction, meaning the load value has no effect on the program and has
// only been kept alive for debugging; since it is still available on the
// stack, we can skip the load itself.
- if (isLoadFeedingIntoFakeUse(MI))
+ if (isLoadFeedingIntoFakeUse(MI, TII))
return;
// Check for spills and reloads
@@ -1748,6 +1747,7 @@ void AsmPrinter::emitFunctionBody() {
bool HasAnyRealCode = false;
int NumInstsInFunction = 0;
bool IsEHa = MMI->getModule()->getModuleFlag("eh-asynch");
+ const TargetInstrInfo *TII = MF->getSubtarget().getInstrInfo();
bool CanDoExtraAnalysis = ORE->allowExtraAnalysis(DEBUG_TYPE);
for (auto &MBB : *MF) {
@@ -1855,7 +1855,7 @@ void AsmPrinter::emitFunctionBody() {
// FAKE_USE instruction, meaning the load value has no effect on the
// program and has only been kept alive for debugging; since it is
// still available on the stack, we can skip the load itself.
- if (isLoadFeedingIntoFakeUse(MI))
+ if (isLoadFeedingIntoFakeUse(MI, TII))
break;
emitInstruction(&MI);
if (CanDoExtraAnalysis) {
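The PATCH 05 refactor above is a standard hoisting move: rather than re-deriving `TII` from `MF->getSubtarget().getInstrInfo()` inside a predicate that runs once per machine instruction, the caller computes it once per function body and passes it in. A hedged sketch of the pattern with illustrative names (not the real LLVM API):

```python
# Sketch of the refactor: hoist an expensive per-function lookup out of a
# predicate that is invoked once per instruction. Names are illustrative.
class Subtarget:
    def __init__(self):
        self.lookups = 0

    def get_instr_info(self):
        self.lookups += 1  # stands in for getSubtarget().getInstrInfo()
        return object()


def is_fake_use_reload_old(instr, subtarget):
    tii = subtarget.get_instr_info()  # re-derived on every call
    return False  # predicate body elided


def is_fake_use_reload_new(instr, tii):
    return False  # caller supplies tii once; predicate body elided


instrs = range(1000)

old = Subtarget()
for i in instrs:
    is_fake_use_reload_old(i, old)
assert old.lookups == 1000  # one lookup per instruction

new = Subtarget()
tii = new.get_instr_info()  # hoisted: one lookup per function body
for i in instrs:
    is_fake_use_reload_new(i, tii)
assert new.lookups == 1
```

Both call sites in the patch (`emitComments` and the `emitFunctionBody` loop) take the precomputed pointer, keeping the fast path of instruction emission free of repeated subtarget queries.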
>From 8b23eb357806656a72081deeb5b4ff6db3fd9cf0 Mon Sep 17 00:00:00 2001
From: Stephen Tozer <stephen.tozer at sony.com>
Date: Thu, 6 Jun 2024 12:48:24 +0100
Subject: [PATCH 06/20] Remove unnecessary copyFlagsFromInstruction
---
llvm/lib/CodeGen/GlobalISel/IRTranslator.cpp | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/llvm/lib/CodeGen/GlobalISel/IRTranslator.cpp b/llvm/lib/CodeGen/GlobalISel/IRTranslator.cpp
index 6ff8aa3bac9991..968d0a2a5c75e4 100644
--- a/llvm/lib/CodeGen/GlobalISel/IRTranslator.cpp
+++ b/llvm/lib/CodeGen/GlobalISel/IRTranslator.cpp
@@ -2198,8 +2198,7 @@ bool IRTranslator::translateKnownIntrinsic(const CallInst &CI, Intrinsic::ID ID,
for (const auto &Arg : CI.args())
for (auto VReg : getOrCreateVRegs(*Arg))
VRegs.push_back(VReg);
- MIRBuilder.buildInstr(TargetOpcode::FAKE_USE, std::nullopt, VRegs,
- MachineInstr::copyFlagsFromInstruction(CI));
+ MIRBuilder.buildInstr(TargetOpcode::FAKE_USE, std::nullopt, VRegs);
return true;
}
case Intrinsic::dbg_declare: {
>From 424be4fe2c7d65e4591781159ad8c86e241ca5fe Mon Sep 17 00:00:00 2001
From: Stephen Tozer <stephen.tozer at sony.com>
Date: Thu, 6 Jun 2024 15:03:30 +0100
Subject: [PATCH 07/20] Update tests
---
llvm/test/CodeGen/MIR/X86/fake-use-phi.mir | 95 -------------------
llvm/test/CodeGen/X86/fake-use-hpfloat.ll | 2 +-
.../CodeGen/X86/fake-use-suppress-load.ll | 11 +--
llvm/test/CodeGen/X86/fake-use-tailcall.ll | 17 +++-
.../{MIR => }/X86/fake-use-zero-length.ll | 7 --
.../CodeGenPrepare/X86/fake-use-phi.ll | 50 ++++++++++
.../CodeGenPrepare}/X86/fake-use-split-ret.ll | 14 ---
.../X86 => Transforms/SROA}/fake-use-sroa.ll | 0
8 files changed, 69 insertions(+), 127 deletions(-)
delete mode 100644 llvm/test/CodeGen/MIR/X86/fake-use-phi.mir
rename llvm/test/CodeGen/{MIR => }/X86/fake-use-zero-length.ll (87%)
create mode 100644 llvm/test/Transforms/CodeGenPrepare/X86/fake-use-phi.ll
rename llvm/test/{CodeGen => Transforms/CodeGenPrepare}/X86/fake-use-split-ret.ll (71%)
rename llvm/test/{CodeGen/X86 => Transforms/SROA}/fake-use-sroa.ll (100%)
diff --git a/llvm/test/CodeGen/MIR/X86/fake-use-phi.mir b/llvm/test/CodeGen/MIR/X86/fake-use-phi.mir
deleted file mode 100644
index b8571304d870d8..00000000000000
--- a/llvm/test/CodeGen/MIR/X86/fake-use-phi.mir
+++ /dev/null
@@ -1,95 +0,0 @@
-# RUN: llc < %s -x mir -run-pass=codegenprepare | FileCheck %s --implicit-check-not="llvm.fake.use"
-#
-# When performing return duplication to enable
-# tail call optimization we clone fake uses that exist in the to-be-eliminated
-# return block into the predecessor blocks. When doing this with fake uses
-# of PHI-nodes, they cannot be easily copied, but require the correct operand.
-# We are currently not able to do this correctly, so we suppress the cloning
-# of such fake uses at the moment.
-#
-# There should be no fake use of a call result in any of the resulting return
-# blocks.
-
-
-# CHECK: declare void @llvm.fake.use
-
-# Fake uses of `this` should be duplicated into both return blocks.
-# CHECK: if.then:
-# CHECK: @llvm.fake.use({{.*}}this
-# CHECK: if.else:
-# CHECK: @llvm.fake.use({{.*}}this
-
---- |
- source_filename = "test.ll"
-
- %class.a = type { i8 }
-
- declare void @llvm.fake.use(...)
- declare i32 @foo(ptr nonnull dereferenceable(1)) local_unnamed_addr
- declare i32 @bar(ptr nonnull dereferenceable(1)) local_unnamed_addr
-
- define hidden void @func(ptr nonnull dereferenceable(1) %this) local_unnamed_addr align 2 {
- entry:
- %b = getelementptr inbounds %class.a, ptr %this, i64 0, i32 0
- %0 = load i8, i8* %b, align 1
- %tobool.not = icmp eq i8 %0, 0
- br i1 %tobool.not, label %if.else, label %if.then
-
- if.then: ; preds = %entry
- %call = tail call i32 @foo(ptr nonnull dereferenceable(1) %this)
- %call2 = tail call i32 @bar(ptr nonnull dereferenceable(1) %this)
- br label %if.end
-
- if.else: ; preds = %entry
- %call4 = tail call i32 @bar(ptr nonnull dereferenceable(1) %this)
- %call5 = tail call i32 @foo(ptr nonnull dereferenceable(1) %this)
- br label %if.end
-
- if.end: ; preds = %if.else, %if.then
- %call4.sink = phi i32 [ %call4, %if.else ], [ %call, %if.then ]
- notail call void (...) @llvm.fake.use(i32 %call4.sink)
- notail call void (...) @llvm.fake.use(ptr nonnull %this)
- ret void
- }
-
-...
----
-name: func
-alignment: 16
-exposesReturnsTwice: false
-legalized: false
-regBankSelected: false
-selected: false
-failedISel: false
-tracksRegLiveness: true
-hasWinCFI: false
-registers: []
-liveins: []
-frameInfo:
- isFrameAddressTaken: false
- isReturnAddressTaken: false
- hasStackMap: false
- hasPatchPoint: false
- stackSize: 0
- offsetAdjustment: 0
- maxAlignment: 1
- adjustsStack: false
- hasCalls: false
- stackProtector: ''
- maxCallFrameSize: 4294967295
- cvBytesOfCalleeSavedRegisters: 0
- hasOpaqueSPAdjustment: false
- hasVAStart: false
- hasMustTailInVarArgFunc: false
- localFrameSize: 0
- savePoint: ''
- restorePoint: ''
-fixedStack: []
-stack: []
-callSites: []
-debugValueSubstitutions: []
-constants: []
-machineFunctionInfo: {}
-body: |
-
-...
diff --git a/llvm/test/CodeGen/X86/fake-use-hpfloat.ll b/llvm/test/CodeGen/X86/fake-use-hpfloat.ll
index 9ec53fb9558c55..74ab8f9b76e5c9 100644
--- a/llvm/test/CodeGen/X86/fake-use-hpfloat.ll
+++ b/llvm/test/CodeGen/X86/fake-use-hpfloat.ll
@@ -1,6 +1,6 @@
; assert in DAGlegalizer with fake use of half precision float.
; Changes to half float promotion.
-; RUN: llc -O2 -stop-after=finalize-isel -filetype=asm -o - %s | FileCheck %s
+; RUN: llc -stop-after=finalize-isel -o - %s | FileCheck %s
;
; CHECK: bb.0.entry:
; CHECK-NEXT: %0:fr16 = FsFLD0SH
diff --git a/llvm/test/CodeGen/X86/fake-use-suppress-load.ll b/llvm/test/CodeGen/X86/fake-use-suppress-load.ll
index b49091333b0174..9f18bac0847cc3 100644
--- a/llvm/test/CodeGen/X86/fake-use-suppress-load.ll
+++ b/llvm/test/CodeGen/X86/fake-use-suppress-load.ll
@@ -8,16 +8,9 @@
define dso_local i32 @f(ptr %p) local_unnamed_addr {
entry:
- call void asm sideeffect "", "~{rax},~{rbx},~{rcx},~{rdx},~{rsi},~{rdi},~{rbp},~{r8},~{r9},~{r10},~{r11},~{r12},~{r13},~{r14},~{r15},~{dirflag},~{fpsr},~{flags}"() #1, !srcloc !2
+ call void asm sideeffect "", "~{rax},~{rbx},~{rcx},~{rdx},~{rsi},~{rdi},~{rbp},~{r8},~{r9},~{r10},~{r11},~{r12},~{r13},~{r14},~{r15},~{dirflag},~{fpsr},~{flags}"() #1
notail call void (...) @llvm.fake.use(ptr %p)
ret i32 4
}
-declare void @llvm.fake.use(...) #1
-
-!llvm.module.flags = !{!0}
-!llvm.ident = !{!1}
-
-!0 = !{i32 1, !"wchar_size", i32 4}
-!1 = !{!"clang version 9.0.0"}
-!2 = !{i32 -2147471544}
+declare void @llvm.fake.use(...)
diff --git a/llvm/test/CodeGen/X86/fake-use-tailcall.ll b/llvm/test/CodeGen/X86/fake-use-tailcall.ll
index 4aaea6c5eeb24d..0b91172fb2fd7f 100644
--- a/llvm/test/CodeGen/X86/fake-use-tailcall.ll
+++ b/llvm/test/CodeGen/X86/fake-use-tailcall.ll
@@ -1,8 +1,16 @@
-; RUN: llc < %s -stop-after=finalize-isel -O2 - | FileCheck %s --implicit-check-not FAKE_USE
+; RUN: llc < %s -stop-after=finalize-isel - | FileCheck %s --implicit-check-not FAKE_USE
; Fake uses following tail calls should be pulled in front
; of the TCRETURN instruction. Fake uses using something defined by
; the tail call or after it should be suppressed.
+; CHECK: name:{{ +}}bar
+; CHECK: body:
+; CHECK: bb.0.{{.*}}:
+; CHECK: %0:{{.*}}= COPY
+; CHECK: FAKE_USE %0
+; CHECK: TCRETURN
+
+; CHECK: name:{{ +}}baz
; CHECK: body:
; CHECK: bb.0.{{.*}}:
; CHECK: %0:{{.*}}= COPY
@@ -19,5 +27,12 @@ entry:
ret void
}
+define i32 @baz(i32 %v) {
+entry:
+ %call = tail call i32 @_Z3fooi(i32 %v)
+ notail call void (...) @llvm.fake.use(i32 %v)
+ ret i32 %call
+}
+
declare i32 @_Z3fooi(i32) local_unnamed_addr
declare void @llvm.fake.use(...)
diff --git a/llvm/test/CodeGen/MIR/X86/fake-use-zero-length.ll b/llvm/test/CodeGen/X86/fake-use-zero-length.ll
similarity index 87%
rename from llvm/test/CodeGen/MIR/X86/fake-use-zero-length.ll
rename to llvm/test/CodeGen/X86/fake-use-zero-length.ll
index b23d74814882ab..83bf674e2e6f48 100644
--- a/llvm/test/CodeGen/MIR/X86/fake-use-zero-length.ll
+++ b/llvm/test/CodeGen/X86/fake-use-zero-length.ll
@@ -31,10 +31,3 @@ declare void @baz([1 x i32] %a)
; Function Attrs: nounwind
declare void @llvm.fake.use(...)
-
-!llvm.module.flags = !{!0, !1}
-!llvm.ident = !{!2}
-
-!0 = !{i32 1, !"wchar_size", i32 2}
-!1 = !{i32 7, !"PIC Level", i32 2}
-!2 = !{!"clang version 10.0.0"}
\ No newline at end of file
diff --git a/llvm/test/Transforms/CodeGenPrepare/X86/fake-use-phi.ll b/llvm/test/Transforms/CodeGenPrepare/X86/fake-use-phi.ll
new file mode 100644
index 00000000000000..ede651ee621f9b
--- /dev/null
+++ b/llvm/test/Transforms/CodeGenPrepare/X86/fake-use-phi.ll
@@ -0,0 +1,50 @@
+; RUN: opt < %s -passes='require<profile-summary>,function(codegenprepare)' -S -mtriple=x86_64 | FileCheck %s --implicit-check-not="llvm.fake.use"
+;
+; When performing return duplication to enable
+; tail call optimization we clone fake uses that exist in the to-be-eliminated
+; return block into the predecessor blocks. When doing this with fake uses
+; of PHI-nodes, they cannot be easily copied, but require the correct operand.
+; We are currently not able to do this correctly, so we suppress the cloning
+; of such fake uses at the moment.
+;
+; There should be no fake use of a call result in any of the resulting return
+; blocks.
+
+; Fake uses of `this` should be duplicated into both return blocks.
+; CHECK: if.then:
+; CHECK: @llvm.fake.use({{.*}}this
+; CHECK: if.else:
+; CHECK: @llvm.fake.use({{.*}}this
+
+; CHECK: declare void @llvm.fake.use
+
+source_filename = "test.ll"
+
+%class.a = type { i8 }
+
+declare i32 @foo(ptr nonnull dereferenceable(1)) local_unnamed_addr
+declare i32 @bar(ptr nonnull dereferenceable(1)) local_unnamed_addr
+
+define hidden void @func(ptr nonnull dereferenceable(1) %this) local_unnamed_addr align 2 {
+entry:
+ %b = getelementptr inbounds %class.a, ptr %this, i64 0, i32 0
+ %0 = load i8, i8* %b, align 1
+ %tobool.not = icmp eq i8 %0, 0
+ br i1 %tobool.not, label %if.else, label %if.then
+
+if.then: ; preds = %entry
+ %call = tail call i32 @foo(ptr nonnull dereferenceable(1) %this)
+ %call2 = tail call i32 @bar(ptr nonnull dereferenceable(1) %this)
+ br label %if.end
+
+if.else: ; preds = %entry
+ %call4 = tail call i32 @bar(ptr nonnull dereferenceable(1) %this)
+ %call5 = tail call i32 @foo(ptr nonnull dereferenceable(1) %this)
+ br label %if.end
+
+if.end: ; preds = %if.else, %if.then
+ %call4.sink = phi i32 [ %call4, %if.else ], [ %call, %if.then ]
+ notail call void (...) @llvm.fake.use(i32 %call4.sink)
+ notail call void (...) @llvm.fake.use(ptr nonnull %this)
+ ret void
+}
diff --git a/llvm/test/CodeGen/X86/fake-use-split-ret.ll b/llvm/test/Transforms/CodeGenPrepare/X86/fake-use-split-ret.ll
similarity index 71%
rename from llvm/test/CodeGen/X86/fake-use-split-ret.ll
rename to llvm/test/Transforms/CodeGenPrepare/X86/fake-use-split-ret.ll
index eff11105723142..9c6805ea31c0d4 100644
--- a/llvm/test/CodeGen/X86/fake-use-split-ret.ll
+++ b/llvm/test/Transforms/CodeGenPrepare/X86/fake-use-split-ret.ll
@@ -15,17 +15,10 @@
; return bar(i);
;}
-; ModuleID = 'test.cpp'
-source_filename = "test.cpp"
-target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
-target triple = "x86_64-unknown-unknown"
-
declare i32 @_Z3bari(i32) local_unnamed_addr
-; Function Attrs: nounwind
declare void @llvm.fake.use(...)
-; Function Attrs: nounwind sspstrong uwtable
define i32 @_Z4foo2i(i32 %i) local_unnamed_addr {
entry:
%dec = add nsw i32 %i, -1
@@ -44,10 +37,3 @@ cleanup: ; preds = %entry, %if.end
; CHECK: ret i32 -1
ret i32 %retval.0
}
-
-!llvm.module.flags = !{!0, !1}
-!llvm.ident = !{!2}
-
-!0 = !{i32 1, !"wchar_size", i32 2}
-!1 = !{i32 7, !"PIC Level", i32 2}
-!2 = !{!"clang version 7.0.0"}
diff --git a/llvm/test/CodeGen/X86/fake-use-sroa.ll b/llvm/test/Transforms/SROA/fake-use-sroa.ll
similarity index 100%
rename from llvm/test/CodeGen/X86/fake-use-sroa.ll
rename to llvm/test/Transforms/SROA/fake-use-sroa.ll
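The comments in the relocated fake-use-phi.ll test describe why fake uses of PHI nodes are not cloned during return duplication: a verbatim copy into a predecessor would still reference the PHI, which does not exist (or dominate) there; the correct clone must substitute the PHI's incoming value for that predecessor. A toy sketch of the required remapping, using the value names from the test itself:

```python
# Sketch of the PHI-cloning problem from fake-use-phi.ll: duplicating a fake
# use of %call4.sink into each predecessor requires rewriting the operand to
# that predecessor's incoming value; non-PHI operands (%this) copy verbatim.
phi_incoming = {"if.else": "%call4", "if.then": "%call"}  # from the test IR


def clone_fake_use_operand(operand, pred):
    """Return the operand a cloned fake use should reference in block pred."""
    if operand == "%call4.sink":  # a use of the PHI must be remapped
        return phi_incoming[pred]
    return operand  # values defined before the branch copy unchanged


assert clone_fake_use_operand("%call4.sink", "if.else") == "%call4"
assert clone_fake_use_operand("%call4.sink", "if.then") == "%call"
assert clone_fake_use_operand("%this", "if.then") == "%this"
```

Since CodeGenPrepare does not currently perform this remapping, the patch suppresses cloning of PHI fake uses entirely, which is what the `--implicit-check-not="llvm.fake.use"` RUN line verifies.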
>From a93c657884d28f60ad78f1e03248b7e0da5b0757 Mon Sep 17 00:00:00 2001
From: Stephen Tozer <stephen.tozer at sony.com>
Date: Thu, 6 Jun 2024 19:59:21 +0100
Subject: [PATCH 08/20] Fix scheduler, update comment in AsmPrinter
---
llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp | 7 +-
llvm/test/CodeGen/X86/fake-use-scheduler.mir | 129 +++++++++++++++++++
2 files changed, 132 insertions(+), 4 deletions(-)
create mode 100644 llvm/test/CodeGen/X86/fake-use-scheduler.mir
diff --git a/llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp b/llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
index 6e0bbbc76b1258..268aa3736a38f8 100644
--- a/llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
+++ b/llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
@@ -1105,10 +1105,9 @@ static void emitComments(const MachineInstr &MI, raw_ostream &CommentOS) {
const MachineFunction *MF = MI.getMF();
const TargetInstrInfo *TII = MF->getSubtarget().getInstrInfo();
- // If this is a reload of a spilled register that only feeds into a FAKE_USE
- // instruction, meaning the load value has no effect on the program and has
- // only been kept alive for debugging; since it is still available on the
- // stack, we can skip the load itself.
+ // When we're loading a value that is only going to be used by a FAKE_USE, we
+ // will skip emitting it in AsmPrinter::emitFunctionBody, so we also skip
+ // emitting a comment for it here.
if (isLoadFeedingIntoFakeUse(MI, TII))
return;
diff --git a/llvm/test/CodeGen/X86/fake-use-scheduler.mir b/llvm/test/CodeGen/X86/fake-use-scheduler.mir
new file mode 100644
index 00000000000000..b2871f4b06ccdb
--- /dev/null
+++ b/llvm/test/CodeGen/X86/fake-use-scheduler.mir
@@ -0,0 +1,129 @@
+# Prevent the machine scheduler from moving instructions past FAKE_USE.
+# RUN: llc -run-pass machine-scheduler -debug-only=machine-scheduler 2>&1 -o - %s | FileCheck %s
+# REQUIRES: asserts
+#
+# We make sure that, beginning with the first FAKE_USE instruction,
+# the scheduler makes no changes to the sequence of instructions. We
+# don't bother to check that the relative order of the FAKE_USEs
+# remains the same; it should, but it is irrelevant to this test.
+#
+# CHECK: ********** MI Scheduling **********
+# CHECK-NEXT: foo:%bb.0 entry
+# CHECK-NEXT: From: %0:gr64 = COPY $rdi
+# CHECK-NEXT: To: FAKE_USE %5:gr64
+# CHECK-NEXT: RegionInstrs: 7
+#
+# CHECK: ********** MI Scheduling **********
+# CHECK-NEXT: bar:%bb.0 entry
+# CHECK-NEXT: From: %0:gr64 = COPY $rdi
+# CHECK-NEXT: To: RET 0, killed $rax
+# CHECK-NEXT: RegionInstrs: 7
+#
+--- |
+ ; ModuleID = 'test.ll'
+ source_filename = "test.ll"
+ target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-i128:128-f80:128-n8:16:32:64-S128"
+
+ @glb = common dso_local local_unnamed_addr global [100 x i32] zeroinitializer, align 16
+
+ define dso_local i64 @foo(ptr %p) local_unnamed_addr {
+ entry:
+ %0 = load i32, ptr @glb, align 16
+ store i32 %0, ptr %p, align 4
+ %conv = sext i32 %0 to i64
+ %1 = load i32, ptr getelementptr inbounds ([100 x i32], ptr @glb, i64 0, i64 1), align 4
+ %arrayidx1 = getelementptr inbounds i32, ptr %p, i64 1
+ store i32 %1, ptr %arrayidx1, align 4
+ %conv2 = sext i32 %1 to i64
+ %add3 = add nsw i64 %conv2, %conv
+ notail call void (...) @llvm.fake.use(i64 %add3)
+ notail call void (...) @llvm.fake.use(i32 %1)
+ notail call void (...) @llvm.fake.use(i32 %0)
+ notail call void (...) @llvm.fake.use(ptr %p)
+ ret i64 %add3
+ }
+
+ define dso_local i64 @bar(ptr %p) local_unnamed_addr {
+ entry:
+ %0 = load i32, ptr @glb, align 16
+ store i32 %0, ptr %p, align 4
+ %conv = sext i32 %0 to i64
+ %1 = load i32, ptr getelementptr inbounds ([100 x i32], ptr @glb, i64 0, i64 1), align 4
+ %arrayidx1 = getelementptr inbounds i32, ptr %p, i64 1
+ store i32 %1, ptr %arrayidx1, align 4
+ %conv2 = sext i32 %1 to i64
+ %add3 = add nsw i64 %conv2, %conv
+ ret i64 %add3
+ }
+
+ ; Function Attrs: nounwind
+ declare void @llvm.fake.use(...) #0
+
+ ; Function Attrs: nocallback nofree nosync nounwind willreturn
+ declare void @llvm.stackprotector(ptr, ptr) #1
+
+ attributes #0 = { nounwind }
+ attributes #1 = { nocallback nofree nosync nounwind willreturn }
+
+...
+---
+name: foo
+alignment: 16
+tracksRegLiveness: true
+debugInstrRef: true
+registers:
+ - { id: 0, class: gr64, preferred-register: '' }
+ - { id: 1, class: gr64_with_sub_8bit, preferred-register: '' }
+ - { id: 2, class: gr32, preferred-register: '' }
+ - { id: 3, class: gr64_with_sub_8bit, preferred-register: '' }
+ - { id: 4, class: gr32, preferred-register: '' }
+ - { id: 5, class: gr64, preferred-register: '' }
+liveins:
+ - { reg: '$rdi', virtual-reg: '%0' }
+body: |
+ bb.0.entry:
+ liveins: $rdi
+
+ %0:gr64 = COPY $rdi
+ %1:gr64_with_sub_8bit = MOVSX64rm32 $rip, 1, $noreg, @glb, $noreg
+ MOV32mr %0, 1, $noreg, 0, $noreg, %1.sub_32bit
+ %3:gr64_with_sub_8bit = MOVSX64rm32 $rip, 1, $noreg, @glb + 4, $noreg
+ MOV32mr %0, 1, $noreg, 4, $noreg, %3.sub_32bit
+ %5:gr64 = COPY %3
+ %5:gr64 = nsw ADD64rr %5, %1, implicit-def dead $eflags
+ FAKE_USE %5
+ FAKE_USE %3.sub_32bit
+ FAKE_USE %1.sub_32bit
+ FAKE_USE %0
+ $rax = COPY %5
+ RET 0, killed $rax
+
+...
+---
+name: bar
+alignment: 16
+tracksRegLiveness: true
+debugInstrRef: true
+registers:
+ - { id: 0, class: gr64, preferred-register: '' }
+ - { id: 1, class: gr64_with_sub_8bit, preferred-register: '' }
+ - { id: 2, class: gr32, preferred-register: '' }
+ - { id: 3, class: gr64_with_sub_8bit, preferred-register: '' }
+ - { id: 4, class: gr32, preferred-register: '' }
+ - { id: 5, class: gr64_with_sub_8bit, preferred-register: '' }
+liveins:
+ - { reg: '$rdi', virtual-reg: '%0' }
+body: |
+ bb.0.entry:
+ liveins: $rdi
+
+ %0:gr64 = COPY $rdi
+ %1:gr64_with_sub_8bit = MOVSX64rm32 $rip, 1, $noreg, @glb, $noreg
+ MOV32mr %0, 1, $noreg, 0, $noreg, %1.sub_32bit
+ %5:gr64_with_sub_8bit = MOVSX64rm32 $rip, 1, $noreg, @glb + 4, $noreg
+ MOV32mr %0, 1, $noreg, 4, $noreg, %5.sub_32bit
+ %5:gr64_with_sub_8bit = nsw ADD64rr %5, %1, implicit-def dead $eflags
+ $rax = COPY %5
+ RET 0, killed $rax
+
+...
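The RUN line in the test above relies on lit's `%s` substitution; invoked outside of lit it expands roughly as follows (paths are illustrative, and an assertions-enabled build of `llc` is required for `-debug-only`):

```shell
llc -run-pass machine-scheduler -debug-only=machine-scheduler \
    -o - fake-use-scheduler.mir 2>&1 \
  | FileCheck fake-use-scheduler.mir
```

The `foo`/`bar` pair makes the barrier visible in the scheduler's debug output: `foo`'s first scheduling region ends at the first FAKE_USE, while `bar`, which is otherwise similar but has no fake uses, schedules straight through to the RET.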
>From f830b70f944f0fdbd3e01397e9f2be03f5a6b3cc Mon Sep 17 00:00:00 2001
From: Stephen Tozer <stephen.tozer at sony.com>
Date: Fri, 7 Jun 2024 15:30:33 +0100
Subject: [PATCH 09/20] Remove dangling test, add 'optdebug' to existing fake
use tests
---
.../CodeGen/MIR/X86/fake-use-scheduler.mir | 223 ------------------
.../CodeGen/MIR/X86/fake-use-tailcall.mir | 9 +-
llvm/test/CodeGen/X86/fake-use-hpfloat.ll | 4 +-
llvm/test/CodeGen/X86/fake-use-ld.ll | 10 +-
llvm/test/CodeGen/X86/fake-use-scheduler.mir | 12 +-
.../CodeGen/X86/fake-use-simple-tail-call.ll | 11 +-
.../CodeGen/X86/fake-use-suppress-load.ll | 4 +-
llvm/test/CodeGen/X86/fake-use-tailcall.ll | 5 +-
llvm/test/CodeGen/X86/fake-use-vector.ll | 8 +-
llvm/test/CodeGen/X86/fake-use-vector2.ll | 8 +-
llvm/test/CodeGen/X86/fake-use-zero-length.ll | 5 +-
llvm/test/DebugInfo/X86/fake-use.ll | 11 +-
.../CodeGenPrepare/X86/fake-use-phi.ll | 2 +-
.../CodeGenPrepare/X86/fake-use-split-ret.ll | 4 +-
.../test/Transforms/GVN/fake-use-constprop.ll | 11 +-
llvm/test/Transforms/SROA/fake-use-sroa.ll | 6 +-
16 files changed, 20 insertions(+), 313 deletions(-)
delete mode 100644 llvm/test/CodeGen/MIR/X86/fake-use-scheduler.mir
diff --git a/llvm/test/CodeGen/MIR/X86/fake-use-scheduler.mir b/llvm/test/CodeGen/MIR/X86/fake-use-scheduler.mir
deleted file mode 100644
index 7c134f2c127dd4..00000000000000
--- a/llvm/test/CodeGen/MIR/X86/fake-use-scheduler.mir
+++ /dev/null
@@ -1,223 +0,0 @@
-# Prevent the machine scheduler from moving instructions past FAKE_USE.
-# RUN: llc -run-pass machine-scheduler -debug-only=machine-scheduler 2>&1 -o - %s | FileCheck %s
-# REQUIRES: asserts
-#
-# We make sure that, beginning with the first FAKE_USE instruction,
-# no changes to the sequence of instructions are undertaken by the
-# scheduler. We don't bother to check that the order of the FAKE_USEs
-# remains the same. They should, but it is irrelevant.
-#
-# CHECK: ********** MI Scheduling **********
-# CHECK-NEXT: foo:%bb.0 entry
-# CHECK-NEXT: From: %0:gr64 = COPY $rdi
-# CHECK-NEXT: To: FAKE_USE %5:gr64
-# CHECK-NEXT: RegionInstrs: 7
-#
-# CHECK: ********** MI Scheduling **********
-# CHECK-NEXT: bar:%bb.0 entry
-# CHECK-NEXT: From: %0:gr64 = COPY $rdi
-# CHECK-NEXT: To: RET 0, killed $rax
-# CHECK-NEXT: RegionInstrs: 7
-#
---- |
- ; ModuleID = 'test.ll'
- source_filename = "test.ll"
- target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-i128:128-f80:128-n8:16:32:64-S128"
-
- @glb = common dso_local local_unnamed_addr global [100 x i32] zeroinitializer, align 16
-
- define dso_local i64 @foo(ptr %p) local_unnamed_addr {
- entry:
- %0 = load i32, ptr @glb, align 16, !tbaa !2
- store i32 %0, ptr %p, align 4, !tbaa !2
- %conv = sext i32 %0 to i64
- %1 = load i32, ptr getelementptr inbounds ([100 x i32], ptr @glb, i64 0, i64 1), align 4, !tbaa !2
- %arrayidx1 = getelementptr inbounds i32, ptr %p, i64 1
- store i32 %1, ptr %arrayidx1, align 4, !tbaa !2
- %conv2 = sext i32 %1 to i64
- %add3 = add nsw i64 %conv2, %conv
- notail call void (...) @llvm.fake.use(i64 %add3)
- notail call void (...) @llvm.fake.use(i32 %1)
- notail call void (...) @llvm.fake.use(i32 %0)
- notail call void (...) @llvm.fake.use(ptr %p)
- ret i64 %add3
- }
-
- define dso_local i64 @bar(ptr %p) local_unnamed_addr {
- entry:
- %0 = load i32, ptr @glb, align 16, !tbaa !2
- store i32 %0, ptr %p, align 4, !tbaa !2
- %conv = sext i32 %0 to i64
- %1 = load i32, ptr getelementptr inbounds ([100 x i32], ptr @glb, i64 0, i64 1), align 4, !tbaa !2
- %arrayidx1 = getelementptr inbounds i32, ptr %p, i64 1
- store i32 %1, ptr %arrayidx1, align 4, !tbaa !2
- %conv2 = sext i32 %1 to i64
- %add3 = add nsw i64 %conv2, %conv
- ret i64 %add3
- }
-
- ; Function Attrs: nounwind
- declare void @llvm.fake.use(...) #0
-
- ; Function Attrs: nocallback nofree nosync nounwind willreturn
- declare void @llvm.stackprotector(ptr, ptr) #1
-
- attributes #0 = { nounwind }
- attributes #1 = { nocallback nofree nosync nounwind willreturn }
-
- !llvm.module.flags = !{!0}
- !llvm.ident = !{!1}
-
- !0 = !{i32 1, !"wchar_size", i32 4}
- !1 = !{!"clang version 9.0.0"}
- !2 = !{!3, !3, i64 0}
- !3 = !{!"int", !4, i64 0}
- !4 = !{!"omnipotent char", !5, i64 0}
- !5 = !{!"Simple C/C++ TBAA"}
-
-...
----
-name: foo
-alignment: 16
-exposesReturnsTwice: false
-legalized: false
-regBankSelected: false
-selected: false
-failedISel: false
-tracksRegLiveness: true
-hasWinCFI: false
-callsEHReturn: false
-callsUnwindInit: false
-hasEHCatchret: false
-hasEHScopes: false
-hasEHFunclets: false
-isOutlined: false
-debugInstrRef: true
-failsVerification: false
-tracksDebugUserValues: false
-registers:
- - { id: 0, class: gr64, preferred-register: '' }
- - { id: 1, class: gr64_with_sub_8bit, preferred-register: '' }
- - { id: 2, class: gr32, preferred-register: '' }
- - { id: 3, class: gr64_with_sub_8bit, preferred-register: '' }
- - { id: 4, class: gr32, preferred-register: '' }
- - { id: 5, class: gr64, preferred-register: '' }
-liveins:
- - { reg: '$rdi', virtual-reg: '%0' }
-frameInfo:
- isFrameAddressTaken: false
- isReturnAddressTaken: false
- hasStackMap: false
- hasPatchPoint: false
- stackSize: 0
- offsetAdjustment: 0
- maxAlignment: 1
- adjustsStack: false
- hasCalls: false
- stackProtector: ''
- functionContext: ''
- maxCallFrameSize: 4294967295
- cvBytesOfCalleeSavedRegisters: 0
- hasOpaqueSPAdjustment: false
- hasVAStart: false
- hasMustTailInVarArgFunc: false
- hasTailCall: false
- localFrameSize: 0
- savePoint: ''
- restorePoint: ''
-fixedStack: []
-stack: []
-entry_values: []
-callSites: []
-debugValueSubstitutions: []
-constants: []
-machineFunctionInfo: {}
-body: |
- bb.0.entry:
- liveins: $rdi
-
- %0:gr64 = COPY $rdi
- %1:gr64_with_sub_8bit = MOVSX64rm32 $rip, 1, $noreg, @glb, $noreg :: (dereferenceable load (s32) from @glb, align 16, !tbaa !2)
- MOV32mr %0, 1, $noreg, 0, $noreg, %1.sub_32bit :: (store (s32) into %ir.p, !tbaa !2)
- %3:gr64_with_sub_8bit = MOVSX64rm32 $rip, 1, $noreg, @glb + 4, $noreg :: (dereferenceable load (s32) from `ptr getelementptr inbounds ([100 x i32], ptr @glb, i64 0, i64 1)`, !tbaa !2)
- MOV32mr %0, 1, $noreg, 4, $noreg, %3.sub_32bit :: (store (s32) into %ir.arrayidx1, !tbaa !2)
- %5:gr64 = COPY %3
- %5:gr64 = nsw ADD64rr %5, %1, implicit-def dead $eflags
- FAKE_USE %5
- FAKE_USE %3.sub_32bit
- FAKE_USE %1.sub_32bit
- FAKE_USE %0
- $rax = COPY %5
- RET 0, killed $rax
-
-...
----
-name: bar
-alignment: 16
-exposesReturnsTwice: false
-legalized: false
-regBankSelected: false
-selected: false
-failedISel: false
-tracksRegLiveness: true
-hasWinCFI: false
-callsEHReturn: false
-callsUnwindInit: false
-hasEHCatchret: false
-hasEHScopes: false
-hasEHFunclets: false
-isOutlined: false
-debugInstrRef: true
-failsVerification: false
-tracksDebugUserValues: false
-registers:
- - { id: 0, class: gr64, preferred-register: '' }
- - { id: 1, class: gr64_with_sub_8bit, preferred-register: '' }
- - { id: 2, class: gr32, preferred-register: '' }
- - { id: 3, class: gr64_with_sub_8bit, preferred-register: '' }
- - { id: 4, class: gr32, preferred-register: '' }
- - { id: 5, class: gr64_with_sub_8bit, preferred-register: '' }
-liveins:
- - { reg: '$rdi', virtual-reg: '%0' }
-frameInfo:
- isFrameAddressTaken: false
- isReturnAddressTaken: false
- hasStackMap: false
- hasPatchPoint: false
- stackSize: 0
- offsetAdjustment: 0
- maxAlignment: 1
- adjustsStack: false
- hasCalls: false
- stackProtector: ''
- functionContext: ''
- maxCallFrameSize: 4294967295
- cvBytesOfCalleeSavedRegisters: 0
- hasOpaqueSPAdjustment: false
- hasVAStart: false
- hasMustTailInVarArgFunc: false
- hasTailCall: false
- localFrameSize: 0
- savePoint: ''
- restorePoint: ''
-fixedStack: []
-stack: []
-entry_values: []
-callSites: []
-debugValueSubstitutions: []
-constants: []
-machineFunctionInfo: {}
-body: |
- bb.0.entry:
- liveins: $rdi
-
- %0:gr64 = COPY $rdi
- %1:gr64_with_sub_8bit = MOVSX64rm32 $rip, 1, $noreg, @glb, $noreg :: (dereferenceable load (s32) from @glb, align 16, !tbaa !2)
- MOV32mr %0, 1, $noreg, 0, $noreg, %1.sub_32bit :: (store (s32) into %ir.p, !tbaa !2)
- %5:gr64_with_sub_8bit = MOVSX64rm32 $rip, 1, $noreg, @glb + 4, $noreg :: (dereferenceable load (s32) from `ptr getelementptr inbounds ([100 x i32], ptr @glb, i64 0, i64 1)`, !tbaa !2)
- MOV32mr %0, 1, $noreg, 4, $noreg, %5.sub_32bit :: (store (s32) into %ir.arrayidx1, !tbaa !2)
- %5:gr64_with_sub_8bit = nsw ADD64rr %5, %1, implicit-def dead $eflags
- $rax = COPY %5
- RET 0, killed $rax
-
-...
diff --git a/llvm/test/CodeGen/MIR/X86/fake-use-tailcall.mir b/llvm/test/CodeGen/MIR/X86/fake-use-tailcall.mir
index 89d3854ac95f88..7eb8915f26a80f 100644
--- a/llvm/test/CodeGen/MIR/X86/fake-use-tailcall.mir
+++ b/llvm/test/CodeGen/MIR/X86/fake-use-tailcall.mir
@@ -35,7 +35,7 @@
# CHECK-NEXT: ret
--- |
- define hidden i32 @foo(i32 %i) local_unnamed_addr {
+ define hidden i32 @foo(i32 %i) local_unnamed_addr optdebug {
entry:
%cmp = icmp eq i32 %i, 0
br i1 %cmp, label %if.then, label %if.else
@@ -56,13 +56,6 @@
}
declare i32 @f0(...) local_unnamed_addr
declare i32 @f1(...) local_unnamed_addr
- declare void @llvm.fake.use(...)
-
- !llvm.module.flags = !{!0}
- !llvm.ident = !{!1}
-
- !0 = !{i32 1, !"wchar_size", i32 2}
- !1 = !{!"clang version 10.0.0"}
...
---
diff --git a/llvm/test/CodeGen/X86/fake-use-hpfloat.ll b/llvm/test/CodeGen/X86/fake-use-hpfloat.ll
index 74ab8f9b76e5c9..7a95c38801837c 100644
--- a/llvm/test/CodeGen/X86/fake-use-hpfloat.ll
+++ b/llvm/test/CodeGen/X86/fake-use-hpfloat.ll
@@ -8,10 +8,8 @@
;
target triple = "x86_64-unknown-unknown"
-define void @_Z6doTestv() local_unnamed_addr {
+define void @_Z6doTestv() local_unnamed_addr optdebug {
entry:
tail call void (...) @llvm.fake.use(half 0xH0000)
ret void
}
-
-declare void @llvm.fake.use(...)
diff --git a/llvm/test/CodeGen/X86/fake-use-ld.ll b/llvm/test/CodeGen/X86/fake-use-ld.ll
index 90ecff6dd59680..86e7235091dd1c 100644
--- a/llvm/test/CodeGen/X86/fake-use-ld.ll
+++ b/llvm/test/CodeGen/X86/fake-use-ld.ll
@@ -10,7 +10,7 @@
; }
; /*******************************************************************/
-define x86_fp80 @actual(x86_fp80 %p1, x86_fp80 %p2, x86_fp80 %p3) {
+define x86_fp80 @actual(x86_fp80 %p1, x86_fp80 %p2, x86_fp80 %p3) optdebug {
;
; CHECK: actual
;
@@ -41,11 +41,3 @@ entry:
}
declare x86_fp80 @foo(x86_fp80, x86_fp80, x86_fp80)
-
-declare void @llvm.fake.use(...)
-
-!llvm.module.flags = !{!0}
-!llvm.ident = !{!1}
-
-!0 = !{i32 1, !"PIC Level", i32 2}
-!1 = !{!"clang version 3.9.0"}
diff --git a/llvm/test/CodeGen/X86/fake-use-scheduler.mir b/llvm/test/CodeGen/X86/fake-use-scheduler.mir
index b2871f4b06ccdb..7e55f1d79aa7b6 100644
--- a/llvm/test/CodeGen/X86/fake-use-scheduler.mir
+++ b/llvm/test/CodeGen/X86/fake-use-scheduler.mir
@@ -26,7 +26,7 @@
@glb = common dso_local local_unnamed_addr global [100 x i32] zeroinitializer, align 16
- define dso_local i64 @foo(ptr %p) local_unnamed_addr {
+ define dso_local i64 @foo(ptr %p) local_unnamed_addr optdebug {
entry:
%0 = load i32, ptr @glb, align 16
store i32 %0, ptr %p, align 4
@@ -43,7 +43,7 @@
ret i64 %add3
}
- define dso_local i64 @bar(ptr %p) local_unnamed_addr {
+ define dso_local i64 @bar(ptr %p) local_unnamed_addr optdebug {
entry:
%0 = load i32, ptr @glb, align 16
store i32 %0, ptr %p, align 4
@@ -56,14 +56,8 @@
ret i64 %add3
}
- ; Function Attrs: nounwind
- declare void @llvm.fake.use(...) #0
-
; Function Attrs: nocallback nofree nosync nounwind willreturn
- declare void @llvm.stackprotector(ptr, ptr) #1
-
- attributes #0 = { nounwind }
- attributes #1 = { nocallback nofree nosync nounwind willreturn }
+ declare void @llvm.stackprotector(ptr, ptr)
...
---
diff --git a/llvm/test/CodeGen/X86/fake-use-simple-tail-call.ll b/llvm/test/CodeGen/X86/fake-use-simple-tail-call.ll
index 06b4aae48447cc..45a210ef391009 100644
--- a/llvm/test/CodeGen/X86/fake-use-simple-tail-call.ll
+++ b/llvm/test/CodeGen/X86/fake-use-simple-tail-call.ll
@@ -14,7 +14,7 @@
; ModuleID = 'test.cpp'
source_filename = "test.cpp"
-define i32 @_Z4foo1i(i32 %i) local_unnamed_addr {
+define i32 @_Z4foo1i(i32 %i) local_unnamed_addr optdebug {
entry:
%call = tail call i32 @_Z3bari(i32 %i)
tail call void (...) @llvm.fake.use(i32 %i)
@@ -22,12 +22,3 @@ entry:
}
declare i32 @_Z3bari(i32) local_unnamed_addr
-
-declare void @llvm.fake.use(...)
-
-!llvm.module.flags = !{!0, !1}
-!llvm.ident = !{!2}
-
-!0 = !{i32 1, !"wchar_size", i32 2}
-!1 = !{i32 7, !"PIC Level", i32 2}
-!2 = !{!"clang version 5.0.1"}
diff --git a/llvm/test/CodeGen/X86/fake-use-suppress-load.ll b/llvm/test/CodeGen/X86/fake-use-suppress-load.ll
index 9f18bac0847cc3..c1b442ebd79ffa 100644
--- a/llvm/test/CodeGen/X86/fake-use-suppress-load.ll
+++ b/llvm/test/CodeGen/X86/fake-use-suppress-load.ll
@@ -6,11 +6,9 @@
; CHECK: movq %r{{[a-z]+,}} -{{[0-9]+\(%rsp\)}}
; CHECK-NOT: movq -{{[0-9]+\(%rsp\)}}, %r{{[a-z]+}}
-define dso_local i32 @f(ptr %p) local_unnamed_addr {
+define dso_local i32 @f(ptr %p) local_unnamed_addr optdebug {
entry:
call void asm sideeffect "", "~{rax},~{rbx},~{rcx},~{rdx},~{rsi},~{rdi},~{rbp},~{r8},~{r9},~{r10},~{r11},~{r12},~{r13},~{r14},~{r15},~{dirflag},~{fpsr},~{flags}"() #1
notail call void (...) @llvm.fake.use(ptr %p)
ret i32 4
}
-
-declare void @llvm.fake.use(...)
diff --git a/llvm/test/CodeGen/X86/fake-use-tailcall.ll b/llvm/test/CodeGen/X86/fake-use-tailcall.ll
index 0b91172fb2fd7f..10bb22e1b564ab 100644
--- a/llvm/test/CodeGen/X86/fake-use-tailcall.ll
+++ b/llvm/test/CodeGen/X86/fake-use-tailcall.ll
@@ -17,7 +17,7 @@
; CHECK: FAKE_USE %0
; CHECK: TCRETURN
-define void @bar(i32 %v) {
+define void @bar(i32 %v) optdebug {
entry:
%call = tail call i32 @_Z3fooi(i32 %v)
%mul = mul nsw i32 %call, 3
@@ -27,7 +27,7 @@ entry:
ret void
}
-define i32 @baz(i32 %v) {
+define i32 @baz(i32 %v) optdebug {
entry:
%call = tail call i32 @_Z3fooi(i32 %v)
notail call void (...) @llvm.fake.use(i32 %v)
@@ -35,4 +35,3 @@ entry:
}
declare i32 @_Z3fooi(i32) local_unnamed_addr
-declare void @llvm.fake.use(...)
diff --git a/llvm/test/CodeGen/X86/fake-use-vector.ll b/llvm/test/CodeGen/X86/fake-use-vector.ll
index 699f3607da0af6..f320ee5e6d766b 100644
--- a/llvm/test/CodeGen/X86/fake-use-vector.ll
+++ b/llvm/test/CodeGen/X86/fake-use-vector.ll
@@ -36,10 +36,4 @@ declare <4 x float> @llvm.x86.sse.cvtpi2ps(<4 x float>, x86_mmx)
; Function Attrs: nounwind
declare void @llvm.fake.use(...)
-attributes #0 = { "target-cpu"="btver2" }
-
-!llvm.module.flags = !{!0}
-!llvm.ident = !{!1}
-
-!0 = !{i32 1, !"PIC Level", i32 2}
-!1 = !{!"clang version 5.0.0"}
+attributes #0 = { "target-cpu"="btver2" optdebug }
diff --git a/llvm/test/CodeGen/X86/fake-use-vector2.ll b/llvm/test/CodeGen/X86/fake-use-vector2.ll
index 08d50504a81b8d..6f2d3a5566dc67 100644
--- a/llvm/test/CodeGen/X86/fake-use-vector2.ll
+++ b/llvm/test/CodeGen/X86/fake-use-vector2.ll
@@ -24,10 +24,4 @@ entry:
declare void @llvm.fake.use(...)
-attributes #0 = { "target-cpu"="btver2" }
-
-!llvm.module.flags = !{!0}
-!llvm.ident = !{!1}
-
-!0 = !{i32 1, !"PIC Level", i32 2}
-!1 = !{!"clang version 5.0.0"}
+attributes #0 = { "target-cpu"="btver2" optdebug }
diff --git a/llvm/test/CodeGen/X86/fake-use-zero-length.ll b/llvm/test/CodeGen/X86/fake-use-zero-length.ll
index 83bf674e2e6f48..e8c6791b8edff2 100644
--- a/llvm/test/CodeGen/X86/fake-use-zero-length.ll
+++ b/llvm/test/CodeGen/X86/fake-use-zero-length.ll
@@ -17,7 +17,7 @@
source_filename = "test.ll"
target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
-define hidden i32 @main([0 x i32] %zero, [1 x i32] %one) local_unnamed_addr {
+define hidden i32 @main([0 x i32] %zero, [1 x i32] %one) local_unnamed_addr optdebug {
entry:
notail call void (...) @bar([0 x i32] %zero)
notail call void (...) @baz([1 x i32] %one)
@@ -28,6 +28,3 @@ entry:
declare void @bar([0 x i32] %a)
declare void @baz([1 x i32] %a)
-
-; Function Attrs: nounwind
-declare void @llvm.fake.use(...)
diff --git a/llvm/test/DebugInfo/X86/fake-use.ll b/llvm/test/DebugInfo/X86/fake-use.ll
index dab7c92d428049..9dbe2a4357f110 100644
--- a/llvm/test/DebugInfo/X86/fake-use.ll
+++ b/llvm/test/DebugInfo/X86/fake-use.ll
@@ -39,7 +39,7 @@ source_filename = "t2.c"
@glob = common local_unnamed_addr global [10 x i32] zeroinitializer, align 16, !dbg !0
; Function Attrs: nounwind sspstrong uwtable
-define i32 @foo(i32 %b, i32 %i) local_unnamed_addr !dbg !13 {
+define i32 @foo(i32 %b, i32 %i) local_unnamed_addr optdebug !dbg !13 {
entry:
tail call void @llvm.dbg.value(metadata i32 %b, i64 0, metadata !17, metadata !20), !dbg !21
%c = add i32 %b, 42
@@ -64,12 +64,6 @@ if.end: ; preds = %entry, %if.then
declare void @bar(...) local_unnamed_addr
-; Function Attrs: nounwind
-declare void @llvm.fake.use(...)
-
-; Function Attrs: nounwind readnone
-declare void @llvm.dbg.value(metadata, i64, metadata, metadata)
-
!llvm.dbg.cu = !{!1}
!llvm.module.flags = !{!9, !10, !11}
!llvm.ident = !{!12}
@@ -90,9 +84,8 @@ declare void @llvm.dbg.value(metadata, i64, metadata, metadata)
!13 = distinct !DISubprogram(name: "foo", scope: !2, file: !2, line: 4, type: !14, isLocal: false, isDefinition: true, scopeLine: 5, flags: DIFlagPrototyped, isOptimized: true, unit: !1, retainedNodes: !16)
!14 = !DISubroutineType(types: !15)
!15 = !{!6, !6, !6}
-!16 = !{!17, !18, !19}
+!16 = !{!17, !19}
!17 = !DILocalVariable(name: "b", arg: 1, scope: !13, file: !2, line: 4, type: !6)
-!18 = !DILocalVariable(name: "i", arg: 2, scope: !13, file: !2, line: 4, type: !6)
!19 = !DILocalVariable(name: "loc", scope: !13, file: !2, line: 6, type: !6)
!20 = !DIExpression()
!21 = !DILocation(line: 4, scope: !13)
diff --git a/llvm/test/Transforms/CodeGenPrepare/X86/fake-use-phi.ll b/llvm/test/Transforms/CodeGenPrepare/X86/fake-use-phi.ll
index ede651ee621f9b..064d3f29dd9ebb 100644
--- a/llvm/test/Transforms/CodeGenPrepare/X86/fake-use-phi.ll
+++ b/llvm/test/Transforms/CodeGenPrepare/X86/fake-use-phi.ll
@@ -25,7 +25,7 @@ source_filename = "test.ll"
declare i32 @foo(ptr nonnull dereferenceable(1)) local_unnamed_addr
declare i32 @bar(ptr nonnull dereferenceable(1)) local_unnamed_addr
-define hidden void @func(ptr nonnull dereferenceable(1) %this) local_unnamed_addr align 2 {
+define hidden void @func(ptr nonnull dereferenceable(1) %this) local_unnamed_addr align 2 optdebug {
entry:
%b = getelementptr inbounds %class.a, ptr %this, i64 0, i32 0
%0 = load i8, i8* %b, align 1
diff --git a/llvm/test/Transforms/CodeGenPrepare/X86/fake-use-split-ret.ll b/llvm/test/Transforms/CodeGenPrepare/X86/fake-use-split-ret.ll
index 9c6805ea31c0d4..b2cf89f6f2dd82 100644
--- a/llvm/test/Transforms/CodeGenPrepare/X86/fake-use-split-ret.ll
+++ b/llvm/test/Transforms/CodeGenPrepare/X86/fake-use-split-ret.ll
@@ -17,9 +17,7 @@
declare i32 @_Z3bari(i32) local_unnamed_addr
-declare void @llvm.fake.use(...)
-
-define i32 @_Z4foo2i(i32 %i) local_unnamed_addr {
+define i32 @_Z4foo2i(i32 %i) local_unnamed_addr optdebug {
entry:
%dec = add nsw i32 %i, -1
%cmp = icmp slt i32 %i, 2
diff --git a/llvm/test/Transforms/GVN/fake-use-constprop.ll b/llvm/test/Transforms/GVN/fake-use-constprop.ll
index 11ed310083db17..1466f9f9fca277 100644
--- a/llvm/test/Transforms/GVN/fake-use-constprop.ll
+++ b/llvm/test/Transforms/GVN/fake-use-constprop.ll
@@ -37,7 +37,7 @@
; CHECK: call {{.+}} @bees(i8 0)
; CHECK: call {{.+}} @llvm.fake.use(i8 %[[CONV_VAR]])
-define i32 @foo(float %f) {
+define i32 @foo(float %f) optdebug {
%conv = fptosi float %f to i8
%tobool3 = icmp eq i8 %conv, 0
br i1 %tobool3, label %if.end, label %lab
@@ -58,12 +58,3 @@ declare void @baz(i32)
declare void @bees(i32)
declare void @func1(...)
-
-; Function Attrs: nounwind
-declare void @llvm.fake.use(...)
-
-!llvm.module.flags = !{!0}
-!llvm.ident = !{!1}
-
-!0 = !{i32 1, !"PIC Level", i32 2}
-!1 = !{!"clang version 3.9.0"}
diff --git a/llvm/test/Transforms/SROA/fake-use-sroa.ll b/llvm/test/Transforms/SROA/fake-use-sroa.ll
index 883a48fdd66954..9e92df15487506 100644
--- a/llvm/test/Transforms/SROA/fake-use-sroa.ll
+++ b/llvm/test/Transforms/SROA/fake-use-sroa.ll
@@ -25,7 +25,7 @@
; CHECK: %[[SLICE2:[^ ]+]] = trunc i64
; CHECK-DAG: call{{.*}} @llvm.fake.use(i32 %[[SLICE1]])
; CHECK-DAG: call{{.*}} @llvm.fake.use(i32 %[[SLICE2]])
-define dso_local void @foo(i64 %S.coerce) {
+define dso_local void @foo(i64 %S.coerce) optdebug {
entry:
%S = alloca %struct.s, align 4
store i64 %S.coerce, ptr %S, align 4
@@ -34,15 +34,13 @@ entry:
ret void
}
-declare void @llvm.fake.use(...)
-
; A local variable with a small array type.
; CHECK-LABEL: define{{.*}}bar
; CHECK: %[[ARRAYSLICE1:[^ ]+]] = load
; CHECK: %[[ARRAYSLICE2:[^ ]+]] = load
; CHECK-DAG: call{{.*}} @llvm.fake.use(i32 %[[ARRAYSLICE1]])
; CHECK-DAG: call{{.*}} @llvm.fake.use(i32 %[[ARRAYSLICE2]])
-define dso_local void @bar() {
+define dso_local void @bar() optdebug {
entry:
%arr = alloca [2 x i32], align 4
call void @llvm.memcpy.p0i8.p0i8.i64(ptr align 4 %arr, ptr align 4 bitcast (ptr @__const.bar.arr to ptr), i64 8, i1 false)
>From 0bce1f8d25d662d01c498bafc04854059783ecc9 Mon Sep 17 00:00:00 2001
From: Stephen Tozer <stephen.tozer at sony.com>
Date: Fri, 7 Jun 2024 15:36:09 +0100
Subject: [PATCH 10/20] Add pass to remove loads that only feed into FAKE_USEs
---
llvm/include/llvm/CodeGen/Passes.h | 3 +
llvm/include/llvm/InitializePasses.h | 1 +
.../llvm/Passes/MachinePassRegistry.def | 1 +
llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp | 56 ++----
llvm/lib/CodeGen/CMakeLists.txt | 1 +
llvm/lib/CodeGen/CodeGen.cpp | 1 +
llvm/lib/CodeGen/RemoveLoadsIntoFakeUses.cpp | 170 ++++++++++++++++++
llvm/lib/CodeGen/TargetPassConfig.cpp | 2 +
llvm/test/CodeGen/AArch64/O0-pipeline.ll | 1 +
llvm/test/CodeGen/AArch64/O3-pipeline.ll | 1 +
llvm/test/CodeGen/AMDGPU/llc-pipeline.ll | 5 +
llvm/test/CodeGen/ARM/O3-pipeline.ll | 1 +
llvm/test/CodeGen/LoongArch/O0-pipeline.ll | 1 +
llvm/test/CodeGen/LoongArch/opt-pipeline.ll | 1 +
llvm/test/CodeGen/PowerPC/O0-pipeline.ll | 1 +
llvm/test/CodeGen/PowerPC/O3-pipeline.ll | 1 +
llvm/test/CodeGen/RISCV/O0-pipeline.ll | 1 +
llvm/test/CodeGen/RISCV/O3-pipeline.ll | 1 +
llvm/test/CodeGen/X86/O0-pipeline.ll | 1 +
llvm/test/CodeGen/X86/opt-pipeline.ll | 1 +
.../gn/secondary/llvm/lib/CodeGen/BUILD.gn | 1 +
21 files changed, 213 insertions(+), 39 deletions(-)
create mode 100644 llvm/lib/CodeGen/RemoveLoadsIntoFakeUses.cpp
diff --git a/llvm/include/llvm/CodeGen/Passes.h b/llvm/include/llvm/CodeGen/Passes.h
index c7c2178571215b..dbdd110b0600e5 100644
--- a/llvm/include/llvm/CodeGen/Passes.h
+++ b/llvm/include/llvm/CodeGen/Passes.h
@@ -440,6 +440,9 @@ namespace llvm {
// metadata after llvm SanitizerBinaryMetadata pass.
extern char &MachineSanitizerBinaryMetadataID;
+ /// RemoveLoadsIntoFakeUses pass.
+ extern char &RemoveLoadsIntoFakeUsesID;
+
/// RemoveRedundantDebugValues pass.
extern char &RemoveRedundantDebugValuesID;
diff --git a/llvm/include/llvm/InitializePasses.h b/llvm/include/llvm/InitializePasses.h
index cc5e93c58f564a..47a1ca15fc0d1f 100644
--- a/llvm/include/llvm/InitializePasses.h
+++ b/llvm/include/llvm/InitializePasses.h
@@ -264,6 +264,7 @@ void initializeRegionOnlyViewerPass(PassRegistry &);
void initializeRegionPrinterPass(PassRegistry &);
void initializeRegionViewerPass(PassRegistry &);
void initializeRegisterCoalescerPass(PassRegistry &);
+void initializeRemoveLoadsIntoFakeUsesPass(PassRegistry &);
void initializeRemoveRedundantDebugValuesPass(PassRegistry &);
void initializeRenameIndependentSubregsPass(PassRegistry &);
void initializeReplaceWithVeclibLegacyPass(PassRegistry &);
diff --git a/llvm/include/llvm/Passes/MachinePassRegistry.def b/llvm/include/llvm/Passes/MachinePassRegistry.def
index 05baf514fa7210..b710b1c46f643f 100644
--- a/llvm/include/llvm/Passes/MachinePassRegistry.def
+++ b/llvm/include/llvm/Passes/MachinePassRegistry.def
@@ -250,6 +250,7 @@ DUMMY_MACHINE_FUNCTION_PASS("reg-usage-propagation", RegUsageInfoPropagationPass
DUMMY_MACHINE_FUNCTION_PASS("regalloc", RegAllocPass)
DUMMY_MACHINE_FUNCTION_PASS("regallocscoringpass", RegAllocScoringPass)
DUMMY_MACHINE_FUNCTION_PASS("regbankselect", RegBankSelectPass)
+DUMMY_MACHINE_FUNCTION_PASS("remove-loads-into-fake-uses", RemoveLoadsIntoFakeUsesPass)
DUMMY_MACHINE_FUNCTION_PASS("removeredundantdebugvalues", RemoveRedundantDebugValuesPass)
DUMMY_MACHINE_FUNCTION_PASS("rename-independent-subregs", RenameIndependentSubregsPass)
DUMMY_MACHINE_FUNCTION_PASS("reset-machine-function", ResetMachineFunctionPass)
diff --git a/llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp b/llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
index 268aa3736a38f8..67f801c06f997b 100644
--- a/llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
+++ b/llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
@@ -1073,44 +1073,11 @@ void AsmPrinter::emitFunctionEntryLabel() {
}
}
-// Recognize cases where a spilled register is reloaded solely to feed into a
-// FAKE_USE.
-static bool isLoadFeedingIntoFakeUse(const MachineInstr &MI, const TargetInstrInfo *TII) {
- const MachineFunction *MF = MI.getMF();
-
- // If the restore size is std::nullopt then we are not dealing with a reload
- // of a spilled register.
- if (!MI.getRestoreSize(TII))
- return false;
-
- // Check if this register is the operand of a FAKE_USE and
- // does it have the kill flag set there.
- auto NextI = std::next(MI.getIterator());
- if (NextI == MI.getParent()->end() || !NextI->isFakeUse())
- return false;
-
- unsigned Reg = MI.getOperand(0).getReg();
- for (const MachineOperand &MO : NextI->operands()) {
- // Return true if we came across the register from the
- // previous spill instruction that is killed in NextI.
- if (MO.isReg() && MO.isUse() && MO.isKill() && MO.getReg() == Reg)
- return true;
- }
-
- return false;
-}
-
/// emitComments - Pretty-print comments for instructions.
static void emitComments(const MachineInstr &MI, raw_ostream &CommentOS) {
const MachineFunction *MF = MI.getMF();
const TargetInstrInfo *TII = MF->getSubtarget().getInstrInfo();
- // When we're loading a value that is only going to be used by a FAKE_USE, we
- // will skip emitting it in AsmPrinter::emitFunctionBody, so we also skip
- // emitting a comment for it here.
- if (isLoadFeedingIntoFakeUse(MI, TII))
- return;
-
// Check for spills and reloads
// We assume a single instruction only has a spill or reload, not
@@ -1164,6 +1131,21 @@ static void emitKill(const MachineInstr *MI, AsmPrinter &AP) {
AP.OutStreamer->addBlankLine();
}
+static void emitFakeUse(const MachineInstr *MI, AsmPrinter &AP) {
+ std::string Str;
+ raw_string_ostream OS(Str);
+ OS << "fake_use:";
+ for (const MachineOperand &Op : MI->operands()) {
+ // In some circumstances we can end up with fake uses of constants; skip
+ // these.
+ if (!Op.isReg())
+ continue;
+ OS << ' ' << printReg(Op.getReg(), AP.MF->getSubtarget().getRegisterInfo());
+ }
+ AP.OutStreamer->AddComment(OS.str());
+ AP.OutStreamer->addBlankLine();
+}
+
/// emitDebugValueComment - This method handles the target-independent form
/// of DBG_VALUE, returning true if it was able to do so. A false return
/// means the target will need to handle MI in EmitInstruction.
@@ -1834,6 +1816,8 @@ void AsmPrinter::emitFunctionBody() {
if (isVerbose()) emitKill(&MI, *this);
break;
case TargetOpcode::FAKE_USE:
+ if (isVerbose())
+ emitFakeUse(&MI, *this);
break;
case TargetOpcode::PSEUDO_PROBE:
emitPseudoProbe(MI);
@@ -1850,12 +1834,6 @@ void AsmPrinter::emitFunctionBody() {
// purely meta information.
break;
default:
- // If this is a reload of a spilled register that only feeds into a
- // FAKE_USE instruction, meaning the load value has no effect on the
- // program and has only been kept alive for debugging; since it is
- // still available on the stack, we can skip the load itself.
- if (isLoadFeedingIntoFakeUse(MI, TII))
- break;
emitInstruction(&MI);
if (CanDoExtraAnalysis) {
MCInst MCI;
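For reviewers, here is what the emitFakeUse hook above produces in verbose-asm output: a comment listing the registers consumed by the FAKE_USE, followed by a blank line. The register names below are purely illustrative, not taken from a test in this patch:

```
        # fake_use: $rdi $rsi
```

Constant operands are skipped, so a FAKE_USE with only immediate operands emits a bare "fake_use:" comment.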
diff --git a/llvm/lib/CodeGen/CMakeLists.txt b/llvm/lib/CodeGen/CMakeLists.txt
index f1607f85c5b319..ae12ce1170f703 100644
--- a/llvm/lib/CodeGen/CMakeLists.txt
+++ b/llvm/lib/CodeGen/CMakeLists.txt
@@ -200,6 +200,7 @@ add_llvm_component_library(LLVMCodeGen
RegisterUsageInfo.cpp
RegUsageInfoCollector.cpp
RegUsageInfoPropagate.cpp
+ RemoveLoadsIntoFakeUses.cpp
ReplaceWithVeclib.cpp
ResetMachineFunctionPass.cpp
RegisterBank.cpp
diff --git a/llvm/lib/CodeGen/CodeGen.cpp b/llvm/lib/CodeGen/CodeGen.cpp
index 31fa4c105cef80..177702054a0e31 100644
--- a/llvm/lib/CodeGen/CodeGen.cpp
+++ b/llvm/lib/CodeGen/CodeGen.cpp
@@ -116,6 +116,7 @@ void llvm::initializeCodeGen(PassRegistry &Registry) {
initializeRegUsageInfoCollectorPass(Registry);
initializeRegUsageInfoPropagationPass(Registry);
initializeRegisterCoalescerPass(Registry);
+ initializeRemoveLoadsIntoFakeUsesPass(Registry);
initializeRemoveRedundantDebugValuesPass(Registry);
initializeRenameIndependentSubregsPass(Registry);
initializeSafeStackLegacyPassPass(Registry);
diff --git a/llvm/lib/CodeGen/RemoveLoadsIntoFakeUses.cpp b/llvm/lib/CodeGen/RemoveLoadsIntoFakeUses.cpp
new file mode 100644
index 00000000000000..0777c774151cd6
--- /dev/null
+++ b/llvm/lib/CodeGen/RemoveLoadsIntoFakeUses.cpp
@@ -0,0 +1,170 @@
+//===---- RemoveLoadsIntoFakeUses.cpp - Remove loads with no real uses ----===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+///
+/// \file
+/// The FAKE_USE instruction is used to preserve certain values through
+/// optimizations for the sake of debugging. This may result in spilled values
+/// being loaded into registers that are only used by FAKE_USEs; this is not
+/// necessary for debugging purposes, because at that point the value must be on
+/// the stack and hence available for debugging. Therefore, this pass removes
+/// loads that are only used by FAKE_USEs.
+///
+/// This pass should run as late as possible, to ensure that we don't
+/// inadvertently shorten stack lifetimes by removing these loads, since the
+/// FAKE_USEs will also no longer be in effect.
+///
+//===----------------------------------------------------------------------===//
+
+#include "llvm/ADT/PostOrderIterator.h"
+#include "llvm/ADT/Statistic.h"
+#include "llvm/CodeGen/LiveRegUnits.h"
+#include "llvm/CodeGen/MachineFunction.h"
+#include "llvm/CodeGen/MachineFunctionPass.h"
+#include "llvm/CodeGen/MachineRegisterInfo.h"
+#include "llvm/CodeGen/TargetSubtargetInfo.h"
+#include "llvm/IR/Function.h"
+#include "llvm/InitializePasses.h"
+#include "llvm/Support/Debug.h"
+
+using namespace llvm;
+
+#define DEBUG_TYPE "remove-loads-into-fake-uses"
+
+STATISTIC(NumLoadsDeleted, "Number of dead load instructions deleted");
+STATISTIC(NumFakeUsesDeleted, "Number of FAKE_USE instructions deleted");
+
+class RemoveLoadsIntoFakeUses : public MachineFunctionPass {
+public:
+ static char ID;
+
+ RemoveLoadsIntoFakeUses() : MachineFunctionPass(ID) {
+ initializeRemoveLoadsIntoFakeUsesPass(*PassRegistry::getPassRegistry());
+ }
+
+ void getAnalysisUsage(AnalysisUsage &AU) const override {
+ AU.setPreservesCFG();
+ MachineFunctionPass::getAnalysisUsage(AU);
+ }
+
+ StringRef getPassName() const override {
+ return "Remove Loads Into Fake Uses";
+ }
+
+ bool runOnMachineFunction(MachineFunction &MF) override;
+};
+
+char RemoveLoadsIntoFakeUses::ID = 0;
+char &llvm::RemoveLoadsIntoFakeUsesID = RemoveLoadsIntoFakeUses::ID;
+
+INITIALIZE_PASS_BEGIN(RemoveLoadsIntoFakeUses, DEBUG_TYPE,
+ "Remove Loads Into Fake Uses", false, false)
+INITIALIZE_PASS_END(RemoveLoadsIntoFakeUses, DEBUG_TYPE,
+ "Remove Loads Into Fake Uses", false, false)
+
+bool RemoveLoadsIntoFakeUses::runOnMachineFunction(MachineFunction &MF) {
+ // Only `optdebug` functions should contain FAKE_USEs, so don't try to run
+ // this for other functions.
+ dbgs() << "We might run the pass!\n";
+ if (!MF.getFunction().hasFnAttribute(Attribute::OptimizeForDebugging) ||
+ skipFunction(MF.getFunction()))
+ return false;
+
+ dbgs() << "We could run the pass!\n";
+ // This implementation assumes we are post-RA.
+ if (!MF.getProperties().hasProperty(
+ MachineFunctionProperties::Property::NoVRegs))
+ return false;
+
+ bool AnyChanges = false;
+
+ LiveRegUnits LivePhysRegs;
+ const MachineRegisterInfo *MRI = &MF.getRegInfo();
+ const TargetSubtargetInfo &ST = MF.getSubtarget();
+ const TargetInstrInfo *TII = ST.getInstrInfo();
+ const TargetRegisterInfo *TRI = ST.getRegisterInfo();
+
+ SmallDenseMap<Register, SmallVector<MachineInstr *>> RegFakeUses;
+ LivePhysRegs.init(*TRI);
+ dbgs() << "Running the pass!\n";
+ SmallVector<MachineInstr *, 16> Statepoints;
+ for (MachineBasicBlock *MBB : post_order(&MF)) {
+ LivePhysRegs.addLiveOuts(*MBB);
+
+ for (MachineInstr &MI : make_early_inc_range(reverse(*MBB))) {
+ if (MI.isFakeUse()) {
+ dbgs() << "Seen fake use " << MI << "\n";
+ for (const MachineOperand &MO : MI.operands()) {
+ // Track the Fake Uses that use this register so that we can delete
+ // them if we delete the corresponding load.
+ if (MO.isReg()) {
+ dbgs() << " Uses Reg: ";
+ TRI->dumpReg(MO.getReg());
+ dbgs() << "\n";
+ RegFakeUses[MO.getReg()].push_back(&MI);
+ }
+ }
+ // Do not record FAKE_USE uses in LivePhysRegs so that we can recognize
+ // otherwise-unused loads.
+ continue;
+ }
+
+ // If the restore size is not std::nullopt then we are dealing with a
+ // reload of a spilled register.
+ if (MI.getRestoreSize(TII)) {
+ dbgs() << "Seen restore of spill " << MI << "\n";
+ Register Reg = MI.getOperand(0).getReg();
+ assert(Reg.isPhysical() && "VReg seen in function with NoVRegs set?");
+ // Don't delete live physreg defs, or any reserved register defs.
+ if (!LivePhysRegs.available(Reg) || MRI->isReserved(Reg)) {
+ dbgs() << "Reg is live: ";
+ TRI->dumpReg(Reg);
+ dbgs() << "\n";
+ continue;
+ }
+      // There should be an exact match between the loaded register and a
+      // FAKE_USE operand; if there is none, this load is unused by anything
+      // and could probably be deleted, but that is outside this pass's scope.
+ if (RegFakeUses.contains(Reg)) {
+ LLVM_DEBUG(dbgs() << "RemoveLoadsIntoFakeUses: DELETING: " << MI);
+ // It is possible that some DBG_VALUE instructions refer to this
+ // instruction. They will be deleted in the live debug variable
+ // analysis.
+ MI.eraseFromParent();
+ AnyChanges = true;
+ ++NumLoadsDeleted;
+ // Each FAKE_USE now appears to be a fake use of the previous value
+ // of the loaded register; delete them to avoid incorrectly
+ // interpreting them as such.
+ for (MachineInstr *FakeUse : RegFakeUses[Reg])
+ FakeUse->eraseFromParent();
+ NumFakeUsesDeleted += RegFakeUses[Reg].size();
+ RegFakeUses[Reg].clear();
+ }
+ continue;
+ }
+
+ // In addition to tracking LivePhysRegs, we need to clear RegFakeUses each
+ // time a register is defined, as existing FAKE_USEs no longer apply to
+ // that register.
+ if (!RegFakeUses.empty()) {
+ for (const MachineOperand &MO : MI.operands()) {
+ if (MO.isReg() && MO.isDef()) {
+ Register Reg = MO.getReg();
+ assert(Reg.isPhysical() &&
+ "VReg seen in function with NoVRegs set?");
+ for (MCRegUnit Unit : TRI->regunits(Reg))
+ RegFakeUses.erase(Unit);
+ }
+ }
+ }
+ LivePhysRegs.stepBackward(MI);
+ }
+ }
+
+ return AnyChanges;
+}
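A schematic sketch of the transformation this pass performs (invented registers and stack slot, not from a real target): a post-RA reload whose only user is a FAKE_USE is deleted together with that FAKE_USE, because the value is still available in its stack slot for debug-location purposes.

```
# Before (post-RA MIR, schematic):
$ebx = MOV32rm $rbp, 1, $noreg, -8, $noreg :: (load (s32) from %stack.0)
FAKE_USE $ebx

# After RemoveLoadsIntoFakeUses:
#   (both instructions removed; LiveDebugValues can still describe the
#    variable via %stack.0)
```

If the reloaded register is live or reserved, or any FAKE_USE operand register is redefined in between, the load and its fake uses are left untouched.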
diff --git a/llvm/lib/CodeGen/TargetPassConfig.cpp b/llvm/lib/CodeGen/TargetPassConfig.cpp
index 1d52ebe6717f04..b025ba7c750096 100644
--- a/llvm/lib/CodeGen/TargetPassConfig.cpp
+++ b/llvm/lib/CodeGen/TargetPassConfig.cpp
@@ -1269,6 +1269,8 @@ void TargetPassConfig::addMachinePasses() {
PM->add(createStackFrameLayoutAnalysisPass());
+ addPass(&RemoveLoadsIntoFakeUsesID);
+
// Add passes that directly emit MI after all other MI passes.
addPreEmitPass2();
diff --git a/llvm/test/CodeGen/AArch64/O0-pipeline.ll b/llvm/test/CodeGen/AArch64/O0-pipeline.ll
index ba611493e1a76e..83e9751459070a 100644
--- a/llvm/test/CodeGen/AArch64/O0-pipeline.ll
+++ b/llvm/test/CodeGen/AArch64/O0-pipeline.ll
@@ -80,6 +80,7 @@
; CHECK-NEXT: Lazy Machine Block Frequency Analysis
; CHECK-NEXT: Machine Optimization Remark Emitter
; CHECK-NEXT: Stack Frame Layout Analysis
+; CHECK-NEXT: Remove Loads Into Fake Uses
; CHECK-NEXT: Unpack machine instruction bundles
; CHECK-NEXT: Lazy Machine Block Frequency Analysis
; CHECK-NEXT: Machine Optimization Remark Emitter
diff --git a/llvm/test/CodeGen/AArch64/O3-pipeline.ll b/llvm/test/CodeGen/AArch64/O3-pipeline.ll
index 3465b717261cf5..75b109ad1e70c7 100644
--- a/llvm/test/CodeGen/AArch64/O3-pipeline.ll
+++ b/llvm/test/CodeGen/AArch64/O3-pipeline.ll
@@ -238,6 +238,7 @@
; CHECK-NEXT: Lazy Machine Block Frequency Analysis
; CHECK-NEXT: Machine Optimization Remark Emitter
; CHECK-NEXT: Stack Frame Layout Analysis
+; CHECK-NEXT: Remove Loads Into Fake Uses
; CHECK-NEXT: Unpack machine instruction bundles
; CHECK-NEXT: Lazy Machine Block Frequency Analysis
; CHECK-NEXT: Machine Optimization Remark Emitter
diff --git a/llvm/test/CodeGen/AMDGPU/llc-pipeline.ll b/llvm/test/CodeGen/AMDGPU/llc-pipeline.ll
index 29e8ebdafb5871..a69463c10413f7 100644
--- a/llvm/test/CodeGen/AMDGPU/llc-pipeline.ll
+++ b/llvm/test/CodeGen/AMDGPU/llc-pipeline.ll
@@ -148,6 +148,7 @@
; GCN-O0-NEXT: Lazy Machine Block Frequency Analysis
; GCN-O0-NEXT: Machine Optimization Remark Emitter
; GCN-O0-NEXT: Stack Frame Layout Analysis
+; GCN-O0-NEXT: Remove Loads Into Fake Uses
; GCN-O0-NEXT: Function register usage analysis
; GCN-O0-NEXT: FunctionPass Manager
; GCN-O0-NEXT: Lazy Machine Block Frequency Analysis
@@ -425,6 +426,7 @@
; GCN-O1-NEXT: Lazy Machine Block Frequency Analysis
; GCN-O1-NEXT: Machine Optimization Remark Emitter
; GCN-O1-NEXT: Stack Frame Layout Analysis
+; GCN-O1-NEXT: Remove Loads Into Fake Uses
; GCN-O1-NEXT: Function register usage analysis
; GCN-O1-NEXT: FunctionPass Manager
; GCN-O1-NEXT: Lazy Machine Block Frequency Analysis
@@ -730,6 +732,7 @@
; GCN-O1-OPTS-NEXT: Lazy Machine Block Frequency Analysis
; GCN-O1-OPTS-NEXT: Machine Optimization Remark Emitter
; GCN-O1-OPTS-NEXT: Stack Frame Layout Analysis
+; GCN-O1-OPTS-NEXT: Remove Loads Into Fake Uses
; GCN-O1-OPTS-NEXT: Function register usage analysis
; GCN-O1-OPTS-NEXT: FunctionPass Manager
; GCN-O1-OPTS-NEXT: Lazy Machine Block Frequency Analysis
@@ -1041,6 +1044,7 @@
; GCN-O2-NEXT: Lazy Machine Block Frequency Analysis
; GCN-O2-NEXT: Machine Optimization Remark Emitter
; GCN-O2-NEXT: Stack Frame Layout Analysis
+; GCN-O2-NEXT: Remove Loads Into Fake Uses
; GCN-O2-NEXT: Function register usage analysis
; GCN-O2-NEXT: FunctionPass Manager
; GCN-O2-NEXT: Lazy Machine Block Frequency Analysis
@@ -1364,6 +1368,7 @@
; GCN-O3-NEXT: Lazy Machine Block Frequency Analysis
; GCN-O3-NEXT: Machine Optimization Remark Emitter
; GCN-O3-NEXT: Stack Frame Layout Analysis
+; GCN-O3-NEXT: Remove Loads Into Fake Uses
; GCN-O3-NEXT: Function register usage analysis
; GCN-O3-NEXT: FunctionPass Manager
; GCN-O3-NEXT: Lazy Machine Block Frequency Analysis
diff --git a/llvm/test/CodeGen/ARM/O3-pipeline.ll b/llvm/test/CodeGen/ARM/O3-pipeline.ll
index 9b983d96f79330..96a13d1b50c9a1 100644
--- a/llvm/test/CodeGen/ARM/O3-pipeline.ll
+++ b/llvm/test/CodeGen/ARM/O3-pipeline.ll
@@ -201,6 +201,7 @@
; CHECK-NEXT: Lazy Machine Block Frequency Analysis
; CHECK-NEXT: Machine Optimization Remark Emitter
; CHECK-NEXT: Stack Frame Layout Analysis
+; CHECK-NEXT: Remove Loads Into Fake Uses
; CHECK-NEXT: ReachingDefAnalysis
; CHECK-NEXT: ARM fix for Cortex-A57 AES Erratum 1742098
; CHECK-NEXT: ARM Branch Targets
diff --git a/llvm/test/CodeGen/LoongArch/O0-pipeline.ll b/llvm/test/CodeGen/LoongArch/O0-pipeline.ll
index 38c1dbcb1075fa..6cae56c5c3a18b 100644
--- a/llvm/test/CodeGen/LoongArch/O0-pipeline.ll
+++ b/llvm/test/CodeGen/LoongArch/O0-pipeline.ll
@@ -68,6 +68,7 @@
; CHECK-NEXT: Lazy Machine Block Frequency Analysis
; CHECK-NEXT: Machine Optimization Remark Emitter
; CHECK-NEXT: Stack Frame Layout Analysis
+; CHECK-NEXT: Remove Loads Into Fake Uses
; CHECK-NEXT: LoongArch pseudo instruction expansion pass
; CHECK-NEXT: LoongArch atomic pseudo instruction expansion pass
; CHECK-NEXT: Lazy Machine Block Frequency Analysis
diff --git a/llvm/test/CodeGen/LoongArch/opt-pipeline.ll b/llvm/test/CodeGen/LoongArch/opt-pipeline.ll
index 50f154663177d0..7f48c4f4c40e2d 100644
--- a/llvm/test/CodeGen/LoongArch/opt-pipeline.ll
+++ b/llvm/test/CodeGen/LoongArch/opt-pipeline.ll
@@ -172,6 +172,7 @@
; LAXX-NEXT: Lazy Machine Block Frequency Analysis
; LAXX-NEXT: Machine Optimization Remark Emitter
; LAXX-NEXT: Stack Frame Layout Analysis
+; LAXX-NEXT: Remove Loads Into Fake Uses
; LAXX-NEXT: LoongArch pseudo instruction expansion pass
; LAXX-NEXT: LoongArch atomic pseudo instruction expansion pass
; LAXX-NEXT: Lazy Machine Block Frequency Analysis
diff --git a/llvm/test/CodeGen/PowerPC/O0-pipeline.ll b/llvm/test/CodeGen/PowerPC/O0-pipeline.ll
index 4a17384e499936..9238586639229d 100644
--- a/llvm/test/CodeGen/PowerPC/O0-pipeline.ll
+++ b/llvm/test/CodeGen/PowerPC/O0-pipeline.ll
@@ -66,6 +66,7 @@
; CHECK-NEXT: Lazy Machine Block Frequency Analysis
; CHECK-NEXT: Machine Optimization Remark Emitter
; CHECK-NEXT: Stack Frame Layout Analysis
+; CHECK-NEXT: Remove Loads Into Fake Uses
; CHECK-NEXT: PowerPC Expand Atomic
; CHECK-NEXT: PowerPC Branch Selector
; CHECK-NEXT: Lazy Machine Block Frequency Analysis
diff --git a/llvm/test/CodeGen/PowerPC/O3-pipeline.ll b/llvm/test/CodeGen/PowerPC/O3-pipeline.ll
index 39b23a57513d9d..195c015e7bd428 100644
--- a/llvm/test/CodeGen/PowerPC/O3-pipeline.ll
+++ b/llvm/test/CodeGen/PowerPC/O3-pipeline.ll
@@ -220,6 +220,7 @@
; CHECK-NEXT: Lazy Machine Block Frequency Analysis
; CHECK-NEXT: Machine Optimization Remark Emitter
; CHECK-NEXT: Stack Frame Layout Analysis
+; CHECK-NEXT: Remove Loads Into Fake Uses
; CHECK-NEXT: PowerPC Expand Atomic
; CHECK-NEXT: PowerPC Branch Selector
; CHECK-NEXT: Lazy Machine Block Frequency Analysis
diff --git a/llvm/test/CodeGen/RISCV/O0-pipeline.ll b/llvm/test/CodeGen/RISCV/O0-pipeline.ll
index 7473809a2c5d2e..d73c9c46818747 100644
--- a/llvm/test/CodeGen/RISCV/O0-pipeline.ll
+++ b/llvm/test/CodeGen/RISCV/O0-pipeline.ll
@@ -69,6 +69,7 @@
; CHECK-NEXT: Lazy Machine Block Frequency Analysis
; CHECK-NEXT: Machine Optimization Remark Emitter
; CHECK-NEXT: Stack Frame Layout Analysis
+; CHECK-NEXT: Remove Loads Into Fake Uses
; CHECK-NEXT: RISC-V Indirect Branch Tracking
; CHECK-NEXT: RISC-V pseudo instruction expansion pass
; CHECK-NEXT: RISC-V atomic pseudo instruction expansion pass
diff --git a/llvm/test/CodeGen/RISCV/O3-pipeline.ll b/llvm/test/CodeGen/RISCV/O3-pipeline.ll
index 44c270fdc3c257..6eb7f92983d2bf 100644
--- a/llvm/test/CodeGen/RISCV/O3-pipeline.ll
+++ b/llvm/test/CodeGen/RISCV/O3-pipeline.ll
@@ -197,6 +197,7 @@
; CHECK-NEXT: Lazy Machine Block Frequency Analysis
; CHECK-NEXT: Machine Optimization Remark Emitter
; CHECK-NEXT: Stack Frame Layout Analysis
+; CHECK-NEXT: Remove Loads Into Fake Uses
; CHECK-NEXT: RISC-V Zcmp move merging pass
; CHECK-NEXT: RISC-V Zcmp Push/Pop optimization pass
; CHECK-NEXT: RISC-V Indirect Branch Tracking
diff --git a/llvm/test/CodeGen/X86/O0-pipeline.ll b/llvm/test/CodeGen/X86/O0-pipeline.ll
index 98b86384b8443d..2bb320c1f30bad 100644
--- a/llvm/test/CodeGen/X86/O0-pipeline.ll
+++ b/llvm/test/CodeGen/X86/O0-pipeline.ll
@@ -77,6 +77,7 @@
; CHECK-NEXT: Lazy Machine Block Frequency Analysis
; CHECK-NEXT: Machine Optimization Remark Emitter
; CHECK-NEXT: Stack Frame Layout Analysis
+; CHECK-NEXT: Remove Loads Into Fake Uses
; CHECK-NEXT: X86 Speculative Execution Side Effect Suppression
; CHECK-NEXT: X86 Indirect Thunks
; CHECK-NEXT: X86 Return Thunks
diff --git a/llvm/test/CodeGen/X86/opt-pipeline.ll b/llvm/test/CodeGen/X86/opt-pipeline.ll
index 12c16a03b134c8..94e9c783a022ba 100644
--- a/llvm/test/CodeGen/X86/opt-pipeline.ll
+++ b/llvm/test/CodeGen/X86/opt-pipeline.ll
@@ -217,6 +217,7 @@
; CHECK-NEXT: Lazy Machine Block Frequency Analysis
; CHECK-NEXT: Machine Optimization Remark Emitter
; CHECK-NEXT: Stack Frame Layout Analysis
+; CHECK-NEXT: Remove Loads Into Fake Uses
; CHECK-NEXT: X86 Speculative Execution Side Effect Suppression
; CHECK-NEXT: X86 Indirect Thunks
; CHECK-NEXT: X86 Return Thunks
diff --git a/llvm/utils/gn/secondary/llvm/lib/CodeGen/BUILD.gn b/llvm/utils/gn/secondary/llvm/lib/CodeGen/BUILD.gn
index bbd1f9af65095a..3a4c7ea03b3aa9 100644
--- a/llvm/utils/gn/secondary/llvm/lib/CodeGen/BUILD.gn
+++ b/llvm/utils/gn/secondary/llvm/lib/CodeGen/BUILD.gn
@@ -201,6 +201,7 @@ static_library("CodeGen") {
"RegisterPressure.cpp",
"RegisterScavenging.cpp",
"RegisterUsageInfo.cpp",
+ "RemoveLoadsIntoFakeUses.cpp",
"RemoveRedundantDebugValues.cpp",
"RenameIndependentSubregs.cpp",
"ReplaceWithVeclib.cpp",
>From 8f219cf71e9e9be83f9f935290391f28956a7028 Mon Sep 17 00:00:00 2001
From: Stephen Tozer <stephen.tozer at sony.com>
Date: Fri, 7 Jun 2024 15:45:43 +0100
Subject: [PATCH 11/20] Remove printf debugging, move RemoveLoads to run before
LDV
---
llvm/lib/CodeGen/RemoveLoadsIntoFakeUses.cpp | 22 ++++++++------------
llvm/lib/CodeGen/TargetPassConfig.cpp | 3 +--
2 files changed, 10 insertions(+), 15 deletions(-)
diff --git a/llvm/lib/CodeGen/RemoveLoadsIntoFakeUses.cpp b/llvm/lib/CodeGen/RemoveLoadsIntoFakeUses.cpp
index 0777c774151cd6..6da391a297e35b 100644
--- a/llvm/lib/CodeGen/RemoveLoadsIntoFakeUses.cpp
+++ b/llvm/lib/CodeGen/RemoveLoadsIntoFakeUses.cpp
@@ -14,9 +14,11 @@
/// the stack and hence available for debugging. Therefore, this pass removes
/// loads that are only used by FAKE_USEs.
///
-/// This pass should run as late as possible, to ensure that we don't
-/// inadvertently shorten stack lifetimes by removing these loads, since the
-/// FAKE_USEs will also no longer be in effect.
+/// This pass should run very late, to ensure that we don't inadvertently
+/// shorten stack lifetimes by removing these loads, since the FAKE_USEs will
+/// also no longer be in effect. Running immediately before LiveDebugValues
+/// ensures that LDV will have accurate information about the machine locations
+/// of debug values.
///
//===----------------------------------------------------------------------===//
@@ -69,12 +71,10 @@ INITIALIZE_PASS_END(RemoveLoadsIntoFakeUses, DEBUG_TYPE,
bool RemoveLoadsIntoFakeUses::runOnMachineFunction(MachineFunction &MF) {
// Only `optdebug` functions should contain FAKE_USEs, so don't try to run
// this for other functions.
- dbgs() << "We might run the pass!\n";
if (!MF.getFunction().hasFnAttribute(Attribute::OptimizeForDebugging) ||
skipFunction(MF.getFunction()))
return false;
- dbgs() << "We could run the pass!\n";
// This implementation assumes we are post-RA.
if (!MF.getProperties().hasProperty(
MachineFunctionProperties::Property::NoVRegs))
@@ -90,21 +90,17 @@ bool RemoveLoadsIntoFakeUses::runOnMachineFunction(MachineFunction &MF) {
SmallDenseMap<Register, SmallVector<MachineInstr *>> RegFakeUses;
LivePhysRegs.init(*TRI);
- dbgs() << "Running the pass!\n";
SmallVector<MachineInstr *, 16> Statepoints;
for (MachineBasicBlock *MBB : post_order(&MF)) {
LivePhysRegs.addLiveOuts(*MBB);
for (MachineInstr &MI : make_early_inc_range(reverse(*MBB))) {
if (MI.isFakeUse()) {
- dbgs() << "Seen fake use " << MI << "\n";
for (const MachineOperand &MO : MI.operands()) {
// Track the Fake Uses that use this register so that we can delete
// them if we delete the corresponding load.
if (MO.isReg()) {
- dbgs() << " Uses Reg: ";
TRI->dumpReg(MO.getReg());
- dbgs() << "\n";
RegFakeUses[MO.getReg()].push_back(&MI);
}
}
@@ -116,14 +112,11 @@ bool RemoveLoadsIntoFakeUses::runOnMachineFunction(MachineFunction &MF) {
// If the restore size is not std::nullopt then we are dealing with a
// reload of a spilled register.
if (MI.getRestoreSize(TII)) {
- dbgs() << "Seen restore of spill " << MI << "\n";
Register Reg = MI.getOperand(0).getReg();
assert(Reg.isPhysical() && "VReg seen in function with NoVRegs set?");
// Don't delete live physreg defs, or any reserved register defs.
if (!LivePhysRegs.available(Reg) || MRI->isReserved(Reg)) {
- dbgs() << "Reg is live: ";
TRI->dumpReg(Reg);
- dbgs() << "\n";
continue;
}
// There should be an exact match between the loaded register and the
@@ -140,8 +133,11 @@ bool RemoveLoadsIntoFakeUses::runOnMachineFunction(MachineFunction &MF) {
// Each FAKE_USE now appears to be a fake use of the previous value
// of the loaded register; delete them to avoid incorrectly
// interpreting them as such.
- for (MachineInstr *FakeUse : RegFakeUses[Reg])
+ for (MachineInstr *FakeUse : RegFakeUses[Reg]) {
+ LLVM_DEBUG(dbgs()
+ << "RemoveLoadsIntoFakeUses: DELETING: " << *FakeUse);
FakeUse->eraseFromParent();
+ }
NumFakeUsesDeleted += RegFakeUses[Reg].size();
RegFakeUses[Reg].clear();
}
diff --git a/llvm/lib/CodeGen/TargetPassConfig.cpp b/llvm/lib/CodeGen/TargetPassConfig.cpp
index b025ba7c750096..c0b834650d73b0 100644
--- a/llvm/lib/CodeGen/TargetPassConfig.cpp
+++ b/llvm/lib/CodeGen/TargetPassConfig.cpp
@@ -1206,6 +1206,7 @@ void TargetPassConfig::addMachinePasses() {
// addPreEmitPass. Maybe only pass "false" here for those targets?
addPass(&FuncletLayoutID);
+ addPass(&RemoveLoadsIntoFakeUsesID);
addPass(&StackMapLivenessID);
addPass(&LiveDebugValuesID);
addPass(&MachineSanitizerBinaryMetadataID);
@@ -1269,8 +1270,6 @@ void TargetPassConfig::addMachinePasses() {
PM->add(createStackFrameLayoutAnalysisPass());
- addPass(&RemoveLoadsIntoFakeUsesID);
-
// Add passes that directly emit MI after all other MI passes.
addPreEmitPass2();
>From 33e126305859a67721afb3fca543eb528c4fd08b Mon Sep 17 00:00:00 2001
From: Stephen Tozer <stephen.tozer at sony.com>
Date: Mon, 17 Jun 2024 18:15:54 +0100
Subject: [PATCH 12/20] Re-add pointer-escaping functionality
---
llvm/include/llvm/Analysis/PtrUseVisitor.h | 7 ++++++
llvm/test/Transforms/SROA/fake-use-escape.ll | 23 ++++++++++++++++++++
2 files changed, 30 insertions(+)
create mode 100644 llvm/test/Transforms/SROA/fake-use-escape.ll
diff --git a/llvm/include/llvm/Analysis/PtrUseVisitor.h b/llvm/include/llvm/Analysis/PtrUseVisitor.h
index e49ec454404bb5..459781028513a2 100644
--- a/llvm/include/llvm/Analysis/PtrUseVisitor.h
+++ b/llvm/include/llvm/Analysis/PtrUseVisitor.h
@@ -278,7 +278,14 @@ class PtrUseVisitor : protected InstVisitor<DerivedT>,
default:
return Base::visitIntrinsicInst(II);
+ // We escape pointers used by a fake_use to prevent SROA from transforming
+ // them.
+ // FIXME: This is only needed for typed pointers; see if there's a better
+ // way to handle this.
case Intrinsic::fake_use:
+ PI.setEscaped(&II);
+ return;
+
case Intrinsic::lifetime_start:
case Intrinsic::lifetime_end:
return; // No-op intrinsics.
diff --git a/llvm/test/Transforms/SROA/fake-use-escape.ll b/llvm/test/Transforms/SROA/fake-use-escape.ll
new file mode 100644
index 00000000000000..c8f83186e0b43c
--- /dev/null
+++ b/llvm/test/Transforms/SROA/fake-use-escape.ll
@@ -0,0 +1,23 @@
+; RUN: opt -S -passes=sroa %s | FileCheck %s
+;
+;; Check that we do not assert and that we retain the fake_use instruction that
+;; uses the address of bar.
+;
+; CHECK: define{{.*}}foo
+; CHECK: call{{.*llvm\.fake\.use.*}}%bar.addr
+
+define void @_Z3fooPi(i32* %bar) {
+entry:
+ %bar.addr = alloca i32*, align 8
+ %baz = alloca i8**, align 8
+ store i32* %bar, i32** %bar.addr, align 8
+ %0 = bitcast i32** %bar.addr to i8**
+ store i8** %0, i8*** %baz, align 8
+ %1 = load i32*, i32** %bar.addr, align 8
+ call void (...) @llvm.fake.use(i32* %1)
+ %2 = load i8**, i8*** %baz, align 8
+ call void (...) @llvm.fake.use(i8** %2)
+ ret void
+}
+
+declare void @llvm.fake.use(...)
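The test above is written with typed pointers, which is what the FIXME in PtrUseVisitor.h refers to. As a hypothetical, untested sketch, an opaque-pointer version of the same check would drop the bitcast but still rely on fake_use escaping the address:

```llvm
; RUN: opt -S -passes=sroa %s | FileCheck %s
; CHECK: %bar.addr = alloca ptr
; CHECK: call{{.*}}llvm.fake.use(ptr

define void @foo(ptr %bar) {
entry:
  %bar.addr = alloca ptr, align 8
  store ptr %bar, ptr %bar.addr, align 8
  %0 = load ptr, ptr %bar.addr, align 8
  call void (...) @llvm.fake.use(ptr %0)
  ret void
}

declare void @llvm.fake.use(...)
```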
>From 6f5c13e2e58d345b2c93a005e11556738e673f3e Mon Sep 17 00:00:00 2001
From: Stephen Tozer <stephen.tozer at sony.com>
Date: Tue, 18 Jun 2024 10:08:43 +0100
Subject: [PATCH 13/20] Remove unused variable
---
llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp | 1 -
1 file changed, 1 deletion(-)
diff --git a/llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp b/llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
index 67f801c06f997b..19d23c8ba96783 100644
--- a/llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
+++ b/llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
@@ -1728,7 +1728,6 @@ void AsmPrinter::emitFunctionBody() {
bool HasAnyRealCode = false;
int NumInstsInFunction = 0;
bool IsEHa = MMI->getModule()->getModuleFlag("eh-asynch");
- const TargetInstrInfo *TII = MF->getSubtarget().getInstrInfo();
bool CanDoExtraAnalysis = ORE->allowExtraAnalysis(DEBUG_TYPE);
for (auto &MBB : *MF) {
>From 407abe1735e299e6478f47523bf5b8c75e171035 Mon Sep 17 00:00:00 2001
From: Stephen Tozer <stephen.tozer at sony.com>
Date: Fri, 16 Aug 2024 18:11:09 +0100
Subject: [PATCH 14/20] Use getRequiredProperties to check NoVRegs
---
llvm/lib/CodeGen/RemoveLoadsIntoFakeUses.cpp | 10 +++++-----
1 file changed, 5 insertions(+), 5 deletions(-)
diff --git a/llvm/lib/CodeGen/RemoveLoadsIntoFakeUses.cpp b/llvm/lib/CodeGen/RemoveLoadsIntoFakeUses.cpp
index 6da391a297e35b..aa491b776b5399 100644
--- a/llvm/lib/CodeGen/RemoveLoadsIntoFakeUses.cpp
+++ b/llvm/lib/CodeGen/RemoveLoadsIntoFakeUses.cpp
@@ -53,6 +53,11 @@ class RemoveLoadsIntoFakeUses : public MachineFunctionPass {
MachineFunctionPass::getAnalysisUsage(AU);
}
+ MachineFunctionProperties getRequiredProperties() const override {
+ return MachineFunctionProperties().set(
+ MachineFunctionProperties::Property::NoVRegs);
+ }
+
StringRef getPassName() const override {
return "Remove Loads Into Fake Uses";
}
@@ -75,11 +80,6 @@ bool RemoveLoadsIntoFakeUses::runOnMachineFunction(MachineFunction &MF) {
skipFunction(MF.getFunction()))
return false;
- // This implementation assumes we are post-RA.
- if (!MF.getProperties().hasProperty(
- MachineFunctionProperties::Property::NoVRegs))
- return false;
-
bool AnyChanges = false;
LiveRegUnits LivePhysRegs;
>From de496bcbf0dcfe189d735219dc5d65ebd7d4164b Mon Sep 17 00:00:00 2001
From: Stephen Tozer <stephen.tozer at sony.com>
Date: Fri, 16 Aug 2024 18:24:52 +0100
Subject: [PATCH 15/20] Remove debug prints
---
llvm/lib/CodeGen/RemoveLoadsIntoFakeUses.cpp | 8 ++------
1 file changed, 2 insertions(+), 6 deletions(-)
diff --git a/llvm/lib/CodeGen/RemoveLoadsIntoFakeUses.cpp b/llvm/lib/CodeGen/RemoveLoadsIntoFakeUses.cpp
index aa491b776b5399..232181a199b8c2 100644
--- a/llvm/lib/CodeGen/RemoveLoadsIntoFakeUses.cpp
+++ b/llvm/lib/CodeGen/RemoveLoadsIntoFakeUses.cpp
@@ -99,10 +99,8 @@ bool RemoveLoadsIntoFakeUses::runOnMachineFunction(MachineFunction &MF) {
for (const MachineOperand &MO : MI.operands()) {
// Track the Fake Uses that use this register so that we can delete
// them if we delete the corresponding load.
- if (MO.isReg()) {
- TRI->dumpReg(MO.getReg());
+ if (MO.isReg())
RegFakeUses[MO.getReg()].push_back(&MI);
- }
}
// Do not record FAKE_USE uses in LivePhysRegs so that we can recognize
// otherwise-unused loads.
@@ -115,10 +113,8 @@ bool RemoveLoadsIntoFakeUses::runOnMachineFunction(MachineFunction &MF) {
Register Reg = MI.getOperand(0).getReg();
assert(Reg.isPhysical() && "VReg seen in function with NoVRegs set?");
// Don't delete live physreg defs, or any reserved register defs.
- if (!LivePhysRegs.available(Reg) || MRI->isReserved(Reg)) {
- TRI->dumpReg(Reg);
+ if (!LivePhysRegs.available(Reg) || MRI->isReserved(Reg))
continue;
- }
      // There should be an exact match between the loaded register and a
      // FAKE_USE operand; if there is none, this load is unused by anything
      // and could probably be deleted, but that is outside this pass's scope.
>From e61321f46897a8f4d2b835a89a42cf5b84a29e62 Mon Sep 17 00:00:00 2001
From: Stephen Tozer <stephen.tozer at sony.com>
Date: Fri, 16 Aug 2024 18:55:30 +0100
Subject: [PATCH 16/20] Update check-fake-use.py
---
.../DebugInfo/X86/Inputs/check-fake-use.py | 187 ++++++++----------
1 file changed, 85 insertions(+), 102 deletions(-)
diff --git a/llvm/test/DebugInfo/X86/Inputs/check-fake-use.py b/llvm/test/DebugInfo/X86/Inputs/check-fake-use.py
index 600671ad088bac..d57d02281fe331 100644
--- a/llvm/test/DebugInfo/X86/Inputs/check-fake-use.py
+++ b/llvm/test/DebugInfo/X86/Inputs/check-fake-use.py
@@ -1,119 +1,102 @@
#!/usr/bin/python3
# Parsing dwarfdump's output to determine whether the location list for the
-# parameter "b" covers all of the function. The script is written in form of a
-# state machine and expects that dwarfdump output adheres to a certain order:
-# 1) The .debug_info section must appear before the .debug_loc section.
-# 2) The DW_AT_location attribute must appear before the parameter's name in the
-# formal parameter DIE.
-#
+# parameter "b" covers all of the function. The script searches for information
+# in the input file to determine the [prologue, epilogue) range for the
+# function, the location list range for "b", and checks that the latter covers
+# the entirety of the former.
import re
import sys
-from enum import IntEnum, auto
-
DebugInfoPattern = r"\.debug_info contents:"
-SubprogramPattern = r"^0x[0-9a-f]+:\s+DW_TAG_subprogram"
+DebugLinePattern = r"\.debug_line contents:"
ProloguePattern = r"^\s*0x([0-9a-f]+)\s.+prologue_end"
EpiloguePattern = r"^\s*0x([0-9a-f]+)\s.+epilogue_begin"
FormalPattern = r"^0x[0-9a-f]+:\s+DW_TAG_formal_parameter"
-LocationPattern = r"DW_AT_location\s+\[DW_FORM_sec_offset\].*0x([a-f0-9]+)"
+LocationPattern = r"DW_AT_location\s+\[DW_FORM_([a-z_]+)\](?:.*0x([a-f0-9]+))"
DebugLocPattern = r'\[0x([a-f0-9]+),\s+0x([a-f0-9]+)\) ".text":'
-# States
-class States(IntEnum):
- LookingForDebugInfo = 0
- LookingForSubProgram = auto()
- LookingForFormal = auto()
- LookingForLocation = auto()
- DebugLocations = auto()
- LookingForPrologue = auto()
- LookingForEpilogue = auto()
- AllDone = auto()
-
-# For each state, the state table contains 3-item sublists with the following
-# entries:
-# 1) The regex pattern we use in each state.
-# 2) The state we enter when we have a successful match for the current pattern.
-# 3) The state we enter when we do not have a successful match for the
-# current pattern.
-StateTable = [
- # LookingForDebugInfo
- [DebugInfoPattern, States.LookingForSubProgram, States.LookingForDebugInfo],
- # LookingForSubProgram
- [SubprogramPattern, States.LookingForFormal, States.LookingForSubProgram],
- # LookingForFormal
- [FormalPattern, States.LookingForLocation, States.LookingForFormal],
- # LookingForLocation
- [LocationPattern, States.DebugLocations, States.LookingForFormal],
- # DebugLocations
- [DebugLocPattern, States.DebugLocations, States.LookingForPrologue],
- # LookingForPrologue
- [ProloguePattern, States.LookingForEpilogue, States.LookingForPrologue],
- # LookingForEpilogue
- [EpiloguePattern, States.AllDone, States.LookingForEpilogue],
- # AllDone
- [None, States.AllDone, States.AllDone],
-]
-
-# Symbolic indices
-StatePattern = 0
-NextState = 1
-FailState = 2
+SeenDebugInfo = False
+SeenDebugLine = False
+LocationRanges = None
+PrologueEnd = None
+EpilogueBegin = None
-State = States.LookingForDebugInfo
-FirstBeginOffset = -1
-ContinuousLocation = True
-LocationBreak = ()
-LocationRanges = []
-
-# Read output from file provided as command arg
+# The dwarfdump output should contain the DW_AT_location for "b" first, then the
+# line table which should contain prologue_end and epilogue_begin entries.
with open(sys.argv[1], "r") as dwarf_dump_file:
- for line in dwarf_dump_file:
- if State == States.AllDone:
- break
- Pattern = StateTable[State][StatePattern]
- m = re.search(Pattern, line)
- if m:
- # Match. Depending on the state, we extract various values.
- if State == States.LookingForPrologue:
- PrologueEnd = int(m.group(1), 16)
- elif State == States.LookingForEpilogue:
- EpilogueBegin = int(m.group(1), 16)
- elif State == States.DebugLocations:
- # Extract the range values
- if FirstBeginOffset == -1:
- FirstBeginOffset = int(m.group(1), 16)
- else:
- NewBeginOffset = int(m.group(1), 16)
- if NewBeginOffset != EndOffset:
- ContinuousLocation = False
- LocationBreak = (EndOffset, NewBeginOffset)
- EndOffset = int(m.group(2), 16)
- State = StateTable[State][NextState]
- else:
- State = StateTable[State][FailState]
+ dwarf_iter = iter(dwarf_dump_file)
+ for line in dwarf_iter:
+ if not SeenDebugInfo and re.match(DebugInfoPattern, line):
+ SeenDebugInfo = True
+ if not SeenDebugLine and re.match(DebugLinePattern, line):
+ SeenDebugLine = True
+ # Get the range of DW_AT_location for "b".
+ if LocationRanges is None:
+ if match := re.match(FormalPattern, line):
+ # Go until we either find DW_AT_location or reach the end of this entry.
+ location_match = None
+ while location_match is None:
+ if (line := next(dwarf_iter, "")) == "\n":
+ raise RuntimeError(
+ ".debug_info output is missing DW_AT_location for 'b'"
+ )
+ location_match = re.search(LocationPattern, line)
+ # Variable has whole-scope location, represented by an empty tuple.
+ if location_match.group(1) == "exprloc":
+ LocationRanges = ()
+ continue
+ if location_match.group(1) != "sec_offset":
+ raise RuntimeError(
+ f"Unhandled form for DW_AT_location: DW_FORM_{location_match.group(1)}"
+ )
+ # Variable has location range list.
+ if (
+ debug_loc_match := re.search(DebugLocPattern, next(dwarf_iter, ""))
+ ) is None:
+                    raise RuntimeError("Invalid location range list for 'b'")
+ LocationRanges = (
+ int(debug_loc_match.group(1), 16),
+ int(debug_loc_match.group(2), 16),
+ )
+ while (
+ debug_loc_match := re.search(DebugLocPattern, next(dwarf_iter, ""))
+ ) is not None:
+ match_loc_start = int(debug_loc_match.group(1), 16)
+ match_loc_end = int(debug_loc_match.group(2), 16)
+ if match_loc_start != LocationRanges[1]:
+ raise RuntimeError(
+ f"Location list for 'b' is discontinuous from [0x{LocationRanges[1]:x}, 0x{match_loc_start:x})"
+ )
+ LocationRanges = (LocationRanges[0], match_loc_end)
+ # Get the prologue_end address.
+ elif PrologueEnd is None:
+ if match := re.match(ProloguePattern, line):
+ PrologueEnd = int(match.group(1), 16)
+ # Get the epilogue_begin address.
+ elif EpilogueBegin is None:
+ if match := re.match(EpiloguePattern, line):
+ EpilogueBegin = int(match.group(1), 16)
+ break
-Success = True
+if not SeenDebugInfo:
+ raise RuntimeError(".debug_info section not found.")
+if not SeenDebugLine:
+ raise RuntimeError(".debug_line section not found.")
-# Check that the first entry start with 0 and that the last ending address
-# in our location list is close to the high pc of the subprogram.
-if State != States.AllDone:
- print("Error in expected sequence of DWARF information:")
- print(" State = %d\n" % State)
- Success = False
-elif FirstBeginOffset == -1:
- print("Location list for 'b' not found, did the debug info format change?")
- Success = False
-elif not ContinuousLocation:
- print("Location list for 'b' is discontinuous from [0x%x, 0x%x)" % LocationBreak)
- Success = False
-elif FirstBeginOffset > PrologueEnd or EndOffset < EpilogueBegin:
- print("Location list for 'b' does not cover the whole function:")
- print(
- "Prologue to Epilogue = [0x%x, 0x%x), Location range = [0x%x, 0x%x)"
- % (PrologueEnd, EpilogueBegin, FirstBeginOffset, EndOffset)
- )
- Success = False
+if LocationRanges is None:
+ raise RuntimeError(".debug_info output is missing parameter 'b'")
+if PrologueEnd is None:
+ raise RuntimeError(".debug_line output is missing prologue_end")
+if EpilogueBegin is None:
+ raise RuntimeError(".debug_line output is missing epilogue_begin")
-sys.exit(not Success)
+if len(LocationRanges) == 2 and (
+ LocationRanges[0] > PrologueEnd or LocationRanges[1] < EpilogueBegin
+):
+ raise RuntimeError(
+        f"""Location list for 'b' does not cover the whole function:
+ Prologue to Epilogue = [0x{PrologueEnd:x}, 0x{EpilogueBegin:x})
+ Location range = [0x{LocationRanges[0]:x}, 0x{LocationRanges[1]:x})
+"""
+ )
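The range-merging logic that replaces the old state machine can be exercised in isolation. The following is a minimal sketch, assuming the same `DebugLocPattern` regex the script uses for `.debug_loc` entries; the sample dwarfdump-style lines are hypothetical, for illustration only:

```python
import re

# Same pattern the rewritten script uses to match .debug_loc entries.
DebugLocPattern = r'\[0x([a-f0-9]+),\s+0x([a-f0-9]+)\) ".text":'

def merge_ranges(lines):
    """Fold consecutive [start, end) entries into one covering tuple,
    raising on a gap -- mirroring the script's discontinuity check."""
    ranges = None
    for line in lines:
        m = re.search(DebugLocPattern, line)
        if m is None:
            break
        start, end = int(m.group(1), 16), int(m.group(2), 16)
        if ranges is None:
            ranges = (start, end)
        elif start != ranges[1]:
            raise RuntimeError(
                f"Location list is discontinuous from [0x{ranges[1]:x}, 0x{start:x})"
            )
        else:
            ranges = (ranges[0], end)
    return ranges

# Hypothetical dwarfdump-style output; real input comes from llvm-dwarfdump.
sample = [
    '  [0x0000000000000000, 0x0000000000000004) ".text": DW_OP_reg5 RDI',
    '  [0x0000000000000004, 0x0000000000000010) ".text": DW_OP_reg3 RBX',
]
print(merge_ranges(sample))  # (0, 16)
```

A contiguous list collapses into a single `(start, end)` tuple, which is what the final prologue/epilogue coverage check consumes.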
From b41a690eab9bc926e01c776c52778573279c8836 Mon Sep 17 00:00:00 2001
From: Stephen Tozer <stephen.tozer at sony.com>
Date: Mon, 19 Aug 2024 12:40:44 +0100
Subject: [PATCH 17/20] Rebase fixes
- Correct check order in pipeline tests
- Disable new pass for targets that keep VRegs
- Remove comment stating that ptr escaping is unused for opaque ptrs
- Fix-up some minor issues with the fake-use tests
---
llvm/include/llvm/Analysis/PtrUseVisitor.h | 2 --
llvm/lib/Target/NVPTX/NVPTXTargetMachine.cpp | 1 +
llvm/lib/Target/SPIRV/SPIRVTargetMachine.cpp | 1 +
.../WebAssembly/WebAssemblyTargetMachine.cpp | 1 +
.../ScalarEvolution/flags-from-poison-dbg.ll | 2 +-
llvm/test/CodeGen/AArch64/O0-pipeline.ll | 2 +-
llvm/test/CodeGen/AArch64/O3-pipeline.ll | 2 +-
llvm/test/CodeGen/AMDGPU/llc-pipeline.ll | 10 +++++-----
llvm/test/CodeGen/ARM/O3-pipeline.ll | 2 +-
llvm/test/CodeGen/LoongArch/O0-pipeline.ll | 2 +-
llvm/test/CodeGen/LoongArch/opt-pipeline.ll | 2 +-
llvm/test/CodeGen/PowerPC/O0-pipeline.ll | 2 +-
llvm/test/CodeGen/PowerPC/O3-pipeline.ll | 2 +-
llvm/test/CodeGen/RISCV/O0-pipeline.ll | 2 +-
llvm/test/CodeGen/RISCV/O3-pipeline.ll | 2 +-
llvm/test/CodeGen/X86/O0-pipeline.ll | 2 +-
llvm/test/CodeGen/X86/fake-use-vector.ll | 6 +++---
llvm/test/CodeGen/X86/opt-pipeline.ll | 2 +-
llvm/test/DebugInfo/X86/fake-use.ll | 2 +-
llvm/test/Transforms/SROA/fake-use-escape.ll | 20 +++++++++----------
20 files changed, 33 insertions(+), 34 deletions(-)
diff --git a/llvm/include/llvm/Analysis/PtrUseVisitor.h b/llvm/include/llvm/Analysis/PtrUseVisitor.h
index 459781028513a2..f5c23b1b4e014d 100644
--- a/llvm/include/llvm/Analysis/PtrUseVisitor.h
+++ b/llvm/include/llvm/Analysis/PtrUseVisitor.h
@@ -280,8 +280,6 @@ class PtrUseVisitor : protected InstVisitor<DerivedT>,
// We escape pointers used by a fake_use to prevent SROA from transforming
// them.
- // FIXME: This is only needed typed pointers; see if there's a better way to
- // handle this.
case Intrinsic::fake_use:
PI.setEscaped(&II);
return;
diff --git a/llvm/lib/Target/NVPTX/NVPTXTargetMachine.cpp b/llvm/lib/Target/NVPTX/NVPTXTargetMachine.cpp
index 097e29527eed9f..e86d3771bd2f26 100644
--- a/llvm/lib/Target/NVPTX/NVPTXTargetMachine.cpp
+++ b/llvm/lib/Target/NVPTX/NVPTXTargetMachine.cpp
@@ -315,6 +315,7 @@ void NVPTXPassConfig::addIRPasses() {
disablePass(&FuncletLayoutID);
disablePass(&PatchableFunctionID);
disablePass(&ShrinkWrapID);
+ disablePass(&RemoveLoadsIntoFakeUsesID);
addPass(createNVPTXAAWrapperPass());
addPass(createExternalAAWrapperPass([](Pass &P, Function &, AAResults &AAR) {
diff --git a/llvm/lib/Target/SPIRV/SPIRVTargetMachine.cpp b/llvm/lib/Target/SPIRV/SPIRVTargetMachine.cpp
index 48a2ce89bad390..7058b15d53aa0b 100644
--- a/llvm/lib/Target/SPIRV/SPIRVTargetMachine.cpp
+++ b/llvm/lib/Target/SPIRV/SPIRVTargetMachine.cpp
@@ -140,6 +140,7 @@ void SPIRVPassConfig::addPostRegAlloc() {
disablePass(&ShrinkWrapID);
disablePass(&LiveDebugValuesID);
disablePass(&MachineLateInstrsCleanupID);
+ disablePass(&RemoveLoadsIntoFakeUsesID);
// Do not work with OpPhi.
disablePass(&BranchFolderPassID);
diff --git a/llvm/lib/Target/WebAssembly/WebAssemblyTargetMachine.cpp b/llvm/lib/Target/WebAssembly/WebAssemblyTargetMachine.cpp
index 23539a5f4b26f1..73765f8fa0092c 100644
--- a/llvm/lib/Target/WebAssembly/WebAssemblyTargetMachine.cpp
+++ b/llvm/lib/Target/WebAssembly/WebAssemblyTargetMachine.cpp
@@ -552,6 +552,7 @@ void WebAssemblyPassConfig::addPostRegAlloc() {
disablePass(&StackMapLivenessID);
disablePass(&PatchableFunctionID);
disablePass(&ShrinkWrapID);
+ disablePass(&RemoveLoadsIntoFakeUsesID);
// This pass hurts code size for wasm because it can generate irreducible
// control flow.
diff --git a/llvm/test/Analysis/ScalarEvolution/flags-from-poison-dbg.ll b/llvm/test/Analysis/ScalarEvolution/flags-from-poison-dbg.ll
index 2370fe1468b4ed..5d304d1569d3f6 100644
--- a/llvm/test/Analysis/ScalarEvolution/flags-from-poison-dbg.ll
+++ b/llvm/test/Analysis/ScalarEvolution/flags-from-poison-dbg.ll
@@ -16,7 +16,7 @@ for.body.lr.ph: ; preds = %entry
for.body: ; preds = %for.inc, %for.body.lr.ph
%i.02 = phi i32 [ 0, %for.body.lr.ph ], [ %inc, %for.inc ]
%add = add nsw i32 %i.02, 50, !dbg !16
- call void @llvm.dbg.value(metadata i32 %add, i64 0, metadata !18, metadata !19), !dbg !20
+ tail call void @llvm.dbg.value(metadata i32 %add, i64 0, metadata !18, metadata !19), !dbg !20
%idxprom = sext i32 %add to i64, !dbg !21
; CHECK: %idxprom = sext i32 %add to i64
diff --git a/llvm/test/CodeGen/AArch64/O0-pipeline.ll b/llvm/test/CodeGen/AArch64/O0-pipeline.ll
index 83e9751459070a..ec6f24a5650fa3 100644
--- a/llvm/test/CodeGen/AArch64/O0-pipeline.ll
+++ b/llvm/test/CodeGen/AArch64/O0-pipeline.ll
@@ -69,6 +69,7 @@
; CHECK-NEXT: Implement the 'patchable-function' attribute
; CHECK-NEXT: Workaround A53 erratum 835769 pass
; CHECK-NEXT: Contiguously Lay Out Funclets
+; CHECK-NEXT: Remove Loads Into Fake Uses
; CHECK-NEXT: StackMap Liveness Analysis
; CHECK-NEXT: Live DEBUG_VALUE analysis
; CHECK-NEXT: Machine Sanitizer Binary Metadata
@@ -80,7 +81,6 @@
; CHECK-NEXT: Lazy Machine Block Frequency Analysis
; CHECK-NEXT: Machine Optimization Remark Emitter
; CHECK-NEXT: Stack Frame Layout Analysis
-; CHECK-NEXT: Remove Loads Into Fake Uses
; CHECK-NEXT: Unpack machine instruction bundles
; CHECK-NEXT: Lazy Machine Block Frequency Analysis
; CHECK-NEXT: Machine Optimization Remark Emitter
diff --git a/llvm/test/CodeGen/AArch64/O3-pipeline.ll b/llvm/test/CodeGen/AArch64/O3-pipeline.ll
index 75b109ad1e70c7..ffbe3dd377109f 100644
--- a/llvm/test/CodeGen/AArch64/O3-pipeline.ll
+++ b/llvm/test/CodeGen/AArch64/O3-pipeline.ll
@@ -224,6 +224,7 @@
; CHECK-NEXT: Machine Copy Propagation Pass
; CHECK-NEXT: Workaround A53 erratum 835769 pass
; CHECK-NEXT: Contiguously Lay Out Funclets
+; CHECK-NEXT: Remove Loads Into Fake Uses
; CHECK-NEXT: StackMap Liveness Analysis
; CHECK-NEXT: Live DEBUG_VALUE analysis
; CHECK-NEXT: Machine Sanitizer Binary Metadata
@@ -238,7 +239,6 @@
; CHECK-NEXT: Lazy Machine Block Frequency Analysis
; CHECK-NEXT: Machine Optimization Remark Emitter
; CHECK-NEXT: Stack Frame Layout Analysis
-; CHECK-NEXT: Remove Loads Into Fake Uses
; CHECK-NEXT: Unpack machine instruction bundles
; CHECK-NEXT: Lazy Machine Block Frequency Analysis
; CHECK-NEXT: Machine Optimization Remark Emitter
diff --git a/llvm/test/CodeGen/AMDGPU/llc-pipeline.ll b/llvm/test/CodeGen/AMDGPU/llc-pipeline.ll
index a69463c10413f7..7bf1b8746fd87b 100644
--- a/llvm/test/CodeGen/AMDGPU/llc-pipeline.ll
+++ b/llvm/test/CodeGen/AMDGPU/llc-pipeline.ll
@@ -143,12 +143,12 @@
; GCN-O0-NEXT: Post RA hazard recognizer
; GCN-O0-NEXT: Branch relaxation pass
; GCN-O0-NEXT: Register Usage Information Collector Pass
+; GCN-O0-NEXT: Remove Loads Into Fake Uses
; GCN-O0-NEXT: Live DEBUG_VALUE analysis
; GCN-O0-NEXT: Machine Sanitizer Binary Metadata
; GCN-O0-NEXT: Lazy Machine Block Frequency Analysis
; GCN-O0-NEXT: Machine Optimization Remark Emitter
; GCN-O0-NEXT: Stack Frame Layout Analysis
-; GCN-O0-NEXT: Remove Loads Into Fake Uses
; GCN-O0-NEXT: Function register usage analysis
; GCN-O0-NEXT: FunctionPass Manager
; GCN-O0-NEXT: Lazy Machine Block Frequency Analysis
@@ -421,12 +421,12 @@
; GCN-O1-NEXT: AMDGPU Insert Delay ALU
; GCN-O1-NEXT: Branch relaxation pass
; GCN-O1-NEXT: Register Usage Information Collector Pass
+; GCN-O1-NEXT: Remove Loads Into Fake Uses
; GCN-O1-NEXT: Live DEBUG_VALUE analysis
; GCN-O1-NEXT: Machine Sanitizer Binary Metadata
; GCN-O1-NEXT: Lazy Machine Block Frequency Analysis
; GCN-O1-NEXT: Machine Optimization Remark Emitter
; GCN-O1-NEXT: Stack Frame Layout Analysis
-; GCN-O1-NEXT: Remove Loads Into Fake Uses
; GCN-O1-NEXT: Function register usage analysis
; GCN-O1-NEXT: FunctionPass Manager
; GCN-O1-NEXT: Lazy Machine Block Frequency Analysis
@@ -727,12 +727,12 @@
; GCN-O1-OPTS-NEXT: AMDGPU Insert Delay ALU
; GCN-O1-OPTS-NEXT: Branch relaxation pass
; GCN-O1-OPTS-NEXT: Register Usage Information Collector Pass
+; GCN-O1-OPTS-NEXT: Remove Loads Into Fake Uses
; GCN-O1-OPTS-NEXT: Live DEBUG_VALUE analysis
; GCN-O1-OPTS-NEXT: Machine Sanitizer Binary Metadata
; GCN-O1-OPTS-NEXT: Lazy Machine Block Frequency Analysis
; GCN-O1-OPTS-NEXT: Machine Optimization Remark Emitter
; GCN-O1-OPTS-NEXT: Stack Frame Layout Analysis
-; GCN-O1-OPTS-NEXT: Remove Loads Into Fake Uses
; GCN-O1-OPTS-NEXT: Function register usage analysis
; GCN-O1-OPTS-NEXT: FunctionPass Manager
; GCN-O1-OPTS-NEXT: Lazy Machine Block Frequency Analysis
@@ -1039,12 +1039,12 @@
; GCN-O2-NEXT: AMDGPU Insert Delay ALU
; GCN-O2-NEXT: Branch relaxation pass
; GCN-O2-NEXT: Register Usage Information Collector Pass
+; GCN-O2-NEXT: Remove Loads Into Fake Uses
; GCN-O2-NEXT: Live DEBUG_VALUE analysis
; GCN-O2-NEXT: Machine Sanitizer Binary Metadata
; GCN-O2-NEXT: Lazy Machine Block Frequency Analysis
; GCN-O2-NEXT: Machine Optimization Remark Emitter
; GCN-O2-NEXT: Stack Frame Layout Analysis
-; GCN-O2-NEXT: Remove Loads Into Fake Uses
; GCN-O2-NEXT: Function register usage analysis
; GCN-O2-NEXT: FunctionPass Manager
; GCN-O2-NEXT: Lazy Machine Block Frequency Analysis
@@ -1363,12 +1363,12 @@
; GCN-O3-NEXT: AMDGPU Insert Delay ALU
; GCN-O3-NEXT: Branch relaxation pass
; GCN-O3-NEXT: Register Usage Information Collector Pass
+; GCN-O3-NEXT: Remove Loads Into Fake Uses
; GCN-O3-NEXT: Live DEBUG_VALUE analysis
; GCN-O3-NEXT: Machine Sanitizer Binary Metadata
; GCN-O3-NEXT: Lazy Machine Block Frequency Analysis
; GCN-O3-NEXT: Machine Optimization Remark Emitter
; GCN-O3-NEXT: Stack Frame Layout Analysis
-; GCN-O3-NEXT: Remove Loads Into Fake Uses
; GCN-O3-NEXT: Function register usage analysis
; GCN-O3-NEXT: FunctionPass Manager
; GCN-O3-NEXT: Lazy Machine Block Frequency Analysis
diff --git a/llvm/test/CodeGen/ARM/O3-pipeline.ll b/llvm/test/CodeGen/ARM/O3-pipeline.ll
index 96a13d1b50c9a1..819623d3fcc5a3 100644
--- a/llvm/test/CodeGen/ARM/O3-pipeline.ll
+++ b/llvm/test/CodeGen/ARM/O3-pipeline.ll
@@ -193,6 +193,7 @@
; CHECK-NEXT: ARM block placement
; CHECK-NEXT: optimise barriers pass
; CHECK-NEXT: Contiguously Lay Out Funclets
+; CHECK-NEXT: Remove Loads Into Fake Uses
; CHECK-NEXT: StackMap Liveness Analysis
; CHECK-NEXT: Live DEBUG_VALUE analysis
; CHECK-NEXT: Machine Sanitizer Binary Metadata
@@ -201,7 +202,6 @@
; CHECK-NEXT: Lazy Machine Block Frequency Analysis
; CHECK-NEXT: Machine Optimization Remark Emitter
; CHECK-NEXT: Stack Frame Layout Analysis
-; CHECK-NEXT: Remove Loads Into Fake Uses
; CHECK-NEXT: ReachingDefAnalysis
; CHECK-NEXT: ARM fix for Cortex-A57 AES Erratum 1742098
; CHECK-NEXT: ARM Branch Targets
diff --git a/llvm/test/CodeGen/LoongArch/O0-pipeline.ll b/llvm/test/CodeGen/LoongArch/O0-pipeline.ll
index 6cae56c5c3a18b..24bd4c75a9821e 100644
--- a/llvm/test/CodeGen/LoongArch/O0-pipeline.ll
+++ b/llvm/test/CodeGen/LoongArch/O0-pipeline.ll
@@ -62,13 +62,13 @@
; CHECK-NEXT: Implement the 'patchable-function' attribute
; CHECK-NEXT: Branch relaxation pass
; CHECK-NEXT: Contiguously Lay Out Funclets
+; CHECK-NEXT: Remove Loads Into Fake Uses
; CHECK-NEXT: StackMap Liveness Analysis
; CHECK-NEXT: Live DEBUG_VALUE analysis
; CHECK-NEXT: Machine Sanitizer Binary Metadata
; CHECK-NEXT: Lazy Machine Block Frequency Analysis
; CHECK-NEXT: Machine Optimization Remark Emitter
; CHECK-NEXT: Stack Frame Layout Analysis
-; CHECK-NEXT: Remove Loads Into Fake Uses
; CHECK-NEXT: LoongArch pseudo instruction expansion pass
; CHECK-NEXT: LoongArch atomic pseudo instruction expansion pass
; CHECK-NEXT: Lazy Machine Block Frequency Analysis
diff --git a/llvm/test/CodeGen/LoongArch/opt-pipeline.ll b/llvm/test/CodeGen/LoongArch/opt-pipeline.ll
index 7f48c4f4c40e2d..53cdbd18f9b907 100644
--- a/llvm/test/CodeGen/LoongArch/opt-pipeline.ll
+++ b/llvm/test/CodeGen/LoongArch/opt-pipeline.ll
@@ -166,13 +166,13 @@
; LAXX-NEXT: Implement the 'patchable-function' attribute
; LAXX-NEXT: Branch relaxation pass
; LAXX-NEXT: Contiguously Lay Out Funclets
+; LAXX-NEXT: Remove Loads Into Fake Uses
; LAXX-NEXT: StackMap Liveness Analysis
; LAXX-NEXT: Live DEBUG_VALUE analysis
; LAXX-NEXT: Machine Sanitizer Binary Metadata
; LAXX-NEXT: Lazy Machine Block Frequency Analysis
; LAXX-NEXT: Machine Optimization Remark Emitter
; LAXX-NEXT: Stack Frame Layout Analysis
-; LAXX-NEXT: Remove Loads Into Fake Uses
; LAXX-NEXT: LoongArch pseudo instruction expansion pass
; LAXX-NEXT: LoongArch atomic pseudo instruction expansion pass
; LAXX-NEXT: Lazy Machine Block Frequency Analysis
diff --git a/llvm/test/CodeGen/PowerPC/O0-pipeline.ll b/llvm/test/CodeGen/PowerPC/O0-pipeline.ll
index 9238586639229d..5853647bf3b9f3 100644
--- a/llvm/test/CodeGen/PowerPC/O0-pipeline.ll
+++ b/llvm/test/CodeGen/PowerPC/O0-pipeline.ll
@@ -60,13 +60,13 @@
; CHECK-NEXT: Implement the 'patchable-function' attribute
; CHECK-NEXT: PowerPC Pre-Emit Peephole
; CHECK-NEXT: Contiguously Lay Out Funclets
+; CHECK-NEXT: Remove Loads Into Fake Uses
; CHECK-NEXT: StackMap Liveness Analysis
; CHECK-NEXT: Live DEBUG_VALUE analysis
; CHECK-NEXT: Machine Sanitizer Binary Metadata
; CHECK-NEXT: Lazy Machine Block Frequency Analysis
; CHECK-NEXT: Machine Optimization Remark Emitter
; CHECK-NEXT: Stack Frame Layout Analysis
-; CHECK-NEXT: Remove Loads Into Fake Uses
; CHECK-NEXT: PowerPC Expand Atomic
; CHECK-NEXT: PowerPC Branch Selector
; CHECK-NEXT: Lazy Machine Block Frequency Analysis
diff --git a/llvm/test/CodeGen/PowerPC/O3-pipeline.ll b/llvm/test/CodeGen/PowerPC/O3-pipeline.ll
index 195c015e7bd428..21bd4bb8502c3d 100644
--- a/llvm/test/CodeGen/PowerPC/O3-pipeline.ll
+++ b/llvm/test/CodeGen/PowerPC/O3-pipeline.ll
@@ -214,13 +214,13 @@
; CHECK-NEXT: PowerPC Pre-Emit Peephole
; CHECK-NEXT: PowerPC Early-Return Creation
; CHECK-NEXT: Contiguously Lay Out Funclets
+; CHECK-NEXT: Remove Loads Into Fake Uses
; CHECK-NEXT: StackMap Liveness Analysis
; CHECK-NEXT: Live DEBUG_VALUE analysis
; CHECK-NEXT: Machine Sanitizer Binary Metadata
; CHECK-NEXT: Lazy Machine Block Frequency Analysis
; CHECK-NEXT: Machine Optimization Remark Emitter
; CHECK-NEXT: Stack Frame Layout Analysis
-; CHECK-NEXT: Remove Loads Into Fake Uses
; CHECK-NEXT: PowerPC Expand Atomic
; CHECK-NEXT: PowerPC Branch Selector
; CHECK-NEXT: Lazy Machine Block Frequency Analysis
diff --git a/llvm/test/CodeGen/RISCV/O0-pipeline.ll b/llvm/test/CodeGen/RISCV/O0-pipeline.ll
index d73c9c46818747..84c7f3f987c065 100644
--- a/llvm/test/CodeGen/RISCV/O0-pipeline.ll
+++ b/llvm/test/CodeGen/RISCV/O0-pipeline.ll
@@ -63,13 +63,13 @@
; CHECK-NEXT: Branch relaxation pass
; CHECK-NEXT: RISC-V Make Compressible
; CHECK-NEXT: Contiguously Lay Out Funclets
+; CHECK-NEXT: Remove Loads Into Fake Uses
; CHECK-NEXT: StackMap Liveness Analysis
; CHECK-NEXT: Live DEBUG_VALUE analysis
; CHECK-NEXT: Machine Sanitizer Binary Metadata
; CHECK-NEXT: Lazy Machine Block Frequency Analysis
; CHECK-NEXT: Machine Optimization Remark Emitter
; CHECK-NEXT: Stack Frame Layout Analysis
-; CHECK-NEXT: Remove Loads Into Fake Uses
; CHECK-NEXT: RISC-V Indirect Branch Tracking
; CHECK-NEXT: RISC-V pseudo instruction expansion pass
; CHECK-NEXT: RISC-V atomic pseudo instruction expansion pass
diff --git a/llvm/test/CodeGen/RISCV/O3-pipeline.ll b/llvm/test/CodeGen/RISCV/O3-pipeline.ll
index 6eb7f92983d2bf..5d14d14d216244 100644
--- a/llvm/test/CodeGen/RISCV/O3-pipeline.ll
+++ b/llvm/test/CodeGen/RISCV/O3-pipeline.ll
@@ -189,6 +189,7 @@
; CHECK-NEXT: Branch relaxation pass
; CHECK-NEXT: RISC-V Make Compressible
; CHECK-NEXT: Contiguously Lay Out Funclets
+; CHECK-NEXT: Remove Loads Into Fake Uses
; CHECK-NEXT: StackMap Liveness Analysis
; CHECK-NEXT: Live DEBUG_VALUE analysis
; CHECK-NEXT: Machine Sanitizer Binary Metadata
@@ -197,7 +198,6 @@
; CHECK-NEXT: Lazy Machine Block Frequency Analysis
; CHECK-NEXT: Machine Optimization Remark Emitter
; CHECK-NEXT: Stack Frame Layout Analysis
-; CHECK-NEXT: Remove Loads Into Fake Uses
; CHECK-NEXT: RISC-V Zcmp move merging pass
; CHECK-NEXT: RISC-V Zcmp Push/Pop optimization pass
; CHECK-NEXT: RISC-V Indirect Branch Tracking
diff --git a/llvm/test/CodeGen/X86/O0-pipeline.ll b/llvm/test/CodeGen/X86/O0-pipeline.ll
index 2bb320c1f30bad..4c99dd830b442e 100644
--- a/llvm/test/CodeGen/X86/O0-pipeline.ll
+++ b/llvm/test/CodeGen/X86/O0-pipeline.ll
@@ -71,13 +71,13 @@
; CHECK-NEXT: X86 Insert Cache Prefetches
; CHECK-NEXT: X86 insert wait instruction
; CHECK-NEXT: Contiguously Lay Out Funclets
+; CHECK-NEXT: Remove Loads Into Fake Uses
; CHECK-NEXT: StackMap Liveness Analysis
; CHECK-NEXT: Live DEBUG_VALUE analysis
; CHECK-NEXT: Machine Sanitizer Binary Metadata
; CHECK-NEXT: Lazy Machine Block Frequency Analysis
; CHECK-NEXT: Machine Optimization Remark Emitter
; CHECK-NEXT: Stack Frame Layout Analysis
-; CHECK-NEXT: Remove Loads Into Fake Uses
; CHECK-NEXT: X86 Speculative Execution Side Effect Suppression
; CHECK-NEXT: X86 Indirect Thunks
; CHECK-NEXT: X86 Return Thunks
diff --git a/llvm/test/CodeGen/X86/fake-use-vector.ll b/llvm/test/CodeGen/X86/fake-use-vector.ll
index f320ee5e6d766b..cb46ccc8cac11c 100644
--- a/llvm/test/CodeGen/X86/fake-use-vector.ll
+++ b/llvm/test/CodeGen/X86/fake-use-vector.ll
@@ -23,15 +23,15 @@ target triple = "x86_64-unknown-unknown"
define <4 x float> @_Z3runDv4_fDv1_x(<4 x float> %r, i64 %b.coerce) local_unnamed_addr #0 {
entry:
%0 = insertelement <1 x i64> undef, i64 %b.coerce, i32 0
- %1 = bitcast i64 %b.coerce to x86_mmx
- %2 = tail call <4 x float> @llvm.x86.sse.cvtpi2ps(<4 x float> %r, x86_mmx %1)
+ %1 = bitcast i64 %b.coerce to <1 x i64>
+ %2 = tail call <4 x float> @llvm.x86.sse.cvtpi2ps(<4 x float> %r, <1 x i64> %1)
tail call void (...) @llvm.fake.use(<1 x i64> %0)
tail call void (...) @llvm.fake.use(<4 x float> %r)
ret <4 x float> %2
}
; Function Attrs: nounwind readnone
-declare <4 x float> @llvm.x86.sse.cvtpi2ps(<4 x float>, x86_mmx)
+declare <4 x float> @llvm.x86.sse.cvtpi2ps(<4 x float>, <1 x i64>)
; Function Attrs: nounwind
declare void @llvm.fake.use(...)
diff --git a/llvm/test/CodeGen/X86/opt-pipeline.ll b/llvm/test/CodeGen/X86/opt-pipeline.ll
index 94e9c783a022ba..545640b7661691 100644
--- a/llvm/test/CodeGen/X86/opt-pipeline.ll
+++ b/llvm/test/CodeGen/X86/opt-pipeline.ll
@@ -211,13 +211,13 @@
; CHECK-NEXT: X86 Insert Cache Prefetches
; CHECK-NEXT: X86 insert wait instruction
; CHECK-NEXT: Contiguously Lay Out Funclets
+; CHECK-NEXT: Remove Loads Into Fake Uses
; CHECK-NEXT: StackMap Liveness Analysis
; CHECK-NEXT: Live DEBUG_VALUE analysis
; CHECK-NEXT: Machine Sanitizer Binary Metadata
; CHECK-NEXT: Lazy Machine Block Frequency Analysis
; CHECK-NEXT: Machine Optimization Remark Emitter
; CHECK-NEXT: Stack Frame Layout Analysis
-; CHECK-NEXT: Remove Loads Into Fake Uses
; CHECK-NEXT: X86 Speculative Execution Side Effect Suppression
; CHECK-NEXT: X86 Indirect Thunks
; CHECK-NEXT: X86 Return Thunks
diff --git a/llvm/test/DebugInfo/X86/fake-use.ll b/llvm/test/DebugInfo/X86/fake-use.ll
index 9dbe2a4357f110..e1357d5bfb1364 100644
--- a/llvm/test/DebugInfo/X86/fake-use.ll
+++ b/llvm/test/DebugInfo/X86/fake-use.ll
@@ -41,7 +41,7 @@ source_filename = "t2.c"
; Function Attrs: nounwind sspstrong uwtable
define i32 @foo(i32 %b, i32 %i) local_unnamed_addr optdebug !dbg !13 {
entry:
- tail call void @llvm.dbg.value(metadata i32 %b, i64 0, metadata !17, metadata !20), !dbg !21
+ #dbg_value(i32 %b, !17, !20, !21)
%c = add i32 %b, 42
%tobool = icmp sgt i32 %c, 2, !dbg !27
tail call void asm sideeffect "", "~{rax},~{rbx},~{rcx},~{rdx},~{rsi},~{rdi},~{rbp},~{r8},~{r9},~{r10},~{r11},~{r12},~{r13},~{r14},~{r15},~{dirflag},~{fpsr},~{flags}"()
diff --git a/llvm/test/Transforms/SROA/fake-use-escape.ll b/llvm/test/Transforms/SROA/fake-use-escape.ll
index c8f83186e0b43c..5429d09740e522 100644
--- a/llvm/test/Transforms/SROA/fake-use-escape.ll
+++ b/llvm/test/Transforms/SROA/fake-use-escape.ll
@@ -4,19 +4,17 @@
;; uses the address of bar.
;
; CHECK: define{{.*}}foo
-; CHECK: call{{.*llvm\.fake\.use.*}}%bar.addr
+; CHECK: call{{.*llvm\.fake\.use.*}}(ptr %bar.addr)
-define void @_Z3fooPi(i32* %bar) {
+define void @_Z3fooPi(ptr %bar) {
entry:
- %bar.addr = alloca i32*, align 8
- %baz = alloca i8**, align 8
- store i32* %bar, i32** %bar.addr, align 8
- %0 = bitcast i32** %bar.addr to i8**
- store i8** %0, i8*** %baz, align 8
- %1 = load i32*, i32** %bar.addr, align 8
- call void (...) @llvm.fake.use(i32* %1)
- %2 = load i8**, i8*** %baz, align 8
- call void (...) @llvm.fake.use(i8** %2)
+ %bar.addr = alloca ptr, align 8
+ %baz = alloca ptr, align 8
+ store ptr %bar, ptr %bar.addr, align 8
+ store ptr %bar.addr, ptr %baz, align 8
+ %0 = load ptr, ptr %bar.addr, align 8
+ %1 = load ptr, ptr %baz, align 8
+ call void (...) @llvm.fake.use(ptr %1)
ret void
}
From 56f3f55507d11c0645f6d3aa9c4cdf32fd1388fd Mon Sep 17 00:00:00 2001
From: Stephen Tozer <stephen.tozer at sony.com>
Date: Wed, 28 Aug 2024 12:07:37 +0100
Subject: [PATCH 18/20] Correctly append to CallInsts
---
llvm/lib/CodeGen/CodeGenPrepare.cpp | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)
diff --git a/llvm/lib/CodeGen/CodeGenPrepare.cpp b/llvm/lib/CodeGen/CodeGenPrepare.cpp
index 74169fb885c93c..271a047fc6a7b8 100644
--- a/llvm/lib/CodeGen/CodeGenPrepare.cpp
+++ b/llvm/lib/CodeGen/CodeGenPrepare.cpp
@@ -2836,6 +2836,8 @@ bool CodeGenPrepare::dupRetToEnableTailCallOpts(BasicBlock *BB,
/// call.
const Function *F = BB->getParent();
SmallVector<BasicBlock *, 4> TailCallBBs;
+ // Record the call instructions so we can insert any fake uses
+ // that need to be preserved before them.
SmallVector<CallInst *, 4> CallInsts;
if (PN) {
for (unsigned I = 0, E = PN->getNumIncomingValues(); I != E; ++I) {
@@ -2848,6 +2850,7 @@ bool CodeGenPrepare::dupRetToEnableTailCallOpts(BasicBlock *BB,
TLI->mayBeEmittedAsTailCall(CI) &&
attributesPermitTailCall(F, CI, RetI, *TLI)) {
TailCallBBs.push_back(PredBB);
+ CallInsts.push_back(CI);
} else {
// Consider the cases in which the phi value is indirectly produced by
// the tail call, for example when encountering memset(), memmove(),
@@ -2867,12 +2870,11 @@ bool CodeGenPrepare::dupRetToEnableTailCallOpts(BasicBlock *BB,
isIntrinsicOrLFToBeTailCalled(TLInfo, CI) &&
IncomingVal == CI->getArgOperand(0) &&
TLI->mayBeEmittedAsTailCall(CI) &&
- attributesPermitTailCall(F, CI, RetI, *TLI))
+ attributesPermitTailCall(F, CI, RetI, *TLI)) {
TailCallBBs.push_back(PredBB);
+ CallInsts.push_back(CI);
+ }
}
- // Record the call instruction so we can insert any fake uses
- // that need to be preserved before it.
- CallInsts.push_back(CI);
}
} else {
SmallPtrSet<BasicBlock *, 4> VisitedBBs;
@@ -2889,8 +2891,6 @@ bool CodeGenPrepare::dupRetToEnableTailCallOpts(BasicBlock *BB,
(isIntrinsicOrLFToBeTailCalled(TLInfo, CI) &&
V == CI->getArgOperand(0))) {
TailCallBBs.push_back(Pred);
- // Record the call instruction so we can insert any fake uses
- // that need to be preserved before it.
CallInsts.push_back(CI);
}
}
From b7df73b7852233a6344b6940c635e1943f42674e Mon Sep 17 00:00:00 2001
From: Stephen Tozer <stephen.tozer at sony.com>
Date: Wed, 28 Aug 2024 16:49:36 +0100
Subject: [PATCH 19/20] Fix fake-use tests
---
.../DebugInfo/AArch64/fake-use-global-isel.ll | 98 +++++++++++++++++++
.../{X86 => }/Inputs/check-fake-use.py | 7 +-
llvm/test/DebugInfo/X86/fake-use.ll | 17 +---
3 files changed, 108 insertions(+), 14 deletions(-)
create mode 100644 llvm/test/DebugInfo/AArch64/fake-use-global-isel.ll
rename llvm/test/DebugInfo/{X86 => }/Inputs/check-fake-use.py (92%)
diff --git a/llvm/test/DebugInfo/AArch64/fake-use-global-isel.ll b/llvm/test/DebugInfo/AArch64/fake-use-global-isel.ll
new file mode 100644
index 00000000000000..65a64583096959
--- /dev/null
+++ b/llvm/test/DebugInfo/AArch64/fake-use-global-isel.ll
@@ -0,0 +1,98 @@
+; REQUIRES: object-emission
+
+; Make sure the fake use of 'b' at the end of 'foo' causes location information for 'b'
+; to extend all the way to the end of the function.
+; Duplicates `DebugInfo/X86/fake-use.ll` for global-isel.
+
+; RUN: %llc_dwarf -O2 --global-isel=1 -mtriple=aarch64--linux-gnu -filetype=obj -dwarf-linkage-names=Abstract < %s | llvm-dwarfdump --debug-info --debug-line -v - -o %t
+; RUN: %python %p/../Inputs/check-fake-use.py %t
+; RUN: sed -e 's,call void (...) @llvm.fake.use,;,' %s \
+; RUN: | %llc_dwarf - -O2 --global-isel=1 -mtriple=aarch64--linux-gnu -filetype=obj -dwarf-linkage-names=Abstract \
+; RUN: | llvm-dwarfdump --debug-info --debug-line -v - -o %t
+; RUN: not %python %p/../Inputs/check-fake-use.py %t
+
+; Generated with:
+; clang -O2 -g -S -emit-llvm -fextend-this-ptr fake-use.c
+;
+; int glob[10];
+; extern void bar();
+;
+; int foo(int b, int i)
+; {
+; int loc = glob[i] * 2;
+; if (b) {
+; glob[2] = loc;
+; bar();
+; }
+; return loc;
+; }
+;
+; ModuleID = 't2.c'
+source_filename = "t2.c"
+
+@glob = common local_unnamed_addr global [10 x i32] zeroinitializer, align 16, !dbg !0
+
+; Function Attrs: nounwind sspstrong uwtable
+define i32 @foo(i32 %b, i32 %i) local_unnamed_addr optdebug !dbg !13 {
+entry:
+ #dbg_value(i32 %b, !17, !20, !21)
+ %c = add i32 %b, 42
+ %tobool = icmp sgt i32 %c, 2, !dbg !27
+ tail call void (...) @bar() #2, !dbg !32
+ %idxprom = sext i32 %i to i64, !dbg !22
+ %arrayidx = getelementptr inbounds [10 x i32], [10 x i32]* @glob, i64 0, i64 %idxprom, !dbg !22
+ %0 = load i32, i32* %arrayidx, align 4, !dbg !22, !tbaa !23
+ %mul = shl nsw i32 %0, 1, !dbg !22
+ br i1 %tobool, label %if.end, label %if.then, !dbg !29
+
+if.then: ; preds = %entry
+ store i32 %mul, i32* getelementptr inbounds ([10 x i32], [10 x i32]* @glob, i64 0, i64 2), align 8, !dbg !30, !tbaa !23
+ tail call void (...) @bar() #2, !dbg !32
+ br label %if.end, !dbg !33
+
+if.end: ; preds = %entry, %if.then
+ call void (...) @llvm.fake.use(i32 %b), !dbg !34
+ ret i32 %mul, !dbg !35
+}
+
+declare void @bar(...) local_unnamed_addr
+
+!llvm.dbg.cu = !{!1}
+!llvm.module.flags = !{!9, !10, !11}
+!llvm.ident = !{!12}
+
+!0 = distinct !DIGlobalVariableExpression(var: !DIGlobalVariable(name: "glob", scope: !1, file: !2, line: 1, type: !5, isLocal: false, isDefinition: true), expr: !DIExpression())
+!1 = distinct !DICompileUnit(language: DW_LANG_C99, file: !2, producer: "clang version 4.0.0", isOptimized: true, runtimeVersion: 0, emissionKind: FullDebug, enums: !3, globals: !4)
+!2 = !DIFile(filename: "t2.c", directory: "/")
+!3 = !{}
+!4 = !{!0}
+!5 = !DICompositeType(tag: DW_TAG_array_type, baseType: !6, size: 320, align: 32, elements: !7)
+!6 = !DIBasicType(name: "int", size: 32, align: 32, encoding: DW_ATE_signed)
+!7 = !{!8}
+!8 = !DISubrange(count: 10)
+!9 = !{i32 2, !"Dwarf Version", i32 4}
+!10 = !{i32 2, !"Debug Info Version", i32 3}
+!11 = !{i32 1, !"PIC Level", i32 2}
+!12 = !{!"clang version 4.0.0"}
+!13 = distinct !DISubprogram(name: "foo", scope: !2, file: !2, line: 4, type: !14, isLocal: false, isDefinition: true, scopeLine: 5, flags: DIFlagPrototyped, isOptimized: true, unit: !1, retainedNodes: !16)
+!14 = !DISubroutineType(types: !15)
+!15 = !{!6, !6, !6}
+!16 = !{!17, !19}
+!17 = !DILocalVariable(name: "b", arg: 1, scope: !13, file: !2, line: 4, type: !6)
+!19 = !DILocalVariable(name: "loc", scope: !13, file: !2, line: 6, type: !6)
+!20 = !DIExpression()
+!21 = !DILocation(line: 4, scope: !13)
+!22 = !DILocation(line: 6, scope: !13)
+!23 = !{!24, !24, i64 0}
+!24 = !{!"int", !25, i64 0}
+!25 = !{!"omnipotent char", !26, i64 0}
+!26 = !{!"Simple C/C++ TBAA"}
+!27 = !DILocation(line: 7, scope: !28)
+!28 = distinct !DILexicalBlock(scope: !13, file: !2, line: 7)
+!29 = !DILocation(line: 7, scope: !13)
+!30 = !DILocation(line: 8, scope: !31)
+!31 = distinct !DILexicalBlock(scope: !28, file: !2, line: 7)
+!32 = !DILocation(line: 9, scope: !31)
+!33 = !DILocation(line: 10, scope: !31)
+!34 = !DILocation(line: 12, scope: !13)
+!35 = !DILocation(line: 11, scope: !13)
diff --git a/llvm/test/DebugInfo/X86/Inputs/check-fake-use.py b/llvm/test/DebugInfo/Inputs/check-fake-use.py
similarity index 92%
rename from llvm/test/DebugInfo/X86/Inputs/check-fake-use.py
rename to llvm/test/DebugInfo/Inputs/check-fake-use.py
index d57d02281fe331..7797e102419b24 100644
--- a/llvm/test/DebugInfo/X86/Inputs/check-fake-use.py
+++ b/llvm/test/DebugInfo/Inputs/check-fake-use.py
@@ -14,7 +14,7 @@
EpiloguePattern = r"^\s*0x([0-9a-f]+)\s.+epilogue_begin"
FormalPattern = r"^0x[0-9a-f]+:\s+DW_TAG_formal_parameter"
LocationPattern = r"DW_AT_location\s+\[DW_FORM_([a-z_]+)\](?:.*0x([a-f0-9]+))"
-DebugLocPattern = r'\[0x([a-f0-9]+),\s+0x([a-f0-9]+)\) ".text":'
+DebugLocPattern = r'\[0x([a-f0-9]+),\s+0x([a-f0-9]+)\) ".text": (.+)$'
SeenDebugInfo = False
SeenDebugLine = False
@@ -64,10 +64,15 @@
) is not None:
match_loc_start = int(debug_loc_match.group(1), 16)
match_loc_end = int(debug_loc_match.group(2), 16)
+ match_expr = debug_loc_match.group(3)
if match_loc_start != LocationRanges[1]:
raise RuntimeError(
f"Location list for 'b' is discontinuous from [0x{LocationRanges[1]:x}, 0x{match_loc_start:x})"
)
+ if "stack_value" in match_expr:
+ raise RuntimeError(
+ f"Location list for 'b' contains a stack_value expression: {match_expr}"
+ )
LocationRanges = (LocationRanges[0], match_loc_end)
# Get the prologue_end address.
elif PrologueEnd is None:
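For context on the hunk above: the patch extends `DebugLocPattern` with a third capture group so the checker can inspect the DWARF expression text of each location-list entry and reject `stack_value` locations. A minimal standalone sketch of how the updated regex behaves (the sample llvm-dwarfdump lines below are hypothetical, for illustration only):

```python
import re

# Pattern from check-fake-use.py after this patch; group 3 captures the
# DWARF expression text following the address range.
DebugLocPattern = r'\[0x([a-f0-9]+),\s+0x([a-f0-9]+)\) ".text": (.+)$'

# Hypothetical llvm-dwarfdump location-list lines.
good = '[0x0000000000000000, 0x0000000000000004) ".text": DW_OP_reg5 RDI'
bad = '[0x0000000000000004, 0x0000000000000008) ".text": DW_OP_breg7 RSP-8, DW_OP_stack_value'

m = re.search(DebugLocPattern, good)
start, end, expr = int(m.group(1), 16), int(m.group(2), 16), m.group(3)
assert (start, end) == (0x0, 0x4) and "stack_value" not in expr

m = re.search(DebugLocPattern, bad)
# "stack_value" in the captured expression is what triggers the new
# RuntimeError in the patched script.
assert "stack_value" in m.group(3)
```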
diff --git a/llvm/test/DebugInfo/X86/fake-use.ll b/llvm/test/DebugInfo/X86/fake-use.ll
index e1357d5bfb1364..f44aadfeef5640 100644
--- a/llvm/test/DebugInfo/X86/fake-use.ll
+++ b/llvm/test/DebugInfo/X86/fake-use.ll
@@ -2,20 +2,11 @@
; Make sure the fake use of 'b' at the end of 'foo' causes location information for 'b'
; to extend all the way to the end of the function.
-; FIXME: We need to use `-debugger-tune=sce` to disable entry values, since otherwise
-; %b will be preserved as one; there doesn't seem to be another way to disable entry
-; values (--debug-entry-values is only used to force-enable them, not disable them).
-; RUN: %llc_dwarf -O2 -debugger-tune=sce -filetype=obj -dwarf-linkage-names=Abstract < %s | llvm-dwarfdump --debug-info --debug-line -v - -o %t
-; RUN: %python %p/Inputs/check-fake-use.py %t
-; RUN: %llc_dwarf -O2 -debugger-tune=sce --global-isel=1 -mtriple=aarch64--linux-gnu -filetype=obj -dwarf-linkage-names=Abstract < %s | llvm-dwarfdump --debug-info --debug-line -v - -o %t
-; RUN: %python %p/Inputs/check-fake-use.py %t
-; RUN: sed -e 's,call void (...) @llvm.fake.use,;,' %s | %llc_dwarf - -O2 -debugger-tune=sce -filetype=obj -dwarf-linkage-names=Abstract | llvm-dwarfdump --debug-info --debug-line -v - -o %t
-; RUN: not %python %p/Inputs/check-fake-use.py %t
-; RUN: sed -e 's,call void (...) @llvm.fake.use,;,' %s \
-; RUN: | %llc_dwarf - -O2 -debugger-tune=sce --global-isel=1 -mtriple=aarch64--linux-gnu -filetype=obj -dwarf-linkage-names=Abstract \
-; RUN: | llvm-dwarfdump --debug-info --debug-line -v - -o %t
-; RUN: not %python %p/Inputs/check-fake-use.py %t
+; RUN: %llc_dwarf -O2 -filetype=obj -dwarf-linkage-names=Abstract < %s | llvm-dwarfdump --debug-info --debug-line -v - -o %t
+; RUN: %python %p/../Inputs/check-fake-use.py %t
+; RUN: sed -e 's,call void (...) @llvm.fake.use,;,' %s | %llc_dwarf - -O2 -filetype=obj -dwarf-linkage-names=Abstract | llvm-dwarfdump --debug-info --debug-line -v - -o %t
+; RUN: not %python %p/../Inputs/check-fake-use.py %t
; Generated with:
; clang -O2 -g -S -emit-llvm -fextend-this-ptr fake-use.c
>From 2bca9559b0d9ae4cd6ef0cbf269ad830473bd4d7 Mon Sep 17 00:00:00 2001
From: Stephen Tozer <stephen.tozer at sony.com>
Date: Thu, 29 Aug 2024 11:14:43 +0100
Subject: [PATCH 20/20] clang-format
---
llvm/lib/CodeGen/MachineCSE.cpp | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/llvm/lib/CodeGen/MachineCSE.cpp b/llvm/lib/CodeGen/MachineCSE.cpp
index 5ce947e807cf9d..aadc54b495fe22 100644
--- a/llvm/lib/CodeGen/MachineCSE.cpp
+++ b/llvm/lib/CodeGen/MachineCSE.cpp
@@ -406,7 +406,8 @@ bool MachineCSE::PhysRegDefsReach(MachineInstr *CSMI, MachineInstr *MI,
bool MachineCSE::isCSECandidate(MachineInstr *MI) {
if (MI->isPosition() || MI->isPHI() || MI->isImplicitDef() || MI->isKill() ||
- MI->isInlineAsm() || MI->isDebugInstr() || MI->isJumpTableDebugInfo() || MI->isFakeUse())
+ MI->isInlineAsm() || MI->isDebugInstr() || MI->isJumpTableDebugInfo() ||
+ MI->isFakeUse())
return false;
// Ignore copies.