[PATCH] R600/SI: Insert more NOPs after READLANE on VI, don't use NOPs on CI
Marek Olšák
maraeo at gmail.com
Sat Mar 21 04:22:07 PDT 2015
Ping
On Mon, Mar 16, 2015 at 11:56 PM, Marek Olšák <maraeo at gmail.com> wrote:
> From: Marek Olšák <marek.olsak at amd.com>
>
> This is a candidate for stable.
> ---
> lib/Target/R600/SIRegisterInfo.cpp | 17 ++++++++++++++++-
> 1 file changed, 16 insertions(+), 1 deletion(-)
>
> diff --git a/lib/Target/R600/SIRegisterInfo.cpp b/lib/Target/R600/SIRegisterInfo.cpp
> index 6030ce8..13a8974 100644
> --- a/lib/Target/R600/SIRegisterInfo.cpp
> +++ b/lib/Target/R600/SIRegisterInfo.cpp
> @@ -268,7 +268,22 @@ void SIRegisterInfo::eliminateFrameIndex(MachineBasicBlock::iterator MI,
> .addReg(SubReg);
> }
> }
> - TII->insertNOPs(MI, 3);
> +
> + // TODO: only do this when it is needed
> + switch (MF->getSubtarget<AMDGPUSubtarget>().getGeneration()) {
> + case AMDGPUSubtarget::SOUTHERN_ISLANDS:
> + // "VALU writes SGPR" -> "SMRD reads that SGPR" needs "S_NOP 3" on SI
> + TII->insertNOPs(MI, 3);
> + break;
> + case AMDGPUSubtarget::SEA_ISLANDS:
> + break;
> + default: // VOLCANIC_ISLANDS and later
> + // "VALU writes SGPR -> VMEM reads that SGPR" needs "S_NOP 4" on VI
> + // and later. This also applies to VALUs which write VCC, but we're
> + // unlikely to see VMEM use VCC.
> + TII->insertNOPs(MI, 4);
> + }
> +
> MI->eraseFromParent();
> break;
> }
> --
> 2.1.0
>
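For readers skimming the diff, the per-generation behaviour after this patch can be summarized as a small lookup. This is only a standalone sketch, not code from the patch: the Generation enum and nopCountAfterReadlane() are hypothetical names made up for illustration, and the returned value is simply the count the patch passes to TII->insertNOPs() (0 where the patch inserts nothing).

```cpp
// Hypothetical stand-in for the subtarget generation enum; only the three
// generations discussed in the patch are modeled here.
enum class Generation { SouthernIslands, SeaIslands, VolcanicIslands };

// Returns the count the patch passes to insertNOPs() after the READLANE
// sequence for the given generation, or 0 where the patch inserts nothing.
unsigned nopCountAfterReadlane(Generation Gen) {
  switch (Gen) {
  case Generation::SouthernIslands:
    // "VALU writes SGPR" -> "SMRD reads that SGPR" hazard on SI.
    return 3;
  case Generation::SeaIslands:
    // The patch inserts no NOPs on CI.
    return 0;
  default:
    // VI and later: "VALU writes SGPR" -> "VMEM reads that SGPR" hazard.
    return 4;
  }
}
```

As in the patch, the default case deliberately covers VI and anything newer, so an unknown future generation gets the more conservative count instead of none.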