[PATCH] R600/SI: Insert more NOPs after READLANE on VI, don't use NOPs on CI

Marek Olšák maraeo at gmail.com
Sat Mar 21 04:22:07 PDT 2015


Ping

On Mon, Mar 16, 2015 at 11:56 PM, Marek Olšák <maraeo at gmail.com> wrote:
> From: Marek Olšák <marek.olsak at amd.com>
>
> This is a candidate for stable.
> ---
>  lib/Target/R600/SIRegisterInfo.cpp | 17 ++++++++++++++++-
>  1 file changed, 16 insertions(+), 1 deletion(-)
>
> diff --git a/lib/Target/R600/SIRegisterInfo.cpp b/lib/Target/R600/SIRegisterInfo.cpp
> index 6030ce8..13a8974 100644
> --- a/lib/Target/R600/SIRegisterInfo.cpp
> +++ b/lib/Target/R600/SIRegisterInfo.cpp
> @@ -268,7 +268,22 @@ void SIRegisterInfo::eliminateFrameIndex(MachineBasicBlock::iterator MI,
>                    .addReg(SubReg);
>          }
>        }
> -      TII->insertNOPs(MI, 3);
> +
> +      // TODO: only do this when it is needed
> +      switch (MF->getSubtarget<AMDGPUSubtarget>().getGeneration()) {
> +      case AMDGPUSubtarget::SOUTHERN_ISLANDS:
> +        // "VALU writes SGPR -> SMRD reads that SGPR" needs "S_NOP 3" on SI
> +        TII->insertNOPs(MI, 3);
> +        break;
> +      case AMDGPUSubtarget::SEA_ISLANDS:
> +        break; // No NOPs are inserted on CI.
> +      default: // VOLCANIC_ISLANDS and later
> +        // "VALU writes SGPR -> VMEM reads that SGPR" needs "S_NOP 4" on VI
> +        // and later. This also applies to VALUs which write VCC, but we're
> +        // unlikely to see VMEM use VCC.
> +        TII->insertNOPs(MI, 4);
> +      }
> +
>        MI->eraseFromParent();
>        break;
>      }
> --
> 2.1.0
>
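
For anyone skimming the thread, the per-generation choice the patch makes can
be summarized as a standalone sketch. The enum and the helper name below are
hypothetical stand-ins for illustration only; they are not part of the patch
or of the backend, which switches on AMDGPUSubtarget::getGeneration() and
calls TII->insertNOPs() directly.

```cpp
// Hypothetical stand-in for the three cases the patch distinguishes.
// The real enum lives in AMDGPUSubtarget and has more members.
enum class Generation { SouthernIslands, SeaIslands, VolcanicIslandsOrLater };

// Returns the count that the patch passes to insertNOPs() after the
// READLANE sequence. The function name is made up for this sketch.
inline unsigned nopCountAfterReadlane(Generation Gen) {
  switch (Gen) {
  case Generation::SouthernIslands:
    // "VALU writes SGPR -> SMRD reads that SGPR" hazard on SI.
    return 3;
  case Generation::SeaIslands:
    // The patch inserts no NOPs on CI.
    return 0;
  default:
    // "VALU writes SGPR -> VMEM reads that SGPR" hazard on VI and later.
    return 4;
  }
}
```

A count of 0 here corresponds to the SEA_ISLANDS case, where the patch simply
breaks out of the switch without calling insertNOPs().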



