[PATCHES] R600/SI: Small VI improvements
Marek Olšák
maraeo at gmail.com
Fri Mar 6 09:31:04 PST 2015
Thanks Tom for fixing the scheduler weirdness. The attached patch only
adds the fix for the elf test.
Marek
On Fri, Mar 6, 2015 at 2:45 PM, Marek Olšák <maraeo at gmail.com> wrote:
> Well, the patch breaks a lot of tests due to some instructions being
> scheduled differently for some reason. I'm fixing the tests.
> Apparently, the reserved registers have an effect on the scheduler.
>
> Marek
>
> On Fri, Mar 6, 2015 at 1:24 PM, Marek Olšák <maraeo at gmail.com> wrote:
>> An updated patch is attached. Please review.
>>
>> Marek
>>
>> On Wed, Mar 4, 2015 at 8:25 PM, Tom Stellard <tom at stellard.net> wrote:
>>> On Wed, Mar 04, 2015 at 06:41:19PM +0100, Marek Olšák wrote:
>>>> Please review.
>>>>
>>>> I'm not sure how important the second patch is.
>>>>
>>>> Marek
>>>
>>>> From c89e3bcc8475b3519967e1b34187164a20080250 Mon Sep 17 00:00:00 2001
>>>> From: =?UTF-8?q?Marek=20Ol=C5=A1=C3=A1k?= <marek.olsak at amd.com>
>>>> Date: Wed, 4 Mar 2015 15:40:53 +0100
>>>> Subject: [PATCH 1/2] R600/SI: Limit SGPRs to 80 on Tonga and Iceland
>>>>
>>>> This is a candidate for stable.
>>>> ---
>>>> lib/Target/R600/SIRegisterInfo.cpp | 8 ++++++++
>>>> 1 file changed, 8 insertions(+)
>>>>
>>>> diff --git a/lib/Target/R600/SIRegisterInfo.cpp b/lib/Target/R600/SIRegisterInfo.cpp
>>>> index e2138d2..4b9bee3 100644
>>>> --- a/lib/Target/R600/SIRegisterInfo.cpp
>>>> +++ b/lib/Target/R600/SIRegisterInfo.cpp
>>>> @@ -47,6 +47,14 @@ BitVector SIRegisterInfo::getReservedRegs(const MachineFunction &MF) const {
>>>> Reserved.set(AMDGPU::VGPR255);
>>>> Reserved.set(AMDGPU::VGPR254);
>>>>
>>>> + // Tonga and Iceland can only allocate 80 SGPRs due to a hw bug.
>>>> + // That's 74 SGPRs if all XNACK_MASK, FLAT_SCRATCH, and VCC are used.
>>>> + // For now, assume XNACK_MASK is unused.
>>>
>>> This should be added as a subtarget feature in AMDGPU.td / AMDGPUSubtarget.h
>>> and applied to these GPUs in Processors.td
>>>
>>>> + StringRef Cpu = ST.getTargetLowering()->getTargetMachine().getTargetCPU();
>>>> + if (Cpu == "tonga" || Cpu == "iceland")
>>>> + for (int i = AMDGPU::SGPR76; i <= AMDGPU::SGPR101; i++)
>>>> + Reserved.set(i);
>>>> +
>>>
>>> You also need to reserve super registers. Something like:
>>>
>>> for (unsigned i = 76, e = AMDGPU::SGPR_32RegClass.getReg(i); i !=e; ++i) {
>>> for (MCRegAliasIterator R = MCRegAliasIterator(i, this, true); R.isValid() ++R) {
>>> Reserved.set(*R);
>>> }
>>> }
>>>
>>> You should also update AMDGPUAsmPrinter.cpp to always report at least 80 SGPRs
>>> used.
>>>
>>>> return Reserved;
>>>> }
>>>>
>>>> --
>>>> 2.1.0
>>>>
>>>
>>>> From 932eaffc444fa581008c5039eaa63a6727beb534 Mon Sep 17 00:00:00 2001
>>>> From: =?UTF-8?q?Marek=20Ol=C5=A1=C3=A1k?= <marek.olsak at amd.com>
>>>> Date: Wed, 4 Mar 2015 17:59:50 +0100
>>>> Subject: [PATCH 2/2] R600/SI: Fix getNumSGPRsAllowed for VI
>>>>
>>>
>>> This function is only used by the scheduler and is important for getting
>>> the best performance.
>>>
>>> LGTM.
>>>
>>>> ---
>>>> lib/Target/R600/SIRegisterInfo.cpp | 32 +++++++++++++++++++++-----------
>>>> lib/Target/R600/SIRegisterInfo.h | 4 +++-
>>>> 2 files changed, 24 insertions(+), 12 deletions(-)
>>>>
>>>> diff --git a/lib/Target/R600/SIRegisterInfo.cpp b/lib/Target/R600/SIRegisterInfo.cpp
>>>> index 4b9bee3..14413e9 100644
>>>> --- a/lib/Target/R600/SIRegisterInfo.cpp
>>>> +++ b/lib/Target/R600/SIRegisterInfo.cpp
>>>> @@ -14,7 +14,6 @@
>>>>
>>>>
>>>> #include "SIRegisterInfo.h"
>>>> -#include "AMDGPUSubtarget.h"
>>>> #include "SIInstrInfo.h"
>>>> #include "SIMachineFunctionInfo.h"
>>>> #include "llvm/CodeGen/MachineFrameInfo.h"
>>>> @@ -61,7 +60,8 @@ BitVector SIRegisterInfo::getReservedRegs(const MachineFunction &MF) const {
>>>> unsigned SIRegisterInfo::getRegPressureSetLimit(unsigned Idx) const {
>>>>
>>>> // FIXME: We should adjust the max number of waves based on LDS size.
>>>> - unsigned SGPRLimit = getNumSGPRsAllowed(ST.getMaxWavesPerCU());
>>>> + unsigned SGPRLimit = getNumSGPRsAllowed(ST.getGeneration(),
>>>> + ST.getMaxWavesPerCU());
>>>> unsigned VGPRLimit = getNumVGPRsAllowed(ST.getMaxWavesPerCU());
>>>>
>>>> for (regclass_iterator I = regclass_begin(), E = regclass_end();
>>>> @@ -502,14 +502,24 @@ unsigned SIRegisterInfo::getNumVGPRsAllowed(unsigned WaveCount) const {
>>>> }
>>>> }
>>>>
>>>> -unsigned SIRegisterInfo::getNumSGPRsAllowed(unsigned WaveCount) const {
>>>> - switch(WaveCount) {
>>>> - case 10: return 48;
>>>> - case 9: return 56;
>>>> - case 8: return 64;
>>>> - case 7: return 72;
>>>> - case 6: return 80;
>>>> - case 5: return 96;
>>>> - default: return 103;
>>>> +unsigned SIRegisterInfo::getNumSGPRsAllowed(AMDGPUSubtarget::Generation gen,
>>>> + unsigned WaveCount) const {
>>>> + if (gen >= AMDGPUSubtarget::VOLCANIC_ISLANDS) {
>>>> + switch (WaveCount) {
>>>> + case 10: return 80;
>>>> + case 9: return 80;
>>>> + case 8: return 96;
>>>> + default: return 102;
>>>> + }
>>>> + } else {
>>>> + switch(WaveCount) {
>>>> + case 10: return 48;
>>>> + case 9: return 56;
>>>> + case 8: return 64;
>>>> + case 7: return 72;
>>>> + case 6: return 80;
>>>> + case 5: return 96;
>>>> + default: return 103;
>>>> + }
>>>> }
>>>> }
>>>> diff --git a/lib/Target/R600/SIRegisterInfo.h b/lib/Target/R600/SIRegisterInfo.h
>>>> index d908ffd..1dfe530 100644
>>>> --- a/lib/Target/R600/SIRegisterInfo.h
>>>> +++ b/lib/Target/R600/SIRegisterInfo.h
>>>> @@ -17,6 +17,7 @@
>>>> #define LLVM_LIB_TARGET_R600_SIREGISTERINFO_H
>>>>
>>>> #include "AMDGPURegisterInfo.h"
>>>> +#include "AMDGPUSubtarget.h"
>>>> #include "llvm/Support/Debug.h"
>>>>
>>>> namespace llvm {
>>>> @@ -111,7 +112,8 @@ struct SIRegisterInfo : public AMDGPURegisterInfo {
>>>>
>>>> /// \brief Give the maximum number of SGPRs that can be used by \p WaveCount
>>>> /// concurrent waves.
>>>> - unsigned getNumSGPRsAllowed(unsigned WaveCount) const;
>>>> + unsigned getNumSGPRsAllowed(AMDGPUSubtarget::Generation gen,
>>>> + unsigned WaveCount) const;
>>>>
>>>> unsigned findUnusedRegister(const MachineRegisterInfo &MRI,
>>>> const TargetRegisterClass *RC) const;
>>>> --
>>>> 2.1.0
>>>>
>>>
>>>> _______________________________________________
>>>> llvm-commits mailing list
>>>> llvm-commits at cs.uiuc.edu
>>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>>>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-R600-SI-Limit-SGPRs-to-80-on-Tonga-and-Iceland.patch
Type: text/x-patch
Size: 7045 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20150306/a4b527b9/attachment.bin>
More information about the llvm-commits
mailing list