[PATCHES] R600/SI: VI fixes

Tue Jan 13 02:56:23 PST 2015

On Mon, Jan 12, 2015 at 10:32 PM, Tom Stellard <tom at stellard.net> wrote:
> On Mon, Dec 15, 2014 at 01:53:19PM +0100, Marek Olšák wrote:
>> An updated version is attached. I fixed the shrinking pass for opcodes
>> that don't have e32 encoding (patch 3). I'm not sure if patch 2 is a
>> good idea, I'd appreciate some opinion on that.
>>
>> The MIN3/MAX3 patch is not included here. Still don't know what aspect
>> of those opcodes should be tested. Also, I can't re-use the current
>> tests, because the store opcode that the tests use isn't implemented
>> for VI.
>>
>> Marek
>>
>> On Sun, Dec 14, 2014 at 11:02 PM, Marek Olšák <maraeo at gmail.com> wrote:
>> > The current pseudos define everything except for the encoding.
>> >
>> > The 5 attached patches replace patch 1 in the previous series.
>> > This new series unifies other VOP2 opcodes which are only available as
>> > VOP3 on VI.
>> >
>> > Marek
>> >
>> > On Sat, Dec 13, 2014 at 7:20 PM, Matt Arsenault <arsenm2 at gmail.com> wrote:
>> >>
>> >>> On Dec 13, 2014, at 5:12 AM, Marek Olšák <maraeo at gmail.com> wrote:
>> >>>
>> >>> On Fri, Dec 12, 2014 at 11:57 PM, Matt Arsenault <arsenm2 at gmail.com> wrote:
>> >>>>
>> >>>>> On Dec 12, 2014, at 4:41 PM, Marek Olšák <maraeo at gmail.com> wrote:
>> >>>>>
>> >>>>> Please review.
>> >>>>>
>> >>>>> Marek
>> >>>>
>> >>>> #1 - Why can’t this use the same encoding mapping trick the other instructions do so everything except encoding doesn’t need to see separate opcodes?
>> >>>
>> >>> The instructions are VOP2 on SI and VOP3 on VI. I can't define common
>> >>> pseudo instructions, because it's not clear whether I should set the
>> >>> VOP2 bit or the VOP3 bit or both or none.
>> >>>
>> >>> Marek
>> >>
>> >> I would expect pseudos to have neither set, but I’m not sure what the current pseudos do
>
>> From fe119184222cc6e48fc6f1726d140dd13014e009 Mon Sep 17 00:00:00 2001
>> From: =?UTF-8?q?Marek=20Ol=C5=A1=C3=A1k?= <marek.olsak at amd.com>
>> Date: Sun, 14 Dec 2014 15:11:23 +0100
>> Subject: [PATCH 1/6] R600/SI: Add common class VOPAnyCommon
>>
>
> LGTM.
>
>> From 57e9b82fd65ae1b3b537932b3ac961163ecd2bd9 Mon Sep 17 00:00:00 2001
>> From: =?UTF-8?q?Marek=20Ol=C5=A1=C3=A1k?= <marek.olsak at amd.com>
>> Date: Sun, 14 Dec 2014 16:08:20 +0100
>> Subject: [PATCH 2/6] R600/SI: Remove multiclass VOP1InstSI
>>
>> VOP1Inst also defines a VI opcode, but the predicate should prevent it from
>> getting selected (right?), so the behavior should be the same.
>
> Yes.
>
> LGTM.
>
>> From e9a3b4bf87f6e9173dae0dc01beccd0278d59393 Mon Sep 17 00:00:00 2001
>> From: =?UTF-8?q?Marek=20Ol=C5=A1=C3=A1k?= <marek.olsak at amd.com>
>> Date: Mon, 15 Dec 2014 13:09:36 +0100
>> Subject: [PATCH 3/6] R600/SI: Don't shrink instructions whose e32 encoding
>>  doesn't exist
>>
>> ---
>>  lib/Target/R600/AMDGPUMCInstLower.cpp    | 16 +++++++---------
>>  lib/Target/R600/AMDGPUMCInstLower.h      |  7 +++----
>>  lib/Target/R600/SIShrinkInstructions.cpp |  6 ++++++
>>  3 files changed, 16 insertions(+), 13 deletions(-)
>>
>> diff --git a/lib/Target/R600/AMDGPUMCInstLower.cpp b/lib/Target/R600/AMDGPUMCInstLower.cpp
>> index 2dcebf5..554c534 100644
>> --- a/lib/Target/R600/AMDGPUMCInstLower.cpp
>> +++ b/lib/Target/R600/AMDGPUMCInstLower.cpp
>> @@ -40,7 +40,7 @@ AMDGPUMCInstLower::AMDGPUMCInstLower(MCContext &ctx, const AMDGPUSubtarget &st):
>>  { }
>>
>>  enum AMDGPUMCInstLower::SISubtarget
>> -AMDGPUMCInstLower::AMDGPUSubtargetToSISubtarget(unsigned Gen) const {
>> +AMDGPUMCInstLower::AMDGPUSubtargetToSISubtarget(unsigned Gen) {
>
> Why was 'const' removed ?

The function was made static, which doesn't allow const.

>
>>    switch (Gen) {
>>    default:
>>      return AMDGPUMCInstLower::SI;
>> @@ -49,19 +49,17 @@ AMDGPUMCInstLower::AMDGPUSubtargetToSISubtarget(unsigned Gen) const {
>>    }
>>  }
>>
>> -unsigned AMDGPUMCInstLower::getMCOpcode(unsigned MIOpcode) const {
>> +int AMDGPUMCInstLower::getMCOpcode(unsigned MIOpcode, unsigned Gen) {
>
> Same question here.

The function was made static, which doesn't allow const.

>
>>
>> -  int MCOpcode = AMDGPU::getMCOpcode(MIOpcode,
>> -                              AMDGPUSubtargetToSISubtarget(ST.getGeneration()));
>> -  if (MCOpcode == -1)
>> -    MCOpcode = MIOpcode;
>> -
>> -  return MCOpcode;
>> +  return AMDGPU::getMCOpcode(MIOpcode, AMDGPUSubtargetToSISubtarget(Gen));
>>  }
>>
>>  void AMDGPUMCInstLower::lower(const MachineInstr *MI, MCInst &OutMI) const {
>>
>> -  OutMI.setOpcode(getMCOpcode(MI->getOpcode()));
>> +  int MIOpcode = MI->getOpcode();
>> +  int MCOpcode = getMCOpcode(MIOpcode, ST.getGeneration());
>> +
>> +  OutMI.setOpcode(MCOpcode != -1 ? MCOpcode : MIOpcode);
>>
>>    for (const MachineOperand &MO : MI->explicit_operands()) {
>>      MCOperand MCOp;
>> diff --git a/lib/Target/R600/AMDGPUMCInstLower.h b/lib/Target/R600/AMDGPUMCInstLower.h
>> index 0ae4d11..56dc3e3 100644
>> --- a/lib/Target/R600/AMDGPUMCInstLower.h
>> +++ b/lib/Target/R600/AMDGPUMCInstLower.h
>> @@ -31,10 +31,7 @@ class AMDGPUMCInstLower {
>>
>>    /// Convert a member of the AMDGPUSubtarget::Generation enum to the
>>    /// SISubtarget enum.
>> -  enum SISubtarget AMDGPUSubtargetToSISubtarget(unsigned Gen) const;
>> -
>> -  /// Get the MC opcode for this MachineInstr.
>> -  unsigned getMCOpcode(unsigned MIOpcode) const;
>> +  static enum SISubtarget AMDGPUSubtargetToSISubtarget(unsigned Gen);
>>
>>  public:
>>    AMDGPUMCInstLower(MCContext &ctx, const AMDGPUSubtarget &ST);
>> @@ -42,6 +39,8 @@ public:
>>    /// \brief Lower a MachineInstr to an MCInst
>>    void lower(const MachineInstr *MI, MCInst &OutMI) const;
>>
>> +  /// Get the MC opcode for this MachineInstr.
>> +  static int getMCOpcode(unsigned MIOpcode, unsigned Gen);
>>  };
>>
>>  } // End namespace llvm
>> diff --git a/lib/Target/R600/SIShrinkInstructions.cpp b/lib/Target/R600/SIShrinkInstructions.cpp
>> index e97f934..20632d2 100644
>> --- a/lib/Target/R600/SIShrinkInstructions.cpp
>> +++ b/lib/Target/R600/SIShrinkInstructions.cpp
>> @@ -10,6 +10,7 @@
>>  //
>>
>>  #include "AMDGPU.h"
>> +#include "AMDGPUMCInstLower.h"
>>  #include "AMDGPUSubtarget.h"
>>  #include "SIInstrInfo.h"
>>  #include "llvm/ADT/Statistic.h"
>> @@ -220,6 +221,11 @@ bool SIShrinkInstructions::runOnMachineFunction(MachineFunction &MF) {
>>        if (Op32 == -1)
>>          continue;
>>
>> +      // Can't shrink an instruction if the e32 version doesn't exist
>> +      // for this GPU family.
>> +      if (AMDGPUMCInstLower::getMCOpcode(Op32, TRI.ST.getGeneration()) == 65535)
>> +        continue;
>> +
>
> We should wrap AMDGPU::getVOPe32() with a new function in SIInstrInfo that includes this
> check.  Would it also be possible to move the getMCOpcode function to SIInstrInfo?

I'll check.

>
>>        if (TII->isVOPC(Op32)) {
>>          unsigned DstReg = MI.getOperand(0).getReg();
>>          if (TargetRegisterInfo::isVirtualRegister(DstReg)) {
>> --
>> 2.1.0
>>
>
>> From d1d525333b691f456bc08a98dac2be33e15771f9 Mon Sep 17 00:00:00 2001
>> From: =?UTF-8?q?Marek=20Ol=C5=A1=C3=A1k?= <marek.olsak at amd.com>
>> Date: Fri, 12 Dec 2014 11:07:30 +0100
>> Subject: [PATCH 4/6] R600/SI: Add V_READLANE_B32 and V_WRITELANE_B32 for VI
>>
>> These are VOP3-only on VI.
>>
>> The new multiclass doesn't define VOP3 versions of VOP2 instructions.
>
> LGTM.
>
>> From aa6f2d3be38c60ab9b7709be9e208ecd284a61f8 Mon Sep 17 00:00:00 2001
>> From: =?UTF-8?q?Marek=20Ol=C5=A1=C3=A1k?= <marek.olsak at amd.com>
>> Date: Sun, 14 Dec 2014 22:26:00 +0100
>> Subject: [PATCH 5/6] R600/SI: Use 64-bit encoding by default for opcodes that
>>  are VOP3-only on VI
>>
>
> We should be doing this for all instructions anyway.  LGTM.
>
>> From 8ed81aab5e2320baa2c99b16850fc06572b21b63 Mon Sep 17 00:00:00 2001
>> From: =?UTF-8?q?Marek=20Ol=C5=A1=C3=A1k?= <marek.olsak at amd.com>
>> Date: Sun, 14 Dec 2014 19:27:17 +0100
>> Subject: [PATCH 6/6] R600/SI: Unify VOP2 instructions which are VOP3-only on
>>  VI
>>
>> This removes some duplicated classes and definitions.
>>
>> These instructions are defined:
>>   _e32 // pseudo
>>   _e32_si
>>   _e64 // pseudo
>>   _e64_si
>>   _e64_vi
>>
>> I hope the shrinking pass doesn't operate on pseudo instructions.
>
> It is supposed to.  I need more time to review this patch.

You can ignore the comment about the shrinking pass. It's not relevant
after the commit "R600/SI: Don't shrink instructions whose e32
encoding doesn't exist", which ensures that if _32_vi isn't present,
it's not shrunk.

Marek