[PATCH] R600/SI: Also enable WQM for image opcodes which calculate LOD

Fri Jan 30 06:52:37 PST 2015

On Fri, Jan 30, 2015 at 10:55:38AM +0900, Michel Dänzer wrote:
> On 22.01.2015 00:30, Tom Stellard wrote:
> > On Wed, Jan 21, 2015 at 01:07:25PM +0900, Michel Dänzer wrote:
> >> From: Michel Dänzer <michel.daenzer at amd.com>
> >>
> >> If whole quad mode isn't enabled for these, the level of detail is
> >> calculated incorrectly for pixels along diagonal triangle edges, causing
> >> artifacts.
> >>
> >> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88642
> >> Signed-off-by: Michel Dänzer <michel.daenzer at amd.com>
> >> ---
> >>  lib/Target/R600/SILowerControlFlow.cpp | 55 ++++++++++++++++++++++++++++++++++
> >>  1 file changed, 55 insertions(+)
> >>
> >> diff --git a/lib/Target/R600/SILowerControlFlow.cpp b/lib/Target/R600/SILowerControlFlow.cpp
> >> index 068b22f..a468a18 100644
> >> --- a/lib/Target/R600/SILowerControlFlow.cpp
> >> +++ b/lib/Target/R600/SILowerControlFlow.cpp
> >> @@ -514,6 +514,61 @@ bool SILowerControlFlowPass::runOnMachineFunction(MachineFunction &MF) {
> >>            IndirectDst(MI);
> >>            break;
> >>  
> >> +#define MATCH_IMAGE(opcode)                   \
> >> +        case AMDGPU::IMAGE_##opcode##_V1_V1:  \
> >> +        case AMDGPU::IMAGE_##opcode##_V1_V2:  \
> >> +        case AMDGPU::IMAGE_##opcode##_V1_V4:  \
> >> +        case AMDGPU::IMAGE_##opcode##_V1_V8:  \
> >> +        case AMDGPU::IMAGE_##opcode##_V1_V16: \
> >> +        case AMDGPU::IMAGE_##opcode##_V2_V1:  \
> >> +        case AMDGPU::IMAGE_##opcode##_V2_V2:  \
> >> +        case AMDGPU::IMAGE_##opcode##_V2_V4:  \
> >> +        case AMDGPU::IMAGE_##opcode##_V2_V8:  \
> >> +        case AMDGPU::IMAGE_##opcode##_V2_V16: \
> >> +        case AMDGPU::IMAGE_##opcode##_V3_V1:  \
> >> +        case AMDGPU::IMAGE_##opcode##_V3_V2:  \
> >> +        case AMDGPU::IMAGE_##opcode##_V3_V4:  \
> >> +        case AMDGPU::IMAGE_##opcode##_V3_V8:  \
> >> +        case AMDGPU::IMAGE_##opcode##_V3_V16: \
> >> +        case AMDGPU::IMAGE_##opcode##_V4_V1:  \
> >> +        case AMDGPU::IMAGE_##opcode##_V4_V2:  \
> >> +        case AMDGPU::IMAGE_##opcode##_V4_V4:  \
> >> +        case AMDGPU::IMAGE_##opcode##_V4_V8:  \
> >> +        case AMDGPU::IMAGE_##opcode##_V4_V16
> >> +
> >> +        MATCH_IMAGE(GATHER4):
> >> +        MATCH_IMAGE(GATHER4_B):
> >> +        MATCH_IMAGE(GATHER4_B_CL):
> >> +        MATCH_IMAGE(GATHER4_B_CL_O):
> >> +        MATCH_IMAGE(GATHER4_B_O):
> >> +        MATCH_IMAGE(GATHER4_C):
> >> +        MATCH_IMAGE(GATHER4_C_B):
> >> +        MATCH_IMAGE(GATHER4_C_B_CL):
> >> +        MATCH_IMAGE(GATHER4_C_B_CL_O):
> >> +        MATCH_IMAGE(GATHER4_C_B_O):
> >> +        MATCH_IMAGE(GATHER4_C_CL):
> >> +        MATCH_IMAGE(GATHER4_C_CL_O):
> >> +        MATCH_IMAGE(GATHER4_C_O):
> >> +        MATCH_IMAGE(GATHER4_CL):
> >> +        MATCH_IMAGE(GATHER4_CL_O):
> >> +        MATCH_IMAGE(GATHER4_O):
> >> +        MATCH_IMAGE(GET_LOD):
> >> +        MATCH_IMAGE(SAMPLE):
> >> +        MATCH_IMAGE(SAMPLE_B):
> >> +        MATCH_IMAGE(SAMPLE_B_CL):
> >> +        MATCH_IMAGE(SAMPLE_B_CL_O):
> >> +        MATCH_IMAGE(SAMPLE_B_O):
> >> +        MATCH_IMAGE(SAMPLE_C):
> >> +        MATCH_IMAGE(SAMPLE_C_B):
> >> +        MATCH_IMAGE(SAMPLE_C_B_CL):
> >> +        MATCH_IMAGE(SAMPLE_C_B_CL_O):
> >> +        MATCH_IMAGE(SAMPLE_C_B_O):
> >> +        MATCH_IMAGE(SAMPLE_C_CL):
> >> +        MATCH_IMAGE(SAMPLE_C_CL_O):
> >> +        MATCH_IMAGE(SAMPLE_C_O):
> >> +        MATCH_IMAGE(SAMPLE_CL):
> >> +        MATCH_IMAGE(SAMPLE_CL_O):
> >> +        MATCH_IMAGE(SAMPLE_O):
> > 
> > Would it be possible to avoid this switch statement by adding a
> > new target flag to these tablegen definitions?
> 
> How about the attached v2 patch?
> 
> I'm also attaching another patch which drops enabling WQM for V_INTERP_*
> instructions.
> 
> 

Hi Michel,

These both look good to me.  Do you think the second one is a candidate
for the stable branch?

-Tom
> -- 
> Earthling Michel Dänzer               |               http://www.amd.com
> Libre software enthusiast             |             Mesa and X developer
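
For readers following along, the commit message's point about LOD can be
sketched in a few lines. Implicit-LOD sample instructions derive the level of
detail from the difference in texture coordinates between neighbouring pixels
of a 2x2 quad, so a lane that lies outside the triangle still has to compute
its coordinate. The model below is a rough illustration only, not how the
hardware is implemented: sketchLod, its parameters and the 256-texel texture
are all hypothetical, and only the horizontal derivative is considered.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical model of an implicit LOD computation for one row of a quad.
// uLeft/uRight are the normalized texture coordinates held by the two lanes,
// texWidth is the texture width in texels. No clamping or zero handling.
inline float sketchLod(float uLeft, float uRight, float texWidth) {
  // Texels covered per pixel step in x.
  float texelsPerPixel = std::fabs(uRight - uLeft) * texWidth;
  return std::log2(texelsPerPixel);
}

inline void sketchLodDemo() {
  const float texWidth = 256.0f;
  const float uLeft = 0.25f;

  // With whole quad mode the right-hand helper lane runs too and holds the
  // coordinate one pixel over (here exactly one texel further along).
  const float uRightComputed = 0.25f + 1.0f / 256.0f;
  assert(sketchLod(uLeft, uRightComputed, texWidth) == 0.0f);

  // Without it, the right-hand lane never ran; assume its register still
  // holds an unrelated value such as 0. The derivative is then far too large.
  const float uRightStale = 0.0f;
  assert(sketchLod(uLeft, uRightStale, texWidth) == 6.0f);
}
```

In this toy case the stale lane moves the result from mip level 0 to level 6,
which is the kind of error that would show up as artifacts along triangle
edges where only part of a quad is covered.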

> From f4fc5014e66c2b28427e8d3ae9ea10876f952279 Mon Sep 17 00:00:00 2001
> From: =?UTF-8?q?Michel=20D=C3=A4nzer?= <michel.daenzer at amd.com>
> Date: Wed, 21 Jan 2015 12:59:05 +0900
> Subject: [PATCH 1/2] R600/SI: Also enable WQM for image opcodes which
>  calculate LOD v2
> MIME-Version: 1.0
> Content-Type: text/plain; charset=UTF-8
> Content-Transfer-Encoding: 8bit
> 
> If whole quad mode isn't enabled for these, the level of detail is
> calculated incorrectly for pixels along diagonal triangle edges, causing
> artifacts.
> 
> v2: Use a TSFlag instead of lots of switch cases
> 
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88642
> Signed-off-by: Michel Dänzer <michel.daenzer at amd.com>
> ---
>  lib/Target/R600/SIDefines.h            |  3 +-
>  lib/Target/R600/SIInstrFormats.td      |  2 ++
>  lib/Target/R600/SIInstrInfo.h          |  4 +++
>  lib/Target/R600/SIInstrInfo.td         | 60 +++++++++++++++++++------------
>  lib/Target/R600/SIInstructions.td      | 64 +++++++++++++++++-----------------
>  lib/Target/R600/SILowerControlFlow.cpp |  2 +-
>  6 files changed, 79 insertions(+), 56 deletions(-)
> 
> diff --git a/lib/Target/R600/SIDefines.h b/lib/Target/R600/SIDefines.h
> index 7601794..b540140 100644
> --- a/lib/Target/R600/SIDefines.h
> +++ b/lib/Target/R600/SIDefines.h
> @@ -35,7 +35,8 @@ enum {
>    SMRD = 1 << 16,
>    DS = 1 << 17,
>    MIMG = 1 << 18,
> -  FLAT = 1 << 19
> +  FLAT = 1 << 19,
> +  WQM = 1 << 20
>  };
>  }
>  
> diff --git a/lib/Target/R600/SIInstrFormats.td b/lib/Target/R600/SIInstrFormats.td
> index 913a769..16a35ff 100644
> --- a/lib/Target/R600/SIInstrFormats.td
> +++ b/lib/Target/R600/SIInstrFormats.td
> @@ -38,6 +38,7 @@ class InstSI <dag outs, dag ins, string asm, list<dag> pattern> :
>    field bits<1> DS = 0;
>    field bits<1> MIMG = 0;
>    field bits<1> FLAT = 0;
> +  field bits<1> WQM = 0;
>  
>    // These need to be kept in sync with the enum in SIInstrFlags.
>    let TSFlags{0} = VM_CNT;
> @@ -64,6 +65,7 @@ class InstSI <dag outs, dag ins, string asm, list<dag> pattern> :
>    let TSFlags{17} = DS;
>    let TSFlags{18} = MIMG;
>    let TSFlags{19} = FLAT;
> +  let TSFlags{20} = WQM;
>  
>    // Most instructions require adjustments after selection to satisfy
>    // operand requirements.
> diff --git a/lib/Target/R600/SIInstrInfo.h b/lib/Target/R600/SIInstrInfo.h
> index 28cd27d..b25e35e 100644
> --- a/lib/Target/R600/SIInstrInfo.h
> +++ b/lib/Target/R600/SIInstrInfo.h
> @@ -204,6 +204,10 @@ public:
>      return get(Opcode).TSFlags & SIInstrFlags::FLAT;
>    }
>  
> +  bool isWQM(uint16_t Opcode) const {
> +    return get(Opcode).TSFlags & SIInstrFlags::WQM;
> +  }
> +
>    bool isInlineConstant(const APInt &Imm) const;
>    bool isInlineConstant(const MachineOperand &MO) const;
>    bool isLiteralConstant(const MachineOperand &MO) const;
> diff --git a/lib/Target/R600/SIInstrInfo.td b/lib/Target/R600/SIInstrInfo.td
> index 852870e..2b7fe78 100644
> --- a/lib/Target/R600/SIInstrInfo.td
> +++ b/lib/Target/R600/SIInstrInfo.td
> @@ -1920,7 +1920,7 @@ multiclass MIMG_NoSampler <bits<7> op, string asm> {
>  
>  class MIMG_Sampler_Helper <bits<7> op, string asm,
>                             RegisterClass dst_rc,
> -                           RegisterClass src_rc> : MIMG <
> +                           RegisterClass src_rc, int wqm> : MIMG <
>    op,
>    (outs dst_rc:$vdata),
>    (ins i32imm:$dmask, i1imm:$unorm, i1imm:$glc, i1imm:$da, i1imm:$r128,
> @@ -1932,33 +1932,41 @@ class MIMG_Sampler_Helper <bits<7> op, string asm,
>    let mayLoad = 1;
>    let mayStore = 0;
>    let hasPostISelHook = 1;
> +  let WQM = wqm;
>  }
>  
>  multiclass MIMG_Sampler_Src_Helper <bits<7> op, string asm,
>                                      RegisterClass dst_rc,
> -                                    int channels> {
> -  def _V1 : MIMG_Sampler_Helper <op, asm, dst_rc, VGPR_32>,
> +                                    int channels, int wqm> {
> +  def _V1 : MIMG_Sampler_Helper <op, asm, dst_rc, VGPR_32, wqm>,
>              MIMG_Mask<asm#"_V1", channels>;
> -  def _V2 : MIMG_Sampler_Helper <op, asm, dst_rc, VReg_64>,
> +  def _V2 : MIMG_Sampler_Helper <op, asm, dst_rc, VReg_64, wqm>,
>              MIMG_Mask<asm#"_V2", channels>;
> -  def _V4 : MIMG_Sampler_Helper <op, asm, dst_rc, VReg_128>,
> +  def _V4 : MIMG_Sampler_Helper <op, asm, dst_rc, VReg_128, wqm>,
>              MIMG_Mask<asm#"_V4", channels>;
> -  def _V8 : MIMG_Sampler_Helper <op, asm, dst_rc, VReg_256>,
> +  def _V8 : MIMG_Sampler_Helper <op, asm, dst_rc, VReg_256, wqm>,
>              MIMG_Mask<asm#"_V8", channels>;
> -  def _V16 : MIMG_Sampler_Helper <op, asm, dst_rc, VReg_512>,
> +  def _V16 : MIMG_Sampler_Helper <op, asm, dst_rc, VReg_512, wqm>,
>              MIMG_Mask<asm#"_V16", channels>;
>  }
>  
>  multiclass MIMG_Sampler <bits<7> op, string asm> {
> -  defm _V1 : MIMG_Sampler_Src_Helper<op, asm, VGPR_32, 1>;
> -  defm _V2 : MIMG_Sampler_Src_Helper<op, asm, VReg_64, 2>;
> -  defm _V3 : MIMG_Sampler_Src_Helper<op, asm, VReg_96, 3>;
> -  defm _V4 : MIMG_Sampler_Src_Helper<op, asm, VReg_128, 4>;
> +  defm _V1 : MIMG_Sampler_Src_Helper<op, asm, VGPR_32, 1, 0>;
> +  defm _V2 : MIMG_Sampler_Src_Helper<op, asm, VReg_64, 2, 0>;
> +  defm _V3 : MIMG_Sampler_Src_Helper<op, asm, VReg_96, 3, 0>;
> +  defm _V4 : MIMG_Sampler_Src_Helper<op, asm, VReg_128, 4, 0>;
> +}
> +
> +multiclass MIMG_Sampler_WQM <bits<7> op, string asm> {
> +  defm _V1 : MIMG_Sampler_Src_Helper<op, asm, VGPR_32, 1, 1>;
> +  defm _V2 : MIMG_Sampler_Src_Helper<op, asm, VReg_64, 2, 1>;
> +  defm _V3 : MIMG_Sampler_Src_Helper<op, asm, VReg_96, 3, 1>;
> +  defm _V4 : MIMG_Sampler_Src_Helper<op, asm, VReg_128, 4, 1>;
>  }
>  
>  class MIMG_Gather_Helper <bits<7> op, string asm,
>                            RegisterClass dst_rc,
> -                          RegisterClass src_rc> : MIMG <
> +                          RegisterClass src_rc, int wqm> : MIMG <
>    op,
>    (outs dst_rc:$vdata),
>    (ins i32imm:$dmask, i1imm:$unorm, i1imm:$glc, i1imm:$da, i1imm:$r128,
> @@ -1979,28 +1987,36 @@ class MIMG_Gather_Helper <bits<7> op, string asm,
>    // Therefore, disable all code which updates DMASK by setting these two:
>    let MIMG = 0;
>    let hasPostISelHook = 0;
> +  let WQM = wqm;
>  }
>  
>  multiclass MIMG_Gather_Src_Helper <bits<7> op, string asm,
>                                      RegisterClass dst_rc,
> -                                    int channels> {
> -  def _V1 : MIMG_Gather_Helper <op, asm, dst_rc, VGPR_32>,
> +                                    int channels, int wqm> {
> +  def _V1 : MIMG_Gather_Helper <op, asm, dst_rc, VGPR_32, wqm>,
>              MIMG_Mask<asm#"_V1", channels>;
> -  def _V2 : MIMG_Gather_Helper <op, asm, dst_rc, VReg_64>,
> +  def _V2 : MIMG_Gather_Helper <op, asm, dst_rc, VReg_64, wqm>,
>              MIMG_Mask<asm#"_V2", channels>;
> -  def _V4 : MIMG_Gather_Helper <op, asm, dst_rc, VReg_128>,
> +  def _V4 : MIMG_Gather_Helper <op, asm, dst_rc, VReg_128, wqm>,
>              MIMG_Mask<asm#"_V4", channels>;
> -  def _V8 : MIMG_Gather_Helper <op, asm, dst_rc, VReg_256>,
> +  def _V8 : MIMG_Gather_Helper <op, asm, dst_rc, VReg_256, wqm>,
>              MIMG_Mask<asm#"_V8", channels>;
> -  def _V16 : MIMG_Gather_Helper <op, asm, dst_rc, VReg_512>,
> +  def _V16 : MIMG_Gather_Helper <op, asm, dst_rc, VReg_512, wqm>,
>              MIMG_Mask<asm#"_V16", channels>;
>  }
>  
>  multiclass MIMG_Gather <bits<7> op, string asm> {
> -  defm _V1 : MIMG_Gather_Src_Helper<op, asm, VGPR_32, 1>;
> -  defm _V2 : MIMG_Gather_Src_Helper<op, asm, VReg_64, 2>;
> -  defm _V3 : MIMG_Gather_Src_Helper<op, asm, VReg_96, 3>;
> -  defm _V4 : MIMG_Gather_Src_Helper<op, asm, VReg_128, 4>;
> +  defm _V1 : MIMG_Gather_Src_Helper<op, asm, VGPR_32, 1, 0>;
> +  defm _V2 : MIMG_Gather_Src_Helper<op, asm, VReg_64, 2, 0>;
> +  defm _V3 : MIMG_Gather_Src_Helper<op, asm, VReg_96, 3, 0>;
> +  defm _V4 : MIMG_Gather_Src_Helper<op, asm, VReg_128, 4, 0>;
> +}
> +
> +multiclass MIMG_Gather_WQM <bits<7> op, string asm> {
> +  defm _V1 : MIMG_Gather_Src_Helper<op, asm, VGPR_32, 1, 1>;
> +  defm _V2 : MIMG_Gather_Src_Helper<op, asm, VReg_64, 2, 1>;
> +  defm _V3 : MIMG_Gather_Src_Helper<op, asm, VReg_96, 3, 1>;
> +  defm _V4 : MIMG_Gather_Src_Helper<op, asm, VReg_128, 4, 1>;
>  }
>  
>  //===----------------------------------------------------------------------===//
> diff --git a/lib/Target/R600/SIInstructions.td b/lib/Target/R600/SIInstructions.td
> index 544ea3a..e71bb33 100644
> --- a/lib/Target/R600/SIInstructions.td
> +++ b/lib/Target/R600/SIInstructions.td
> @@ -1034,63 +1034,63 @@ defm IMAGE_GET_RESINFO : MIMG_NoSampler <0x0000000e, "image_get_resinfo">;
>  //def IMAGE_ATOMIC_FCMPSWAP : MIMG_NoPattern_ <"image_atomic_fcmpswap", 0x0000001d>;
>  //def IMAGE_ATOMIC_FMIN : MIMG_NoPattern_ <"image_atomic_fmin", 0x0000001e>;
>  //def IMAGE_ATOMIC_FMAX : MIMG_NoPattern_ <"image_atomic_fmax", 0x0000001f>;
> -defm IMAGE_SAMPLE           : MIMG_Sampler <0x00000020, "image_sample">;
> -defm IMAGE_SAMPLE_CL        : MIMG_Sampler <0x00000021, "image_sample_cl">;
> +defm IMAGE_SAMPLE           : MIMG_Sampler_WQM <0x00000020, "image_sample">;
> +defm IMAGE_SAMPLE_CL        : MIMG_Sampler_WQM <0x00000021, "image_sample_cl">;
>  defm IMAGE_SAMPLE_D         : MIMG_Sampler <0x00000022, "image_sample_d">;
>  defm IMAGE_SAMPLE_D_CL      : MIMG_Sampler <0x00000023, "image_sample_d_cl">;
>  defm IMAGE_SAMPLE_L         : MIMG_Sampler <0x00000024, "image_sample_l">;
> -defm IMAGE_SAMPLE_B         : MIMG_Sampler <0x00000025, "image_sample_b">;
> -defm IMAGE_SAMPLE_B_CL      : MIMG_Sampler <0x00000026, "image_sample_b_cl">;
> +defm IMAGE_SAMPLE_B         : MIMG_Sampler_WQM <0x00000025, "image_sample_b">;
> +defm IMAGE_SAMPLE_B_CL      : MIMG_Sampler_WQM <0x00000026, "image_sample_b_cl">;
>  defm IMAGE_SAMPLE_LZ        : MIMG_Sampler <0x00000027, "image_sample_lz">;
> -defm IMAGE_SAMPLE_C         : MIMG_Sampler <0x00000028, "image_sample_c">;
> -defm IMAGE_SAMPLE_C_CL      : MIMG_Sampler <0x00000029, "image_sample_c_cl">;
> +defm IMAGE_SAMPLE_C         : MIMG_Sampler_WQM <0x00000028, "image_sample_c">;
> +defm IMAGE_SAMPLE_C_CL      : MIMG_Sampler_WQM <0x00000029, "image_sample_c_cl">;
>  defm IMAGE_SAMPLE_C_D       : MIMG_Sampler <0x0000002a, "image_sample_c_d">;
>  defm IMAGE_SAMPLE_C_D_CL    : MIMG_Sampler <0x0000002b, "image_sample_c_d_cl">;
>  defm IMAGE_SAMPLE_C_L       : MIMG_Sampler <0x0000002c, "image_sample_c_l">;
> -defm IMAGE_SAMPLE_C_B       : MIMG_Sampler <0x0000002d, "image_sample_c_b">;
> -defm IMAGE_SAMPLE_C_B_CL    : MIMG_Sampler <0x0000002e, "image_sample_c_b_cl">;
> +defm IMAGE_SAMPLE_C_B       : MIMG_Sampler_WQM <0x0000002d, "image_sample_c_b">;
> +defm IMAGE_SAMPLE_C_B_CL    : MIMG_Sampler_WQM <0x0000002e, "image_sample_c_b_cl">;
>  defm IMAGE_SAMPLE_C_LZ      : MIMG_Sampler <0x0000002f, "image_sample_c_lz">;
> -defm IMAGE_SAMPLE_O         : MIMG_Sampler <0x00000030, "image_sample_o">;
> -defm IMAGE_SAMPLE_CL_O      : MIMG_Sampler <0x00000031, "image_sample_cl_o">;
> +defm IMAGE_SAMPLE_O         : MIMG_Sampler_WQM <0x00000030, "image_sample_o">;
> +defm IMAGE_SAMPLE_CL_O      : MIMG_Sampler_WQM <0x00000031, "image_sample_cl_o">;
>  defm IMAGE_SAMPLE_D_O       : MIMG_Sampler <0x00000032, "image_sample_d_o">;
>  defm IMAGE_SAMPLE_D_CL_O    : MIMG_Sampler <0x00000033, "image_sample_d_cl_o">;
>  defm IMAGE_SAMPLE_L_O       : MIMG_Sampler <0x00000034, "image_sample_l_o">;
> -defm IMAGE_SAMPLE_B_O       : MIMG_Sampler <0x00000035, "image_sample_b_o">;
> -defm IMAGE_SAMPLE_B_CL_O    : MIMG_Sampler <0x00000036, "image_sample_b_cl_o">;
> +defm IMAGE_SAMPLE_B_O       : MIMG_Sampler_WQM <0x00000035, "image_sample_b_o">;
> +defm IMAGE_SAMPLE_B_CL_O    : MIMG_Sampler_WQM <0x00000036, "image_sample_b_cl_o">;
>  defm IMAGE_SAMPLE_LZ_O      : MIMG_Sampler <0x00000037, "image_sample_lz_o">;
> -defm IMAGE_SAMPLE_C_O       : MIMG_Sampler <0x00000038, "image_sample_c_o">;
> -defm IMAGE_SAMPLE_C_CL_O    : MIMG_Sampler <0x00000039, "image_sample_c_cl_o">;
> +defm IMAGE_SAMPLE_C_O       : MIMG_Sampler_WQM <0x00000038, "image_sample_c_o">;
> +defm IMAGE_SAMPLE_C_CL_O    : MIMG_Sampler_WQM <0x00000039, "image_sample_c_cl_o">;
>  defm IMAGE_SAMPLE_C_D_O     : MIMG_Sampler <0x0000003a, "image_sample_c_d_o">;
>  defm IMAGE_SAMPLE_C_D_CL_O  : MIMG_Sampler <0x0000003b, "image_sample_c_d_cl_o">;
>  defm IMAGE_SAMPLE_C_L_O     : MIMG_Sampler <0x0000003c, "image_sample_c_l_o">;
> -defm IMAGE_SAMPLE_C_B_O     : MIMG_Sampler <0x0000003d, "image_sample_c_b_o">;
> -defm IMAGE_SAMPLE_C_B_CL_O  : MIMG_Sampler <0x0000003e, "image_sample_c_b_cl_o">;
> +defm IMAGE_SAMPLE_C_B_O     : MIMG_Sampler_WQM <0x0000003d, "image_sample_c_b_o">;
> +defm IMAGE_SAMPLE_C_B_CL_O  : MIMG_Sampler_WQM <0x0000003e, "image_sample_c_b_cl_o">;
>  defm IMAGE_SAMPLE_C_LZ_O    : MIMG_Sampler <0x0000003f, "image_sample_c_lz_o">;
> -defm IMAGE_GATHER4          : MIMG_Gather <0x00000040, "image_gather4">;
> -defm IMAGE_GATHER4_CL       : MIMG_Gather <0x00000041, "image_gather4_cl">;
> +defm IMAGE_GATHER4          : MIMG_Gather_WQM <0x00000040, "image_gather4">;
> +defm IMAGE_GATHER4_CL       : MIMG_Gather_WQM <0x00000041, "image_gather4_cl">;
>  defm IMAGE_GATHER4_L        : MIMG_Gather <0x00000044, "image_gather4_l">;
> -defm IMAGE_GATHER4_B        : MIMG_Gather <0x00000045, "image_gather4_b">;
> -defm IMAGE_GATHER4_B_CL     : MIMG_Gather <0x00000046, "image_gather4_b_cl">;
> +defm IMAGE_GATHER4_B        : MIMG_Gather_WQM <0x00000045, "image_gather4_b">;
> +defm IMAGE_GATHER4_B_CL     : MIMG_Gather_WQM <0x00000046, "image_gather4_b_cl">;
>  defm IMAGE_GATHER4_LZ       : MIMG_Gather <0x00000047, "image_gather4_lz">;
> -defm IMAGE_GATHER4_C        : MIMG_Gather <0x00000048, "image_gather4_c">;
> -defm IMAGE_GATHER4_C_CL     : MIMG_Gather <0x00000049, "image_gather4_c_cl">;
> +defm IMAGE_GATHER4_C        : MIMG_Gather_WQM <0x00000048, "image_gather4_c">;
> +defm IMAGE_GATHER4_C_CL     : MIMG_Gather_WQM <0x00000049, "image_gather4_c_cl">;
>  defm IMAGE_GATHER4_C_L      : MIMG_Gather <0x0000004c, "image_gather4_c_l">;
> -defm IMAGE_GATHER4_C_B      : MIMG_Gather <0x0000004d, "image_gather4_c_b">;
> -defm IMAGE_GATHER4_C_B_CL   : MIMG_Gather <0x0000004e, "image_gather4_c_b_cl">;
> +defm IMAGE_GATHER4_C_B      : MIMG_Gather_WQM <0x0000004d, "image_gather4_c_b">;
> +defm IMAGE_GATHER4_C_B_CL   : MIMG_Gather_WQM <0x0000004e, "image_gather4_c_b_cl">;
>  defm IMAGE_GATHER4_C_LZ     : MIMG_Gather <0x0000004f, "image_gather4_c_lz">;
> -defm IMAGE_GATHER4_O        : MIMG_Gather <0x00000050, "image_gather4_o">;
> -defm IMAGE_GATHER4_CL_O     : MIMG_Gather <0x00000051, "image_gather4_cl_o">;
> +defm IMAGE_GATHER4_O        : MIMG_Gather_WQM <0x00000050, "image_gather4_o">;
> +defm IMAGE_GATHER4_CL_O     : MIMG_Gather_WQM <0x00000051, "image_gather4_cl_o">;
>  defm IMAGE_GATHER4_L_O      : MIMG_Gather <0x00000054, "image_gather4_l_o">;
> -defm IMAGE_GATHER4_B_O      : MIMG_Gather <0x00000055, "image_gather4_b_o">;
> +defm IMAGE_GATHER4_B_O      : MIMG_Gather_WQM <0x00000055, "image_gather4_b_o">;
>  defm IMAGE_GATHER4_B_CL_O   : MIMG_Gather <0x00000056, "image_gather4_b_cl_o">;
>  defm IMAGE_GATHER4_LZ_O     : MIMG_Gather <0x00000057, "image_gather4_lz_o">;
> -defm IMAGE_GATHER4_C_O      : MIMG_Gather <0x00000058, "image_gather4_c_o">;
> -defm IMAGE_GATHER4_C_CL_O   : MIMG_Gather <0x00000059, "image_gather4_c_cl_o">;
> +defm IMAGE_GATHER4_C_O      : MIMG_Gather_WQM <0x00000058, "image_gather4_c_o">;
> +defm IMAGE_GATHER4_C_CL_O   : MIMG_Gather_WQM <0x00000059, "image_gather4_c_cl_o">;
>  defm IMAGE_GATHER4_C_L_O    : MIMG_Gather <0x0000005c, "image_gather4_c_l_o">;
> -defm IMAGE_GATHER4_C_B_O    : MIMG_Gather <0x0000005d, "image_gather4_c_b_o">;
> -defm IMAGE_GATHER4_C_B_CL_O : MIMG_Gather <0x0000005e, "image_gather4_c_b_cl_o">;
> +defm IMAGE_GATHER4_C_B_O    : MIMG_Gather_WQM <0x0000005d, "image_gather4_c_b_o">;
> +defm IMAGE_GATHER4_C_B_CL_O : MIMG_Gather_WQM <0x0000005e, "image_gather4_c_b_cl_o">;
>  defm IMAGE_GATHER4_C_LZ_O   : MIMG_Gather <0x0000005f, "image_gather4_c_lz_o">;
> -defm IMAGE_GET_LOD          : MIMG_Sampler <0x00000060, "image_get_lod">;
> +defm IMAGE_GET_LOD          : MIMG_Sampler_WQM <0x00000060, "image_get_lod">;
>  defm IMAGE_SAMPLE_CD        : MIMG_Sampler <0x00000068, "image_sample_cd">;
>  defm IMAGE_SAMPLE_CD_CL     : MIMG_Sampler <0x00000069, "image_sample_cd_cl">;
>  defm IMAGE_SAMPLE_C_CD      : MIMG_Sampler <0x0000006a, "image_sample_c_cd">;
> diff --git a/lib/Target/R600/SILowerControlFlow.cpp b/lib/Target/R600/SILowerControlFlow.cpp
> index 068b22f..f014f2e 100644
> --- a/lib/Target/R600/SILowerControlFlow.cpp
> +++ b/lib/Target/R600/SILowerControlFlow.cpp
> @@ -447,7 +447,7 @@ bool SILowerControlFlowPass::runOnMachineFunction(MachineFunction &MF) {
>        Next = std::next(I);
>  
>        MachineInstr &MI = *I;
> -      if (TII->isDS(MI.getOpcode()))
> +      if (TII->isWQM(MI.getOpcode()) || TII->isDS(MI.getOpcode()))
>          NeedWQM = true;
> 
>        // Flat uses m0 in case it needs to access LDS.
> -- 
> 2.1.4
> 
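
The TSFlags approach in the patch above comes down to reserving one bit per
instruction property and masking it out of the per-opcode flag word. A minimal
standalone sketch of that pattern follows. The bit positions are copied from
the SIDefines.h hunk, but SketchFlags, SketchInstrDesc and sketchIsWQM are
hypothetical stand-ins and not the actual LLVM classes, and the flag
combinations in the demo are chosen for illustration only.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the SIInstrFlags enum; only the bit positions
// are taken from the patch.
namespace SketchFlags {
enum : uint64_t {
  MIMG = 1 << 18,
  FLAT = 1 << 19,
  WQM  = 1 << 20
};
}

// Hypothetical stand-in for a per-opcode descriptor carrying the flag word.
struct SketchInstrDesc {
  uint64_t TSFlags;
};

// Same shape as the isWQM() helper in the patch: mask the flag word and
// test whether the WQM bit is set.
inline bool sketchIsWQM(const SketchInstrDesc &Desc) {
  return (Desc.TSFlags & SketchFlags::WQM) != 0;
}

inline void sketchFlagsDemo() {
  // Illustrative descriptors: one with the WQM bit set, one without.
  SketchInstrDesc WithWQM{SketchFlags::MIMG | SketchFlags::WQM};
  SketchInstrDesc WithoutWQM{SketchFlags::MIMG};
  assert(sketchIsWQM(WithWQM));
  assert(!sketchIsWQM(WithoutWQM));
}
```

Compared with the switch in v1, the property lives next to each instruction
definition, so the pass only needs a single bit test per instruction.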

> From a297851ca2b38cd5b86971f4364dbcfa8b8f88a9 Mon Sep 17 00:00:00 2001
> From: =?UTF-8?q?Michel=20D=C3=A4nzer?= <michel.daenzer at amd.com>
> Date: Thu, 29 Jan 2015 19:18:34 +0900
> Subject: [PATCH 2/2] R600/SI: Don't enable WQM for V_INTERP_* instructions
> MIME-Version: 1.0
> Content-Type: text/plain; charset=UTF-8
> Content-Transfer-Encoding: 8bit
> 
> Doesn't seem necessary. I think this was mostly compensating for not
> enabling WQM for texture sampling instructions.
> 
> Signed-off-by: Michel Dänzer <michel.daenzer at amd.com>
> ---
>  lib/Target/R600/SILowerControlFlow.cpp | 6 ------
>  1 file changed, 6 deletions(-)
> 
> diff --git a/lib/Target/R600/SILowerControlFlow.cpp b/lib/Target/R600/SILowerControlFlow.cpp
> index f014f2e..2e08c9f 100644
> --- a/lib/Target/R600/SILowerControlFlow.cpp
> +++ b/lib/Target/R600/SILowerControlFlow.cpp
> @@ -513,12 +513,6 @@ bool SILowerControlFlowPass::runOnMachineFunction(MachineFunction &MF) {
>          case AMDGPU::SI_INDIRECT_DST_V16:
>            IndirectDst(MI);
>            break;
> -
> -        case AMDGPU::V_INTERP_P1_F32:
> -        case AMDGPU::V_INTERP_P2_F32:
> -        case AMDGPU::V_INTERP_MOV_F32:
> -          NeedWQM = true;
> -          break;
>        }
>      }
>    }
> -- 
> 2.1.4
>