[PATCH] R600: Make sign_extend_inreg legal.

Fri Apr 11 15:40:06 PDT 2014

On Thu, 2014-04-10 at 10:13 -0400, Tom Stellard wrote:
> On Tue, Apr 08, 2014 at 08:23:24PM -0400, Jan Vesely wrote:
> > Hi,
> > 
> > I finally got some time to improve the tests. My original issue with
> > *_generated piglits got fixed in the meantime. (My bet is that it was
> > R600: Correct opcode for BFE_INT :).
> > 
> > I tried to use your latest patch to base the tests on but it did not
> > apply cleanly on master.
> > 
> > I have attached another patch (0001) that is based on your idea and
> > gets LLVM to do the scalarization, so no custom procedures are
> > necessary for EG hw.
> > 
> 
> These patches look good, I will commit them.

Thanks, I have attached a cleaned-up v2 of the first patch. It moves the
code out of the shared AMDGPU class and makes it more obvious that there
is no change for SI.
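
For readers following the thread, here is a minimal sketch of what
sign_extend_inreg computes and why a shl/ashr pair (as in the test IR below)
and a signed bitfield extract with offset 0 give the same result. The function
names are hypothetical and this is only a host-side model, not the backend
code; it assumes 1 <= width <= 32, offset + width <= 32, and a compiler where
right-shifting a negative int32_t is an arithmetic shift (guaranteed since
C++20, near-universal before).

```cpp
#include <array>
#include <cstdint>

// Sign-extend the low `width` bits of x with a shift pair, i.e. the
// shl + ashr form that the default Expand action falls back to.
int32_t sext_inreg_shifts(int32_t x, unsigned width) {
    unsigned shift = 32 - width;
    // Shift left as unsigned to avoid signed-overflow undefined behaviour.
    uint32_t shifted = static_cast<uint32_t>(x) << shift;
    // Arithmetic shift right replicates the sign bit (see assumption above).
    return static_cast<int32_t>(shifted) >> shift;
}

// Hypothetical model of a signed bitfield extract (offset, width), which is
// the operation the BFE_INT / V_BFE_I32 patterns select with offset 0.
int32_t bfe_int_model(int32_t src, unsigned offset, unsigned width) {
    uint32_t field = static_cast<uint32_t>(src) >> offset;
    if (width >= 32)
        return static_cast<int32_t>(field);
    field &= (1u << width) - 1u;           // keep only the low `width` bits
    uint32_t sign = 1u << (width - 1);     // position of the field's sign bit
    return static_cast<int32_t>((field ^ sign) - sign);
}

// Scalarized form for a 4-element vector: one scalar extract per lane,
// which is what Expand on the vector types is meant to produce on EG.
std::array<int32_t, 4> sext_inreg_v4(const std::array<int32_t, 4> &v,
                                     unsigned width) {
    std::array<int32_t, 4> out{};
    for (std::size_t i = 0; i < v.size(); ++i)
        out[i] = bfe_int_model(v[i], 0, width);
    return out;
}
```

For example, with width 8 both helpers map 0xFF to -1 and 0x17F to 127, which
matches the "0, 8" operands checked in the SI lines of the test.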

regards,
Jan

> 
> Thanks,
> Tom
> 
> > regards,
> > Jan
> > 
> > 
> > On Wed, 2014-04-02 at 16:09 -0700, Matt Arsenault wrote:
> > > On 04/02/2014 03:28 PM, Jan Vesely wrote:
> > > > Hi,
> > > >
> > > > On Tue, 2014-04-01 at 11:34 -0700, Matt Arsenault wrote:
> > > >> I don't know why I didn't just do this before
> > > >>
> > > >> http://llvm-reviews.chandlerc.com/D3250
> > > > I have two questions. I'm still learning LLVM internals, so please bear with
> > > > me.
> > > >
> > > >> +  if (!VT.isVector())
> > > >> +    return SDValue();
> > > > Is it possible to call the function on scalar types?
> > > It's possible (and it was before), but now this might be unnecessary.
> > > 
> > > > I assume the default is Legal, and the patch only changes it to expand for i1, i8, i16.
> > > The default is legal, but this does the opposite. It lets i1, i8, and 
> > > i16 go back to the default
> > > 
> > > > Also, given that LowerSIGN_EXTEND_INREG is kept, how is it different from using setOperationAction(Expand)?
> > > > Is it kept around because of the EG TODO?
> > > It's mostly kept around to scalarize it. IIRC the implementation of
> > > expand for sign_extend_inreg on vectors wasn't doing what I wanted and
> > > was creating a pair of vector shifts instead, although looking at what
> > > it does now, I think it should work. I can try it again using the
> > > default expand.
> > > 
> > > >
> > > > thank you,
> > > > Jan
> > > >
> > > > PS: I apologize for not responding to that test request yet. These few weeks are rather busy for me.
> > > >
> > > >
> > > >> Files:
> > > >>    lib/Target/R600/AMDGPUISelLowering.cpp
> > > >>    lib/Target/R600/EvergreenInstructions.td
> > > >>    lib/Target/R600/SIISelLowering.cpp
> > > >>    lib/Target/R600/SIInstructions.td
> > > >> _______________________________________________
> > > >> llvm-commits mailing list
> > > >> llvm-commits at cs.uiuc.edu
> > > >> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
> > > 
> > > 
> > 
> > -- 
> > Jan Vesely <jan.vesely at rutgers.edu>
> 
> > From ef59c66e2958a5d439a61804b7105b9b331d9af3 Mon Sep 17 00:00:00 2001
> > From: Jan Vesely <jan.vesely at rutgers.edu>
> > Date: Mon, 7 Apr 2014 17:49:18 -0400
> > Subject: [PATCH 1/2] R600: Expand sign extension of vectors.
> > 
> > Setting vector types to Expand will result in scalarization on pre-SI hw,
> > as those GPUs don't have vector shifts either.
> > Also expand i32 vectors; this helps LLVM make the correct decision
> > about scalarizing the vector ops.
> > 
> > This is based on Matt Arsenault's patch R600: Make sign_extend_inreg legal
> > 
> > Signed-off-by: Jan Vesely <jan.vesely at rutgers.edu>
> > ---
> >  lib/Target/R600/AMDGPUISelLowering.cpp   | 27 ++++++++++++++++-----------
> >  lib/Target/R600/EvergreenInstructions.td |  7 +++++++
> >  lib/Target/R600/SIISelLowering.cpp       |  9 +++++++++
> >  3 files changed, 32 insertions(+), 11 deletions(-)
> > 
> > diff --git a/lib/Target/R600/AMDGPUISelLowering.cpp b/lib/Target/R600/AMDGPUISelLowering.cpp
> > index 1fed068..a44dc41 100644
> > --- a/lib/Target/R600/AMDGPUISelLowering.cpp
> > +++ b/lib/Target/R600/AMDGPUISelLowering.cpp
> > @@ -211,22 +211,27 @@ AMDGPUTargetLowering::AMDGPUTargetLowering(TargetMachine &TM) :
> >      setOperationAction(ISD::FSUB, VT, Expand);
> >      setOperationAction(ISD::SELECT, VT, Expand);
> >    }
> > +  if (!Subtarget->hasBFE())
> > +    setOperationAction(ISD::SIGN_EXTEND_INREG, MVT::i1, Expand);
> >  
> > -  setOperationAction(ISD::SIGN_EXTEND_INREG, MVT::i1, Custom);
> > -  setOperationAction(ISD::SIGN_EXTEND_INREG, MVT::v2i1, Custom);
> > -  setOperationAction(ISD::SIGN_EXTEND_INREG, MVT::v4i1, Custom);
> > +  setOperationAction(ISD::SIGN_EXTEND_INREG, MVT::v2i1, Expand);
> > +  setOperationAction(ISD::SIGN_EXTEND_INREG, MVT::v4i1, Expand);
> >  
> > -  setOperationAction(ISD::SIGN_EXTEND_INREG, MVT::i8, Custom);
> > -  setOperationAction(ISD::SIGN_EXTEND_INREG, MVT::v2i8, Custom);
> > -  setOperationAction(ISD::SIGN_EXTEND_INREG, MVT::v4i8, Custom);
> > +  if (!Subtarget->hasBFE())
> > +    setOperationAction(ISD::SIGN_EXTEND_INREG, MVT::i8, Expand);
> > +  setOperationAction(ISD::SIGN_EXTEND_INREG, MVT::v2i8, Expand);
> > +  setOperationAction(ISD::SIGN_EXTEND_INREG, MVT::v4i8, Expand);
> >  
> > -  setOperationAction(ISD::SIGN_EXTEND_INREG, MVT::i16, Custom);
> > -  setOperationAction(ISD::SIGN_EXTEND_INREG, MVT::v2i16, Custom);
> > -  setOperationAction(ISD::SIGN_EXTEND_INREG, MVT::v4i16, Custom);
> > +  if (!Subtarget->hasBFE())
> > +    setOperationAction(ISD::SIGN_EXTEND_INREG, MVT::i16, Expand);
> > +  setOperationAction(ISD::SIGN_EXTEND_INREG, MVT::v2i16, Expand);
> > +  setOperationAction(ISD::SIGN_EXTEND_INREG, MVT::v4i16, Expand);
> >  
> > -  setOperationAction(ISD::SIGN_EXTEND_INREG, MVT::i32, Custom);
> > +  setOperationAction(ISD::SIGN_EXTEND_INREG, MVT::i32, Legal);
> > +  setOperationAction(ISD::SIGN_EXTEND_INREG, MVT::v2i32, Expand);
> > +  setOperationAction(ISD::SIGN_EXTEND_INREG, MVT::v4i32, Expand);
> >  
> > -  setOperationAction(ISD::SIGN_EXTEND_INREG, MVT::Other, Custom);
> > +  setOperationAction(ISD::SIGN_EXTEND_INREG, MVT::Other, Expand);
> >  
> >    setTargetDAGCombine(ISD::MUL);
> >  }
> > diff --git a/lib/Target/R600/EvergreenInstructions.td b/lib/Target/R600/EvergreenInstructions.td
> > index 7153b70..d9931c8 100644
> > --- a/lib/Target/R600/EvergreenInstructions.td
> > +++ b/lib/Target/R600/EvergreenInstructions.td
> > @@ -286,6 +286,13 @@ def BFI_INT_eg : R600_3OP <0x06, "BFI_INT",
> >    VecALU
> >  >;
> >  
> > +def : Pat<(i32 (sext_inreg i32:$src, i1)),
> > +  (BFE_INT_eg i32:$src, (i32 ZERO), (i32 ONE_INT))>;
> > +def : Pat<(i32 (sext_inreg i32:$src, i8)),
> > +  (BFE_INT_eg i32:$src, (i32 ZERO), (MOV_IMM_I32 8))>;
> > +def : Pat<(i32 (sext_inreg i32:$src, i16)),
> > +  (BFE_INT_eg i32:$src, (i32 ZERO), (MOV_IMM_I32 16))>;
> > +
> >  defm : BFIPatterns <BFI_INT_eg>;
> >  
> >  def BFM_INT_eg : R600_2OP <0xA0, "BFM_INT",
> > diff --git a/lib/Target/R600/SIISelLowering.cpp b/lib/Target/R600/SIISelLowering.cpp
> > index b9295ff..9180146 100644
> > --- a/lib/Target/R600/SIISelLowering.cpp
> > +++ b/lib/Target/R600/SIISelLowering.cpp
> > @@ -119,6 +119,15 @@ SITargetLowering::SITargetLowering(TargetMachine &TM) :
> >    setOperationAction(ISD::SIGN_EXTEND, MVT::i64, Custom);
> >    setOperationAction(ISD::ZERO_EXTEND, MVT::i64, Custom);
> >  
> > +  setOperationAction(ISD::SIGN_EXTEND_INREG, MVT::i1, Custom);
> > +  setOperationAction(ISD::SIGN_EXTEND_INREG, MVT::i8, Custom);
> > +  setOperationAction(ISD::SIGN_EXTEND_INREG, MVT::i16, Custom);
> > +  setOperationAction(ISD::SIGN_EXTEND_INREG, MVT::i32, Custom);
> > +
> > +  for (int i = MVT::FIRST_INTEGER_VECTOR_VALUETYPE;
> > +           i < MVT::LAST_INTEGER_VECTOR_VALUETYPE; ++i)
> > +    setOperationAction(ISD::SIGN_EXTEND_INREG, (MVT::SimpleValueType)i, Custom);
> > +
> >    setOperationAction(ISD::INTRINSIC_WO_CHAIN, MVT::Other, Custom);
> >    setOperationAction(ISD::INTRINSIC_WO_CHAIN, MVT::f32, Custom);
> >    setOperationAction(ISD::INTRINSIC_WO_CHAIN, MVT::v16i8, Custom);
> > -- 
> > 1.9.0
> > 
> 
> > From 8a6f5da0cdc0c2515684cd218d68a2db14f6bb1c Mon Sep 17 00:00:00 2001
> > From: Jan Vesely <jan.vesely at rutgers.edu>
> > Date: Tue, 8 Apr 2014 19:43:30 -0400
> > Subject: [PATCH 2/2] R600, tests: Extend r600 sign_extend_inreg tests for EG
> > 
> > Signed-off-by: Jan Vesely <jan.vesely at rutgers.edu>
> > ---
> >  test/CodeGen/R600/sext-in-reg.ll | 107 +++++++++++++++++++++++++++++++--------
> >  1 file changed, 85 insertions(+), 22 deletions(-)
> > 
> > diff --git a/test/CodeGen/R600/sext-in-reg.ll b/test/CodeGen/R600/sext-in-reg.ll
> > index eef3f07..ac05f5e 100644
> > --- a/test/CodeGen/R600/sext-in-reg.ll
> > +++ b/test/CodeGen/R600/sext-in-reg.ll
> > @@ -9,7 +9,9 @@ declare i32 @llvm.AMDGPU.imax(i32, i32) nounwind readnone
> >  ; SI: V_BFE_I32 [[EXTRACT:v[0-9]+]], [[ARG]], 0, 1
> >  ; SI: BUFFER_STORE_DWORD [[EXTRACT]],
> >  
> > -; EG: BFE_INT
> > +; EG: MEM_{{.*}} STORE_{{.*}} [[RES:T[0-9]+\.[XYZW]]], [[ADDR:T[0-9]+.[XYZW]]]
> > +; EG: BFE_INT [[RES]], {{.*}}, 0.0, 1
> > +; EG-NEXT: LSHR * [[ADDR]]
> >  define void @sext_in_reg_i1_i32(i32 addrspace(1)* %out, i32 %in) {
> >    %shl = shl i32 %in, 31
> >    %sext = ashr i32 %shl, 31
> > @@ -22,7 +24,10 @@ define void @sext_in_reg_i1_i32(i32 addrspace(1)* %out, i32 %in) {
> >  ; SI: V_BFE_I32 [[EXTRACT:v[0-9]+]], [[VAL]], 0, 8
> >  ; SI: BUFFER_STORE_DWORD [[EXTRACT]],
> >  
> > -; EG: BFE_INT
> > +; EG: MEM_{{.*}} STORE_{{.*}} [[RES:T[0-9]+\.[XYZW]]], [[ADDR:T[0-9]+.[XYZW]]]
> > +; EG: ADD_INT
> > +; EG-NEXT: BFE_INT [[RES]], {{.*}}, 0.0, literal
> > +; EG-NEXT: LSHR * [[ADDR]]
> >  define void @sext_in_reg_i8_to_i32(i32 addrspace(1)* %out, i32 %a, i32 %b) nounwind {
> >    %c = add i32 %a, %b ; add to prevent folding into extload
> >    %shl = shl i32 %c, 24
> > @@ -36,7 +41,10 @@ define void @sext_in_reg_i8_to_i32(i32 addrspace(1)* %out, i32 %a, i32 %b) nounw
> >  ; SI: V_BFE_I32 [[EXTRACT:v[0-9]+]], [[VAL]], 0, 16
> >  ; SI: BUFFER_STORE_DWORD [[EXTRACT]],
> >  
> > -; EG: BFE_INT
> > +; EG: MEM_{{.*}} STORE_{{.*}} [[RES:T[0-9]+\.[XYZW]]], [[ADDR:T[0-9]+.[XYZW]]]
> > +; EG: ADD_INT
> > +; EG-NEXT: BFE_INT [[RES]], {{.*}}, 0.0, literal
> > +; EG-NEXT: LSHR * [[ADDR]]
> >  define void @sext_in_reg_i16_to_i32(i32 addrspace(1)* %out, i32 %a, i32 %b) nounwind {
> >    %c = add i32 %a, %b ; add to prevent folding into extload
> >    %shl = shl i32 %c, 16
> > @@ -50,7 +58,10 @@ define void @sext_in_reg_i16_to_i32(i32 addrspace(1)* %out, i32 %a, i32 %b) noun
> >  ; SI: V_BFE_I32 [[EXTRACT:v[0-9]+]], [[VAL]], 0, 8
> >  ; SI: BUFFER_STORE_DWORD [[EXTRACT]],
> >  
> > -; EG: BFE_INT
> > +; EG: MEM_{{.*}} STORE_{{.*}} [[RES:T[0-9]+\.[XYZW]]], [[ADDR:T[0-9]+.[XYZW]]]
> > +; EG: ADD_INT
> > +; EG-NEXT: BFE_INT [[RES]], {{.*}}, 0.0, literal
> > +; EG-NEXT: LSHR * [[ADDR]]
> >  define void @sext_in_reg_i8_to_v1i32(<1 x i32> addrspace(1)* %out, <1 x i32> %a, <1 x i32> %b) nounwind {
> >    %c = add <1 x i32> %a, %b ; add to prevent folding into extload
> >    %shl = shl <1 x i32> %c, <i32 24>
> > @@ -64,8 +75,16 @@ define void @sext_in_reg_i8_to_v1i32(<1 x i32> addrspace(1)* %out, <1 x i32> %a,
> >  ; SI: V_ASHRREV_I32_e32 {{v[0-9]+}}, 31,
> >  ; SI: BUFFER_STORE_DWORD
> >  
> > -; EG: BFE_INT
> > -; EG: ASHR
> > +; EG: MEM_{{.*}} STORE_{{.*}} [[RES_LO:T[0-9]+\.[XYZW]]], [[ADDR_LO:T[0-9]+.[XYZW]]]
> > +; EG: MEM_{{.*}} STORE_{{.*}} [[RES_HI:T[0-9]+\.[XYZW]]], [[ADDR_HI:T[0-9]+.[XYZW]]]
> > +; EG: ADD_INT
> > +; EG-NEXT: BFE_INT {{\*?}} [[RES_LO]], {{.*}}, 0.0, literal
> > +; EG: ASHR [[RES_HI]]
> > +; EG-NOT: BFE_INT
> > +; EG: LSHR
> > +; EG: LSHR
> > +;; TODO Check address computation, using | with variables in {{}} does not work,
> > +;; also the _LO/_HI order might be different
> >  define void @sext_in_reg_i8_to_i64(i64 addrspace(1)* %out, i64 %a, i64 %b) nounwind {
> >    %c = add i64 %a, %b
> >    %shl = shl i64 %c, 56
> > @@ -79,8 +98,16 @@ define void @sext_in_reg_i8_to_i64(i64 addrspace(1)* %out, i64 %a, i64 %b) nounw
> >  ; SI: V_ASHRREV_I32_e32 {{v[0-9]+}}, 31,
> >  ; SI: BUFFER_STORE_DWORD
> >  
> > -; EG: BFE_INT
> > -; EG: ASHR
> > +; EG: MEM_{{.*}} STORE_{{.*}} [[RES_LO:T[0-9]+\.[XYZW]]], [[ADDR_LO:T[0-9]+.[XYZW]]]
> > +; EG: MEM_{{.*}} STORE_{{.*}} [[RES_HI:T[0-9]+\.[XYZW]]], [[ADDR_HI:T[0-9]+.[XYZW]]]
> > +; EG: ADD_INT
> > +; EG-NEXT: BFE_INT {{\*?}} [[RES_LO]], {{.*}}, 0.0, literal
> > +; EG: ASHR [[RES_HI]]
> > +; EG-NOT: BFE_INT
> > +; EG: LSHR
> > +; EG: LSHR
> > +;; TODO Check address computation, using | with variables in {{}} does not work,
> > +;; also the _LO/_HI order might be different
> >  define void @sext_in_reg_i16_to_i64(i64 addrspace(1)* %out, i64 %a, i64 %b) nounwind {
> >    %c = add i64 %a, %b
> >    %shl = shl i64 %c, 48
> > @@ -95,6 +122,17 @@ define void @sext_in_reg_i16_to_i64(i64 addrspace(1)* %out, i64 %a, i64 %b) noun
> >  ; SI: S_ADD_I32 [[ADD:s[0-9]+]],
> >  ; SI: S_ASHR_I32 s{{[0-9]+}}, [[ADD]], 31
> >  ; SI: BUFFER_STORE_DWORDX2
> > +
> > +; EG: MEM_{{.*}} STORE_{{.*}} [[RES_LO:T[0-9]+\.[XYZW]]], [[ADDR_LO:T[0-9]+.[XYZW]]]
> > +; EG: MEM_{{.*}} STORE_{{.*}} [[RES_HI:T[0-9]+\.[XYZW]]], [[ADDR_HI:T[0-9]+.[XYZW]]]
> > +; EG-NOT: BFE_INT
> > +; EG: ADD_INT {{\*?}} [[RES_LO]]
> > +; EG: ASHR [[RES_HI]]
> > +; EG: ADD_INT
> > +; EG: LSHR
> > +; EG: LSHR
> > +;; TODO Check address computation, using | with variables in {{}} does not work,
> > +;; also the _LO/_HI order might be different
> >  define void @sext_in_reg_i32_to_i64(i64 addrspace(1)* %out, i64 %a, i64 %b) nounwind {
> >    %c = add i64 %a, %b
> >    %shl = shl i64 %c, 32
> > @@ -122,7 +160,13 @@ define void @sext_in_reg_i32_to_i64(i64 addrspace(1)* %out, i64 %a, i64 %b) noun
> >  ; SI-NOT: BFE
> >  ; SI: S_LSHL_B32 [[REG:s[0-9]+]], {{s[0-9]+}}, 6
> >  ; SI: S_ASHR_I32 {{s[0-9]+}}, [[REG]], 7
> > +
> > +; EG: MEM_{{.*}} STORE_{{.*}} [[RES:T[0-9]+\.[XYZW]]], [[ADDR:T[0-9]+.[XYZW]]]
> >  ; EG-NOT: BFE
> > +; EG: ADD_INT
> > +; EG: LSHL
> > +; EG: ASHR [[RES]]
> > +; EG: LSHR {{\*?}} [[ADDR]]
> >  define void @sext_in_reg_i1_in_i32_other_amount(i32 addrspace(1)* %out, i32 %a, i32 %b) nounwind {
> >    %c = add i32 %a, %b
> >    %x = shl i32 %c, 6
> > @@ -136,7 +180,15 @@ define void @sext_in_reg_i1_in_i32_other_amount(i32 addrspace(1)* %out, i32 %a,
> >  ; SI: S_ASHR_I32 {{s[0-9]+}}, [[REG0]], 7
> >  ; SI: S_LSHL_B32 [[REG1:s[0-9]+]], {{s[0-9]}}, 6
> >  ; SI: S_ASHR_I32 {{s[0-9]+}}, [[REG1]], 7
> > +
> > +; EG: MEM_{{.*}} STORE_{{.*}} [[RES:T[0-9]+]]{{\.[XYZW][XYZW]}}, [[ADDR:T[0-9]+.[XYZW]]]
> >  ; EG-NOT: BFE
> > +; EG: ADD_INT
> > +; EG: LSHL
> > +; EG: ASHR [[RES]]
> > +; EG: LSHL
> > +; EG: ASHR [[RES]]
> > +; EG: LSHR {{\*?}} [[ADDR]]
> >  define void @sext_in_reg_v2i1_in_v2i32_other_amount(<2 x i32> addrspace(1)* %out, <2 x i32> %a, <2 x i32> %b) nounwind {
> >    %c = add <2 x i32> %a, %b
> >    %x = shl <2 x i32> %c, <i32 6, i32 6>
> > @@ -150,8 +202,11 @@ define void @sext_in_reg_v2i1_in_v2i32_other_amount(<2 x i32> addrspace(1)* %out
> >  ; SI: V_BFE_I32 {{v[0-9]+}}, {{s[0-9]+}}, 0, 1
> >  ; SI: V_BFE_I32 {{v[0-9]+}}, {{s[0-9]+}}, 0, 1
> >  ; SI: BUFFER_STORE_DWORDX2
> > -; EG: BFE
> > -; EG: BFE
> > +
> > +; EG: MEM_{{.*}} STORE_{{.*}} [[RES:T[0-9]+]]{{\.[XYZW][XYZW]}}, [[ADDR:T[0-9]+.[XYZW]]]
> > +; EG: BFE_INT [[RES]]
> > +; EG: BFE_INT [[RES]]
> > +; EG: LSHR {{\*?}} [[ADDR]]
> >  define void @sext_in_reg_v2i1_to_v2i32(<2 x i32> addrspace(1)* %out, <2 x i32> %a, <2 x i32> %b) nounwind {
> >    %c = add <2 x i32> %a, %b ; add to prevent folding into extload
> >    %shl = shl <2 x i32> %c, <i32 31, i32 31>
> > @@ -167,10 +222,12 @@ define void @sext_in_reg_v2i1_to_v2i32(<2 x i32> addrspace(1)* %out, <2 x i32> %
> >  ; SI: V_BFE_I32 {{v[0-9]+}}, {{s[0-9]+}}, 0, 1
> >  ; SI: BUFFER_STORE_DWORDX4
> >  
> > -; EG: BFE
> > -; EG: BFE
> > -; EG: BFE
> > -; EG: BFE
> > +; EG: MEM_{{.*}} STORE_{{.*}} [[RES:T[0-9]+]]{{\.[XYZW][XYZW][XYZW][XYZW]}}, [[ADDR:T[0-9]+.[XYZW]]]
> > +; EG: BFE_INT [[RES]]
> > +; EG: BFE_INT [[RES]]
> > +; EG: BFE_INT [[RES]]
> > +; EG: BFE_INT [[RES]]
> > +; EG: LSHR {{\*?}} [[ADDR]]
> >  define void @sext_in_reg_v4i1_to_v4i32(<4 x i32> addrspace(1)* %out, <4 x i32> %a, <4 x i32> %b) nounwind {
> >    %c = add <4 x i32> %a, %b ; add to prevent folding into extload
> >    %shl = shl <4 x i32> %c, <i32 31, i32 31, i32 31, i32 31>
> > @@ -184,8 +241,10 @@ define void @sext_in_reg_v4i1_to_v4i32(<4 x i32> addrspace(1)* %out, <4 x i32> %
> >  ; SI: V_BFE_I32 {{v[0-9]+}}, {{s[0-9]+}}, 0, 8
> >  ; SI: BUFFER_STORE_DWORDX2
> >  
> > -; EG: BFE
> > -; EG: BFE
> > +; EG: MEM_{{.*}} STORE_{{.*}} [[RES:T[0-9]+]]{{\.[XYZW][XYZW]}}, [[ADDR:T[0-9]+.[XYZW]]]
> > +; EG: BFE_INT [[RES]]
> > +; EG: BFE_INT [[RES]]
> > +; EG: LSHR {{\*?}} [[ADDR]]
> >  define void @sext_in_reg_v2i8_to_v2i32(<2 x i32> addrspace(1)* %out, <2 x i32> %a, <2 x i32> %b) nounwind {
> >    %c = add <2 x i32> %a, %b ; add to prevent folding into extload
> >    %shl = shl <2 x i32> %c, <i32 24, i32 24>
> > @@ -201,10 +260,12 @@ define void @sext_in_reg_v2i8_to_v2i32(<2 x i32> addrspace(1)* %out, <2 x i32> %
> >  ; SI: V_BFE_I32 {{v[0-9]+}}, {{s[0-9]+}}, 0, 8
> >  ; SI: BUFFER_STORE_DWORDX4
> >  
> > -; EG: BFE
> > -; EG: BFE
> > -; EG: BFE
> > -; EG: BFE
> > +; EG: MEM_{{.*}} STORE_{{.*}} [[RES:T[0-9]+]]{{\.[XYZW][XYZW][XYZW][XYZW]}}, [[ADDR:T[0-9]+.[XYZW]]]
> > +; EG: BFE_INT [[RES]]
> > +; EG: BFE_INT [[RES]]
> > +; EG: BFE_INT [[RES]]
> > +; EG: BFE_INT [[RES]]
> > +; EG: LSHR {{\*?}} [[ADDR]]
> >  define void @sext_in_reg_v4i8_to_v4i32(<4 x i32> addrspace(1)* %out, <4 x i32> %a, <4 x i32> %b) nounwind {
> >    %c = add <4 x i32> %a, %b ; add to prevent folding into extload
> >    %shl = shl <4 x i32> %c, <i32 24, i32 24, i32 24, i32 24>
> > @@ -218,8 +279,10 @@ define void @sext_in_reg_v4i8_to_v4i32(<4 x i32> addrspace(1)* %out, <4 x i32> %
> >  ; SI: V_BFE_I32 {{v[0-9]+}}, {{s[0-9]+}}, 0, 8
> >  ; SI: BUFFER_STORE_DWORDX2
> >  
> > -; EG: BFE
> > -; EG: BFE
> > +; EG: MEM_{{.*}} STORE_{{.*}} [[RES:T[0-9]+]]{{\.[XYZW][XYZW]}}, [[ADDR:T[0-9]+.[XYZW]]]
> > +; EG: BFE_INT [[RES]]
> > +; EG: BFE_INT [[RES]]
> > +; EG: LSHR {{\*?}} [[ADDR]]
> >  define void @sext_in_reg_v2i16_to_v2i32(<2 x i32> addrspace(1)* %out, <2 x i32> %a, <2 x i32> %b) nounwind {
> >    %c = add <2 x i32> %a, %b ; add to prevent folding into extload
> >    %shl = shl <2 x i32> %c, <i32 24, i32 24>
> > -- 
> > 1.9.0
> > 
> 
> 
> 
> 
> 

-- 
Jan Vesely <jan.vesely at rutgers.edu>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-R600-Expand-sign-extension-of-vectors-v2.patch
Type: text/x-patch
Size: 5865 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20140411/2f980d16/attachment.bin>