[PATCH] [AArch64] Support ISD::SIGN_EXTEND_INREG

Jiangning Liu liujiangning1 at gmail.com
Tue Jan 7 02:38:31 PST 2014


Ana,

I see your point now.

Actually, with my patch sign_extend_inreg(v8i16, v8i8) can generate
SXTL (8b->8h), which is printed as its alias form "sshll #0", as shown by
my test case below:

define <8 x i8> @test_sext_inreg_v8i8i16(<8 x i8> %v1, <8 x i8> %v2)
nounwind readnone {
; CHECK-LABEL: test_sext_inreg_v8i8i16
; CHECK: sshll   v0.8h, v0.8b, #0
; CHECK: sshll   v1.8h, v1.8b, #0
  %1 = sext <8 x i8> %v1 to <8 x i16>
  %2 = sext <8 x i8> %v2 to <8 x i16>
  %3 = shufflevector <8 x i16> %1, <8 x i16> %2, <8 x i32> <i32 0, i32 2,
i32 4, i32 6, i32 8, i32 10, i32 12, i32 14>
  %4 = trunc <8 x i16> %3 to <8 x i8>
  ret <8 x i8> %4
}

And sign_extend_inreg(v2i64, v2i32) doesn't exist, because we always use
sign_extend(v2i64, v2i32) to handle it, as shown by the test case below:

define <2 x i32> @test_sext_inreg_v2i32i64(<2 x i32> %v1, <2 x i32> %v2)
nounwind readnone {
; CHECK-LABEL: test_sext_inreg_v2i32i64
; CHECK: sshll v0.2d, v0.2s, #0
; CHECK: sshll v1.2d, v1.2s, #0
  %1 = sext <2 x i32> %v1 to <2 x i64>
  %2 = sext <2 x i32> %v2 to <2 x i64>
  %3 = shufflevector <2 x i64> %1, <2 x i64> %2, <2 x i32> <i32 0, i32 2>
  %4 = trunc <2 x i64> %3 to <2 x i32>
  ret <2 x i32> %4
}

However, yes, sign_extend_inreg(v2i32, v2i16) would be an issue, so I
modified my patch as attached and changed the test test_sext_inreg_v2i16i32
to use the sshll instruction, as below:

define <2 x i16> @test_sext_inreg_v2i16i32(<2 x i16> %v1, <2 x i16> %v2)
nounwind readnone {
; CHECK-LABEL: test_sext_inreg_v2i16i32
; CHECK: sshll   v0.4s, v0.4h, #0
; CHECK: sshll   v1.4s, v1.4h, #0
  %1 = sext <2 x i16> %v1 to <2 x i32>
  %2 = sext <2 x i16> %v2 to <2 x i32>
  %3 = shufflevector <2 x i32> %1, <2 x i32> %2, <2 x i32> <i32 0, i32 2>
  %4 = trunc <2 x i32> %3 to <2 x i16>
  ret <2 x i16> %4
}

The solution is to add a combine that captures this special shl/sshr pair.
Do we have more missing cases?
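
As a rough sanity check of why the shl/sshr pair and the SXTL route give the
same result for sign_extend_inreg(v2i32, v2i16), here is a small standalone
C++ model. It is only a sketch and not part of the patch: the type aliases and
function names are made up for illustration, and it assumes little-endian
lane numbering (half 2*i is the low 16 bits of 32-bit lane i).

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical names, for illustration only; none of this is from the patch.
using V2S = std::array<std::uint32_t, 2>;  // models a .2s register
using V4H = std::array<std::uint16_t, 4>;  // the same 64 bits viewed as .4h
using V4S = std::array<std::uint32_t, 4>;  // models a .4s register

// Sign-extend one 16-bit value to a 32-bit bit pattern.
std::uint32_t sext16(std::uint16_t h) {
    return (h & 0x8000u) ? (0xFFFF0000u | h) : std::uint32_t{h};
}

// Route 1: shl #16 followed by sshr #16 on each 32-bit lane.
V2S sext_inreg_shift_pair(const V2S& v) {
    V2S out{};
    for (std::size_t i = 0; i < v.size(); ++i) {
        std::uint32_t shifted = v[i] << 16;  // shl #16
        // Arithmetic shift right, written out on unsigned bit patterns.
        std::uint32_t fill = (shifted & 0x80000000u) ? 0xFFFF0000u : 0u;
        out[i] = (shifted >> 16) | fill;     // sshr #16
    }
    return out;
}

// Route 2: reinterpret as .4h, widen with SXTL to .4s, keep lanes 0 and 2.
V2S sext_inreg_via_sxtl(const V2S& v) {
    V4H halves{};
    for (std::size_t i = 0; i < v.size(); ++i) {
        halves[2 * i] = static_cast<std::uint16_t>(v[i] & 0xFFFFu);
        halves[2 * i + 1] = static_cast<std::uint16_t>(v[i] >> 16);
    }
    V4S wide{};
    for (std::size_t i = 0; i < wide.size(); ++i) {
        wide[i] = sext16(halves[i]);         // sxtl v.4s, v.4h
    }
    return V2S{{wide[0], wide[2]}};          // uzp1-style pick of even lanes
}

// Asserts that both routes produce the same two lanes for a given input.
void check_routes_agree(const V2S& v) {
    assert(sext_inreg_shift_pair(v) == sext_inreg_via_sxtl(v));
}
```

Calling check_routes_agree on any input exercises both routes. The upper 16
bits of each input lane are treated as garbage, which matches the meaning of
sign_extend_inreg.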

Thanks,
-Jiangning



2014/1/7 Ana Pazos <apazos at codeaurora.org>

> Hi Jiangning,
>
>
>
> The test cases where I see failures are
>
> sign_extend_inreg(v2i32, v2i16) and
>
> sign_extend_inreg(v4i16, v4i8)     - sorry, I had a typo: I wrote v8i8 but
> meant v4i8, which confused you.
>
>
>
> So it seems your patch addresses both cases I was concerned about.
>
>
>
> But for such cases I think the SXTL instruction could be used instead of
> the combo shift right + shift left.
>
>
>
> For example sign_extend_inreg(v2i32, v2i16):
>
> -        Inputs are 16-bit values in a 2S register
>
> -        Reinterpret the register as a 4H register
>
> -        SXTL (4S <- 4H)
>
> -        Ins/uzp1 (to extract the vector indexes 0 and 2 that we need into
> a 2S register)
>
>
>
> The same can be done for sign_extend_inreg(v8i16, v8i8) and
> sign_extend_inreg(v2i64, v2i32).
>
>
>
> I think in some cases the extraction of vector indexes we are interested
> in will be a no-op and an instruction will be saved.
>
>
>
> I am just suggesting that we use a hardware instruction that does the sign
> extension for those vector types it supports.
>
>
>
> Do you agree?
>
>
>
> Thanks,
>
> Ana.
>
>
>
> *From:* Jiangning Liu [mailto:liujiangning1 at gmail.com]
> *Sent:* Sunday, January 05, 2014 10:44 PM
> *To:* Ana Pazos
> *Cc:* llvm-commits at cs.uiuc.edu for LLVM; mcrosier at codeaurora.org
> *Subject:* Re: [PATCH] [AArch64] Support ISD::SIGN_EXTEND_INREG
>
>
>
> Hi Ana,
>
> Sorry, I don't quite understand what you said. Do you have a small test to
> articulate what you mentioned?
>
> For sign_extend_inreg(v2i32, v2i16), my test case below should show that
> my patch works:
>
> define <2 x i16> @test_sext_inreg_v2i16i32(<2 x i16> %v1, <2 x i16> %v2)
> nounwind readnone {
> ; CHECK-LABEL: test_sext_inreg_v2i16i32
> ; CHECK: shl     v0.2s, v0.2s, #16
> ; CHECK: sshr    v0.2s, v0.2s, #16
> ; CHECK: shl     v1.2s, v1.2s, #16
> ; CHECK: sshr    v1.2s, v1.2s, #16
>   %1 = sext <2 x i16> %v1 to <2 x i32>
>   %2 = sext <2 x i16> %v2 to <2 x i32>
>   %3 = shufflevector <2 x i32> %1, <2 x i32> %2, <2 x i32> <i32 0, i32 2>
>   %4 = trunc <2 x i32> %3 to <2 x i16>
>   ret <2 x i16> %4
> }
>
> For sign_extend_inreg(v4i16, v8i8), is this valid? I thought it should
> be sign_extend_inreg(v8i16, v8i8). If that is the case, my test below
> should also show that my patch works:
>
> define <8 x i8> @test_sext_inreg_v8i8i16(<8 x i8> %v1, <8 x i8> %v2)
> nounwind readnone {
> ; CHECK-LABEL: test_sext_inreg_v8i8i16
> ; CHECK: sshll   v0.8h, v0.8b, #0
> ; CHECK: sshll   v1.8h, v1.8b, #0
>   %1 = sext <8 x i8> %v1 to <8 x i16>
>   %2 = sext <8 x i8> %v2 to <8 x i16>
>   %3 = shufflevector <8 x i16> %1, <8 x i16> %2, <8 x i32> <i32 0, i32 2,
> i32 4, i32 6, i32 8, i32 10, i32 12, i32 14>
>   %4 = trunc <8 x i16> %3 to <8 x i8>
>   ret <8 x i8> %4
> }
>
> Thanks,
> -Jiangning
>



-- 
Thanks,
-Jiangning
-------------- next part --------------
A non-text attachment was scrubbed...
Name: sext_inreg_llvm_2.patch
Type: application/octet-stream
Size: 9941 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20140107/94acf74b/attachment.obj>

