[llvm] r305986 - [AMDGPU] SDWA: add support for GFX9 in peephole pass
Tom Stellard via llvm-commits
llvm-commits at lists.llvm.org
Fri Jun 23 12:42:23 PDT 2017
On 06/22/2017 02:26 AM, Sam Kolton via llvm-commits wrote:
> Author: skolton
> Date: Thu Jun 22 01:26:41 2017
> New Revision: 305986
>
> URL: http://llvm.org/viewvc/llvm-project?rev=305986&view=rev
> Log:
> [AMDGPU] SDWA: add support for GFX9 in peephole pass
>
> Summary:
> Added support based on merged SDWA pseudo instructions. The peephole now allows one scalar operand, plus omod and clamp modifiers.
> Added several subtarget features for GFX9 SDWA.
> This diff also contains changes from D34026.
> Depends D34026
>
> Reviewers: vpykhtin, rampitec, arsenm
>
> Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye
>
> Differential Revision: https://reviews.llvm.org/D34241
>
Hi Sam,
This patch causes an assertion failure in the attached test case.
To reproduce, run:
/llc -march=amdgcn -mcpu=fiji -run-pass=si-peephole-sdwa -o - sdwa-crash.mir
-Tom
-------------- next part --------------
---
name: crash_v_cvt_u32_f32
tracksRegLiveness: true
registers:
  - { id: 0, class: vgpr_32 }
  - { id: 1, class: vgpr_32 }
  - { id: 2, class: vgpr_32 }
  - { id: 3, class: vgpr_32 }
  - { id: 4, class: vgpr_32 }
  - { id: 5, class: vgpr_32 }
  - { id: 6, class: vgpr_32 }
body: |
  bb.0:
    liveins: %vgpr0, %vgpr1
    %0 = COPY %vgpr0
    %1 = COPY %vgpr1
    %2 = V_CVT_U32_F32_e64 0, %0, 0, 0, implicit %exec
    %3 = V_MOV_B32_e32 65535, implicit %exec
    %4 = V_AND_B32_e64 %1, %3, implicit %exec
    %5 = V_LSHLREV_B32_e64 16, %2, implicit %exec
    %6 = V_OR_B32_e64 %4, %5, implicit %exec
...
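
For readers unfamiliar with the instruction sequence in the test, the following is a rough Python sketch of the value that %6 ends up holding. It is only an illustration, not part of the patch or the test: the function name pack_halves is hypothetical, the float-to-unsigned conversion done by V_CVT_U32_F32 is assumed to have already happened (its result is passed in as an integer), and registers are modelled as plain 32-bit integers. The mask-with-0xFFFF and shift-by-16 steps are presumably the kind of 16-bit select pattern the SDWA peephole tries to fold.

```python
MASK32 = 0xFFFFFFFF  # the VGPRs in this test are 32 bits wide


def pack_halves(cvt_result: int, other: int) -> int:
    """Model %6 from the test: low half from %1, high half from %2.

    cvt_result stands in for %2 (assumed to be the already-converted
    unsigned integer); other stands in for %1.
    """
    low = other & 0xFFFF                # %4 = V_AND_B32 %1, %3 (where %3 = 65535)
    high = (cvt_result << 16) & MASK32  # %5 = V_LSHLREV_B32 16, %2
    return low | high                   # %6 = V_OR_B32 %4, %5


if __name__ == "__main__":
    print(hex(pack_halves(0xABCD, 0x12345678)))  # prints 0xabcd5678
```

Note that V_LSHLREV_B32 takes the shift amount as its first source operand, which is why 16 appears before %2 in the MIR.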