[llvm] r305986 - [AMDGPU] SDWA: add support for GFX9 in peephole pass

Tom Stellard via llvm-commits llvm-commits at lists.llvm.org
Fri Jun 23 12:42:23 PDT 2017


On 06/22/2017 02:26 AM, Sam Kolton via llvm-commits wrote:
> Author: skolton
> Date: Thu Jun 22 01:26:41 2017
> New Revision: 305986
> 
> URL: http://llvm.org/viewvc/llvm-project?rev=305986&view=rev
> Log:
> [AMDGPU] SDWA: add support for GFX9 in peephole pass
> 
> Summary:
> Added support based on merged SDWA pseudo instructions. The peephole pass now allows one scalar operand, as well as the omod and clamp modifiers.
> Added several subtarget features for GFX9 SDWA.
> This diff also contains changes from D34026.
> Depends D34026
> 
> Reviewers: vpykhtin, rampitec, arsenm
> 
> Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye
> 
> Differential Revision: https://reviews.llvm.org/D34241
> 

Hi Sam,

This patch causes an assertion failure on the attached test case.
To reproduce, run:

/llc -march=amdgcn -mcpu=fiji -run-pass=si-peephole-sdwa -o - sdwa-crash.mir
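
If it helps to script the reproduction, here is a minimal sketch in Python. The helper names (build_repro_command, hit_assertion, run_repro) are hypothetical, and it assumes an assertions-enabled llc is on PATH under the name "llc" and that a failed assertion prints the usual "Assertion ... failed" text to stderr. Only the flags themselves come from the command above.

```python
import shutil
import subprocess


def build_repro_command(llc="llc", mir_file="sdwa-crash.mir"):
    """Return the llc invocation from the report as an argument list.

    The default binary name and file name are assumptions; adjust them
    to match the local build tree.
    """
    return [
        llc,
        "-march=amdgcn",
        "-mcpu=fiji",
        "-run-pass=si-peephole-sdwa",
        "-o", "-",
        mir_file,
    ]


def hit_assertion(stderr_text):
    """Heuristic: does stderr look like a failed C assert()?

    Assumes the common "Assertion `expr' failed." wording; other
    platforms may format the message differently.
    """
    return "Assertion" in stderr_text and "failed" in stderr_text


def run_repro(llc="llc", mir_file="sdwa-crash.mir"):
    """Run the reproducer; return None if llc cannot be found."""
    if shutil.which(llc) is None:
        return None
    proc = subprocess.run(
        build_repro_command(llc, mir_file),
        capture_output=True,
        text=True,
    )
    return hit_assertion(proc.stderr)


if __name__ == "__main__":
    print(run_repro())
```

run_repro() returns True when the output looks like an assertion failure, False when it does not, and None when no llc binary is found.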

-Tom

-------------- next part --------------

---
name: crash_v_cvt_u32_f32
tracksRegLiveness: true
registers:
  - { id: 0, class: vgpr_32 }
  - { id: 1, class: vgpr_32 }
  - { id: 2, class: vgpr_32 }
  - { id: 3, class: vgpr_32 }
  - { id: 4, class: vgpr_32 }
  - { id: 5, class: vgpr_32 }
  - { id: 6, class: vgpr_32 }

body: |
  bb.0:
    liveins: %vgpr0, %vgpr1

    %0 = COPY %vgpr0
    %1 = COPY %vgpr1
    %2 = V_CVT_U32_F32_e64 0, %0, 0, 0, implicit %exec
    %3 = V_MOV_B32_e32 65535, implicit %exec
    %4 = V_AND_B32_e64 %1, %3, implicit %exec
    %5 = V_LSHLREV_B32_e64 16, %2, implicit %exec
    %6 = V_OR_B32_e64 %4, %5, implicit %exec

...

