[PATCH] D55570: [AMDGPU] Improve SDWA generation for V_OR_B32_E32.

Ron Lieberman via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Dec 11 12:33:25 PST 2018


ronlieb created this revision.
ronlieb added reviewers: rampitec, arsenm, AMDGPU.
Herald added subscribers: llvm-commits, t-tye, tpr, dstuttard, yaxunl, nhaehnle, wdng, jvesely, kzhuravl.

Add missing patterns for V_OR_B32_SDWA:

  WORD_1, BYTE_3, BYTE_2, BYTE_1

Previously we only recognized WORD_0 and BYTE_0.
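
Each selector name above corresponds to a byte or word field of a 32-bit register, and the AND mask in the matched sequence covers the same bits. The C++ sketch below is not taken from the patch: `selForMask`, `isNewlyHandled` and `checkDescriptionExample` are hypothetical helpers, and the numeric selector values are assumed to follow the backend's SdwaSel encoding. It only classifies masks; it does not model how the pass accounts for the bit position of the selected field.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Assumed encoding of the SDWA selectors (BYTE_0..BYTE_3, WORD_0, WORD_1, DWORD).
enum class SdwaSel : unsigned {
  BYTE_0 = 0, BYTE_1 = 1, BYTE_2 = 2, BYTE_3 = 3,
  WORD_0 = 4, WORD_1 = 5, DWORD = 6
};

// Hypothetical helper: return the selector whose bit range equals Mask,
// or std::nullopt if Mask is not exactly one byte or word field.
inline std::optional<SdwaSel> selForMask(uint32_t Mask) {
  switch (Mask) {
  case 0x000000FFu: return SdwaSel::BYTE_0;
  case 0x0000FF00u: return SdwaSel::BYTE_1;
  case 0x00FF0000u: return SdwaSel::BYTE_2;
  case 0xFF000000u: return SdwaSel::BYTE_3;
  case 0x0000FFFFu: return SdwaSel::WORD_0;
  case 0xFFFF0000u: return SdwaSel::WORD_1;
  default:          return std::nullopt;
  }
}

// Hypothetical helper: true for the selectors this description lists as
// newly recognized (WORD_1, BYTE_3, BYTE_2, BYTE_1).
inline bool isNewlyHandled(SdwaSel Sel) {
  return Sel == SdwaSel::WORD_1 || Sel == SdwaSel::BYTE_3 ||
         Sel == SdwaSel::BYTE_2 || Sel == SdwaSel::BYTE_1;
}

// The S_MOV_B32 constant 65280 is 0xFF00, which covers the BYTE_1 field.
inline void checkDescriptionExample() {
  assert(selForMask(65280u) == SdwaSel::BYTE_1);
}
```

Under these assumptions, the masks 0xFF and 0xFFFF classify as the two selectors that were already recognized, and the other four masks classify as the newly added ones.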

Transform:

  %13:vgpr_32 = GLOBAL_LOAD_DWORD %2, 0, 0, 0, implicit $exec ::
                (volatile load 4, addrspace 1)
  %14:sreg_32_xm0 = S_MOV_B32 65280
  %15:vgpr_32 = V_AND_B32_e64 %13, killed %14, implicit $exec
  %16:vgpr_32 = V_OR_B32_e64 killed %15, killed %13, implicit $exec

into:

  %6:vgpr_32 = GLOBAL_LOAD_DWORD %1, 0, 0, 0, implicit $exec ::
               (volatile load 4, addrspace 1)
  %9:vgpr_32 = V_OR_B32_sdwa 0, %6, 0, killed %6, 0, 6, 0, 1, 6, implicit $exec
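
The immediate operands of the resulting V_OR_B32_sdwa are easier to read once named. The C++ sketch below assumes the operand order src0_modifiers, src0, src1_modifiers, src1, clamp, dst_sel, dst_unused, src0_sel, src1_sel, and assumes the usual selector and dst_unused encodings; neither is stated in this message. `SdwaImms`, `decodeImms`, `selName` and `dstUnusedName` are hypothetical names used only for illustration.

```cpp
#include <array>
#include <cassert>
#include <string>

// The seven immediates of the instruction, with the two register operands
// left out. Field order is an assumption, see the note above.
struct SdwaImms {
  unsigned Src0Mods, Src1Mods, Clamp, DstSel, DstUnused, Src0Sel, Src1Sel;
};

// Hypothetical helper: name the immediates in their assumed order.
inline SdwaImms decodeImms(const std::array<unsigned, 7> &Imm) {
  return SdwaImms{Imm[0], Imm[1], Imm[2], Imm[3], Imm[4], Imm[5], Imm[6]};
}

// Assumed selector encoding: 0..3 are bytes, 4..5 are words, 6 is the dword.
inline std::string selName(unsigned Sel) {
  static const char *const Names[] = {"BYTE_0", "BYTE_1", "BYTE_2", "BYTE_3",
                                      "WORD_0", "WORD_1", "DWORD"};
  return Sel < 7 ? Names[Sel] : "INVALID";
}

// Assumed dst_unused encoding: 0 pad, 1 sign-extend, 2 preserve.
inline std::string dstUnusedName(unsigned Val) {
  static const char *const Names[] = {"UNUSED_PAD", "UNUSED_SEXT",
                                      "UNUSED_PRESERVE"};
  return Val < 3 ? Names[Val] : "INVALID";
}
```

Read this way, the immediates 0, 0, 0, 6, 0, 1, 6 from the instruction above would mean: no source modifiers, no clamp, a full DWORD destination with UNUSED_PAD, BYTE_1 selected for src0, and the full DWORD for src1.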

A subsequent set of patches will address the XOR and AND pattern improvements.


Repository:
  rL LLVM

https://reviews.llvm.org/D55570

Files:
  lib/Target/AMDGPU/SIPeepholeSDWA.cpp
  test/CodeGen/AMDGPU/add.v2i16.ll
  test/CodeGen/AMDGPU/insert_vector_elt.v2i16.ll
  test/CodeGen/AMDGPU/load-lo16.ll
  test/CodeGen/AMDGPU/sdwa-andops.mir
  test/CodeGen/AMDGPU/sdwa-ors.mir
  test/CodeGen/AMDGPU/sdwa-xors-ands-ors.ll
  test/CodeGen/AMDGPU/sub.v2i16.ll

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D55570.177757.patch
Type: text/x-patch
Size: 22301 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20181211/96ab5041/attachment-0001.bin>