[PATCH] D55570: [AMDGPU] Improve SDWA generation for V_OR_B32_E32.
Ron Lieberman via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Tue Dec 11 12:33:25 PST 2018
ronlieb created this revision.
ronlieb added reviewers: rampitec, arsenm, AMDGPU.
Herald added subscribers: llvm-commits, t-tye, tpr, dstuttard, yaxunl, nhaehnle, wdng, jvesely, kzhuravl.
Add the missing patterns for V_OR_B32_SDWA:
WORD_1, BYTE_3, BYTE_2, and BYTE_1.
Previously, only WORD_0 and BYTE_0 were recognized.
Transform:
  %13:vgpr_32 = GLOBAL_LOAD_DWORD %2, 0, 0, 0, implicit $exec :: (volatile load 4, addrspace 1)
  %14:sreg_32_xm0 = S_MOV_B32 65280
  %15:vgpr_32 = V_AND_B32_e64 %13, killed %14, implicit $exec
  %16:vgpr_32 = V_OR_B32_e64 killed %15, killed %13, implicit $exec
Into:
  %6:vgpr_32 = GLOBAL_LOAD_DWORD %1, 0, 0, 0, implicit $exec :: (volatile load 4, addrspace 1)
  %9:vgpr_32 = V_OR_B32_sdwa 0, %6, 0, killed %6, 0, 6, 0, 1, 6, implicit $exec
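To make the relationship between the AND mask and the SDWA selector concrete: the mask 65280 in the example is 0xFF00, which isolates byte 1 of the dword. The sketch below is not the code from SIPeepholeSDWA.cpp. It is a hypothetical helper (the names SdwaSel and maskToSdwaSel are invented here) that shows one way a constant mask could be classified into the lanes named above. The numeric selector values are an assumption that follows the encoding commonly used by the AMDGPU backend (BYTE_0 = 0 through DWORD = 6). Under that reading, the `1` near the end of the V_OR_B32_sdwa operand list would correspond to a BYTE_1 source select.

```cpp
#include <cstdint>

// Hypothetical mirror of the SDWA selector encoding. The numeric values are
// an assumption for illustration (BYTE_0 = 0 ... DWORD = 6).
enum class SdwaSel : unsigned {
  BYTE_0 = 0,
  BYTE_1 = 1,
  BYTE_2 = 2,
  BYTE_3 = 3,
  WORD_0 = 4,
  WORD_1 = 5,
  DWORD = 6
};

// Classify a constant AND mask as the sub-dword lane it isolates.
// Returns true and sets Sel when the mask covers exactly one byte or one
// 16-bit word; returns false for any other mask and leaves Sel untouched.
bool maskToSdwaSel(uint32_t Mask, SdwaSel &Sel) {
  switch (Mask) {
  case 0x000000FFu: Sel = SdwaSel::BYTE_0; return true;
  case 0x0000FF00u: Sel = SdwaSel::BYTE_1; return true; // 65280 in the example
  case 0x00FF0000u: Sel = SdwaSel::BYTE_2; return true;
  case 0xFF000000u: Sel = SdwaSel::BYTE_3; return true;
  case 0x0000FFFFu: Sel = SdwaSel::WORD_0; return true;
  case 0xFFFF0000u: Sel = SdwaSel::WORD_1; return true;
  default:          return false;
  }
}
```

With this helper, a mask of 65280 classifies as BYTE_1, while a mask that straddles lanes (for example 0x00FFFF00) is rejected, so a pass built on it would leave such an AND untouched.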
A subsequent set of patches will address the XOR and AND pattern improvements.
Repository:
rL LLVM
https://reviews.llvm.org/D55570
Files:
lib/Target/AMDGPU/SIPeepholeSDWA.cpp
test/CodeGen/AMDGPU/add.v2i16.ll
test/CodeGen/AMDGPU/insert_vector_elt.v2i16.ll
test/CodeGen/AMDGPU/load-lo16.ll
test/CodeGen/AMDGPU/sdwa-andops.mir
test/CodeGen/AMDGPU/sdwa-ors.mir
test/CodeGen/AMDGPU/sdwa-xors-ands-ors.ll
test/CodeGen/AMDGPU/sub.v2i16.ll
-------------- next part --------------
A non-text attachment was scrubbed...
Name: D55570.177757.patch
Type: text/x-patch
Size: 22301 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20181211/96ab5041/attachment-0001.bin>