[PATCH] D17614: AMDGPU/SI: Implement DS_PERMUTE/DS_BPERMUTE Instruction Definitions and Intrinsics.
Matt Arsenault via llvm-commits
llvm-commits at lists.llvm.org
Thu Feb 25 11:13:14 PST 2016
arsenm added inline comments.
================
Comment at: lib/Target/AMDGPU/SIInstrInfo.cpp:230-231
@@ -229,1 +229,4 @@
+ // DS_PERMUTE has no offset0 and offset1.
+ if (!Offset0Imm || !Offset1Imm)
+ return false;
----------------
Checking just offset0 should be sufficient, and can be moved above
================
Comment at: lib/Target/AMDGPU/VIInstructions.td:131
@@ +130,3 @@
+
+let Uses = [EXEC] in {
+defm DS_PERMUTE_B32 : DS_1A1D_PERMUTE < 0x3e, "ds_permute_b32", VGPR_32>;
----------------
Are we sure these don't real M0?
================
Comment at: test/CodeGen/AMDGPU/llvm.amdgcn.ds.permute.ll:3-4
@@ +2,4 @@
+
+declare i32 @llvm.amdgcn.ds.permute(i32, i32) convergent
+declare i32 @llvm.amdgcn.ds.bpermute(i32, i32) convergent
+
----------------
I would prefer splitting the 2 separate intrinsics into separate patches. These are also missing the readnone (which shoulda also use attribute groups)
http://reviews.llvm.org/D17614
More information about the llvm-commits
mailing list