[PATCH] D55444: AMDGPU: Fix DPP combiner
Connor Abbott via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Mon Jan 14 10:27:47 PST 2019
cwabbott added inline comments.
================
Comment at: lib/Target/AMDGPU/GCNDPPCombine.cpp:369
+
+ if (BoundCtrlZero && MaskAllLanes) { // [1]
+ OldOpndVGPR.Reg = AMDGPU::NoRegister;
----------------
I don't think this is correct. A lane can be set to 0 for the shift DPP controls (e.g. `DPP_ROW_SL1`) or if EXEC is zero for the lane to be read, in which case the user will do something non-trivial (e.g. OR will pass through the other operand) which we can't emulate in general.
================
Comment at: lib/Target/AMDGPU/GCNDPPCombine.cpp:385
+ assert(!BoundCtrlZero); // by check [1]
+ // undefine old to make regalloc's life easier but this requires
+ // bound_ctrl:0 set
----------------
Indentation is off here, seems like your editor is using hard tabs
================
Comment at: lib/Target/AMDGPU/GCNDPPCombine.cpp:391
+ } else {
+ // bound_ctrl value ins't important here but this makes the combiner's
+ // logic easier - we check the VALU op for identity value when
----------------
Also here
================
Comment at: lib/Target/AMDGPU/GCNDPPCombine.cpp:412
+ }
+ // case when DPP mov old == DPP scr register
+ OldOpndVGPR.Reg = AMDGPU::NoRegister;
----------------
This seems incorrect to me. In the lanes where the source is invalid or the row/bank mask is 0, the DPP move will act as a no-op, here so if that lane is then added, or'd, etc. with something else, we can't emulate that with a single instruction.
Repository:
rL LLVM
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D55444/new/
https://reviews.llvm.org/D55444
More information about the llvm-commits
mailing list