[PATCH] D57504: RFC: Prototype & Roadmap for vector predication in LLVM

Cameron McInally via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Fri Jan 31 11:31:39 PST 2020


cameron.mcinally added a comment.

In D57504#1852185 <https://reviews.llvm.org/D57504#1852185>, @rkruppe wrote:

> I'm not sure what problem you think there might be? Both code sequences do the same thing (same side effects, same final result) as the input IR they matched, right?


Ah, right. The side effects are the difference. Thanks for reminding me.

> So that's what justifies them both as valid outputs and the choice is just a matter of codegen quality. You don't even need to appeal to the vp.fadd producing undef in disabled lanes, because in the final result those lanes are zero anyway and that's all that matters. This doesn't seem fundamentally more tricky than any other isel pattern that matches multiple IR instructions to produce a more efficient combined instruction. For example, if the ARM backend selects `add i32 %a, (shl i32 %b, 4)` as `add r0, r0, r1, lsl #4`, it never materializes `shl %b, 4` (not into a register, at least) but the end result is still correct.

Yeah, this was what I was hung up on. I didn't see the difference between something like not materializing a dead instruction and masking an inactive element. But, yeah, the side effects would not be the same.
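
To make the quoted argument concrete, here is a rough lane-wise model in Python. The two code sequences under discussion are not shown in this message, so this assumes the matched IR is a `vp.fadd` whose disabled lanes are then zeroed by a `select`. The helper names (`vp_fadd`, `select`, `lower_separately`, `lower_fused`) are hypothetical and are not LLVM APIs. It only models result values, not side effects such as floating-point exceptions.

```python
# Hypothetical lane-wise model; none of these helpers are LLVM APIs.
UNDEF = object()  # stands in for the unspecified value of a disabled lane


def vp_fadd(a, b, mask):
    """Model of a predicated add: disabled lanes yield an unspecified value."""
    return [x + y if m else UNDEF for x, y, m in zip(a, b, mask)]


def select(mask, on_true, on_false):
    """Per-lane select: take on_true where the mask is set, else on_false."""
    return [t if m else f for m, t, f in zip(mask, on_true, on_false)]


def lower_separately(a, b, mask):
    """Assumed sequence 1: materialize the predicated add, then zero disabled lanes."""
    zeros = [0.0] * len(a)
    return select(mask, vp_fadd(a, b, mask), zeros)


def lower_fused(a, b, mask):
    """Assumed sequence 2: one zero-masking add; the intermediate never exists."""
    return [x + y if m else 0.0 for x, y, m in zip(a, b, mask)]


a = [1.0, 2.0, 3.0, 4.0]
b = [10.0, 20.0, 30.0, 40.0]
mask = [True, False, True, False]

separate = lower_separately(a, b, mask)
fused = lower_fused(a, b, mask)
```

In this model both lowerings agree on every lane, and the `UNDEF` placeholder never reaches the final result, which matches the point that the disabled-lane value of `vp.fadd` does not need to be appealed to.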


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D57504/new/

https://reviews.llvm.org/D57504




