[PATCH] D57504: RFC: Prototype & Roadmap for vector predication in LLVM
Robin Kruppe via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Fri Jan 31 10:27:00 PST 2020
rkruppe added a comment.
I'm not sure what problem you think there might be? Both code sequences do the same thing (same side effects, same final result) as the input IR they matched, right? So that's what justifies them both as valid outputs and the choice is just a matter of codegen quality. You don't even need to appeal to the vp.fadd producing undef in disabled lanes, because in the final result those lanes are zero anyway and that's all that matters. This doesn't seem fundamentally more tricky than any other isel pattern that matches multiple IR instructions to produce a more efficient combined instruction. For example, if the ARM backend selects `add i32 %a, (shl i32 %b, 4)` as `add r0, r0, r1, lsl #4`, it never materializes `shl %b, 4` (not into a register, at least) but the end result is still correct.
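To make the argument concrete, here is a minimal Python sketch. It is a toy model, not LLVM code. The exact code sequences under discussion are not quoted in this comment, so the masked example assumes the input IR is a `vp.fadd` whose disabled lanes are then replaced by zero, and it assumes the two candidate outputs are "unmasked add, then zero the disabled lanes" and "a single zero-masking add". All function names are hypothetical, and the model ignores floating-point exceptions and other side effects. The second half models the ARM example as 32-bit wrapping arithmetic.

```python
# Toy model only: lanes are Python lists, UNDEF stands in for an undefined lane.
UNDEF = object()
MASK32 = 0xFFFFFFFF


def vp_fadd(a, b, mask):
    """Model of a predicated add: disabled lanes hold an unspecified value."""
    return [x + y if m else UNDEF for x, y, m in zip(a, b, mask)]


def reference(a, b, mask):
    """Assumed input pattern: predicated add, then zero in disabled lanes."""
    summed = vp_fadd(a, b, mask)
    return [s if m else 0.0 for s, m in zip(summed, mask)]


def lowering_add_then_zero(a, b, mask):
    """Hypothetical output 1: unmasked add, then zero the disabled lanes."""
    full = [x + y for x, y in zip(a, b)]
    return [s if m else 0.0 for s, m in zip(full, mask)]


def lowering_zero_masking_add(a, b, mask):
    """Hypothetical output 2: one instruction that adds and zeroes at once."""
    return [x + y if m else 0.0 for x, y, m in zip(a, b, mask)]


def shl_then_add(a, b):
    """Unfused: materialize `shl i32 %b, 4`, then add (32-bit wraparound)."""
    shifted = (b << 4) & MASK32
    return (a + shifted) & MASK32


def add_shifted_operand(a, b):
    """Fused, like `add r0, r0, r1, lsl #4`: the shift is never stored."""
    return (a + (b << 4)) & MASK32


if __name__ == "__main__":
    a = [1.0, 2.0, 3.0, 4.0]
    b = [10.0, 20.0, 30.0, 40.0]
    mask = [True, False, True, False]
    # Both lowerings agree with the reference in every lane.
    assert lowering_add_then_zero(a, b, mask) == reference(a, b, mask)
    assert lowering_zero_masking_add(a, b, mask) == reference(a, b, mask)
    # Fused and unfused integer forms agree, including on overflow.
    assert shl_then_add(0xFFFFFFF0, 3) == add_shifted_operand(0xFFFFFFF0, 3)
```

In this model the `UNDEF` placeholder never reaches the final result, which is why the argument does not need to rely on what `vp.fadd` produces in disabled lanes.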
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D57504/new/
https://reviews.llvm.org/D57504