[all-commits] [llvm/llvm-project] 3d17c9: [RISCV] Fix missing vsetvli in transparent block case
Philip Reames via All-commits
all-commits at lists.llvm.org
Mon May 16 17:06:52 PDT 2022
Branch: refs/heads/main
Home: https://github.com/llvm/llvm-project
Commit: 3d17c917099a691e12f05d2502d81b972600a4ae
https://github.com/llvm/llvm-project/commit/3d17c917099a691e12f05d2502d81b972600a4ae
Author: Philip Reames <preames at rivosinc.com>
Date: 2022-05-16 (Mon, 16 May 2022)
Changed paths:
M llvm/lib/Target/RISCV/RISCVInsertVSETVLI.cpp
M llvm/test/CodeGen/RISCV/rvv/vsetvli-insert-crossbb.ll
M llvm/test/CodeGen/RISCV/rvv/vsetvli-insert-crossbb.mir
Log Message:
-----------
[RISCV] Fix missing vsetvli in transparent block case
We've got a lurking problem with our data flow implementation where different phases disagree, resulting in possible miscompiles. D119518 introduced a workaround, but failed to consider blocks which only contain load/stores compatible with their incoming state.
When I went to rebase and simplify D125232, it turned out that not all of the correctness issues had been fixed yet after all. This is the correctness fix accidentally embedded in the original more complicated version.
Note that the test changes here are mostly regressions. It's worth noting that the simplified version of D125232 exactly reverses all the non-functional diffs in the test caused here. D125232 should be the immediate following commit.
Differential Revision: https://reviews.llvm.org/D125703
Commit: 1474880353f12a5a1beb62fc873477a8a2e6afda
https://github.com/llvm/llvm-project/commit/1474880353f12a5a1beb62fc873477a8a2e6afda
Author: Philip Reames <preames at rivosinc.com>
Date: 2022-05-16 (Mon, 16 May 2022)
Changed paths:
M llvm/lib/Target/RISCV/RISCVInsertVSETVLI.cpp
M llvm/test/CodeGen/RISCV/rvv/vsetvli-insert-crossbb.ll
M llvm/test/CodeGen/RISCV/rvv/vsetvli-insert-crossbb.mir
Log Message:
-----------
[RISCV] Use classic dataflow for VSETVLI insertion
Our current implementation of the InsertVSETVLI dataflow allows phase 3 to arrive at a different block end state than the data flow in phase 1/2 computed. This arises because a block which contains instructions (e.g. load or stores) which don't consume all the incoming bits of the VL/VTYPE can be compatible with multiple incoming states. The algorithm effectively changes the SEW on such instructions, and propagates the prior state forward. As phase 3 uses the block input state for this propagation, but phase 1/2 doesn't, this can result in different block end states.
If we don't correct for it, this discrepancy can result in miscompiles. This was the source of multiple recent bugs. However, by now we have fixes for all known correctness issues.
The basic strategy we use is to insert a compensation vsetvli to bring the block state leaving the block back into consistency with the one computed. This is correct, but results in extra vsetvlis being placed at the end of blocks.
This change adjusts the phase 1/2 algorithm to propagate the incoming block state through the block, allowing the compatibility rules to modify the end state. The algorithm may need to run slightly more iterations, but the end result is consistent with what phase 3 does.
The benefit of doing this is two fold.
First, we reverse some of the code quality introductions introduced in the functional fixes.
Second, we simplify the invariants, and allow the strict assertions to be enabled. Several humans, myself included, have found it quite surprising that invariant didn't hold already, and arguably that confusion is the cause of several of our recent miscompiles in this code.
The downside to this patch is that the dataflow may require additional iterations to stabilize. In the worse case, we go from O(Edges) to O(E + UniquePaths) as the incoming state (and thus the outgoing one) can now change once for each path from the entry block.
Differential Revision: https://reviews.llvm.org/D125232
Compare: https://github.com/llvm/llvm-project/compare/f20e6a6e61da...1474880353f1
More information about the All-commits
mailing list