[llvm] [Reassociate] Distribute multiply over add to enable factorization (PR #178201)
Yunbo Ni via llvm-commits
llvm-commits at lists.llvm.org
Thu Mar 5 23:15:37 PST 2026
cardigan1008 wrote:
Hi @hazarathayya , there is a related miscompilation case:
```llvm
define i16 @test_nsw_flag_propagation(i16 %x, i16 %y) {
%add = add nsw i16 %x, %y
%mul1 = mul nsw i16 %x, 10
%mul2 = mul nsw i16 %add, 20
%add2 = add i16 %mul1, %mul2
ret i16 %add2
}
```
With opt built on this patch, it's transformed into:
```llvm
; ModuleID = 'src.ll'
source_filename = "src.ll"
define i16 @test_nsw_flag_propagation(i16 %x, i16 %y) {
%1 = mul nsw i16 %y, 20
%reass.mul = mul i16 %x, 30
%add2 = add i16 %1, %reass.mul
ret i16 %add2
}
```
Alive2 proof: https://alive2.llvm.org/ce/z/Xv8pBf
> Note: This is a review assisted with a self-built agent. The reproducer was validated manually. Please let me know if anything is wrong.
**Bug Triggering Analysis:**
This case rewrites the expression into `y * 20 + x * 30`, while preserving `nsw` on the newly created `y * 20`.
That is not valid: `nsw` on `(x + y) * 20` does not imply `nsw` on `y * 20`. For example, with `x = 15` and `y = -1653`, we have `(x + y) * 20 = -32760`, which is within the `i16` range, but `y * 20 = -33060`, which signed-overflows. The new `mul nsw` therefore becomes poison, making the target more poisonous than the source.
**Fix Weakness Analysis:**
The weakness in the fix is the assumption that flags can be safely copied during algebraic distribution. The patch includes the following code:
```cpp
if (auto *MulInst = dyn_cast<BinaryOperator>(AC))
MulInst->copyIRFlags(I); // Copy nsw/nuw from original mul
if (auto *MulInst = dyn_cast<BinaryOperator>(BC))
MulInst->copyIRFlags(I);
Value *NewAdd = Builder.CreateAdd(AC, BC);
if (auto *AddInst = dyn_cast<BinaryOperator>(NewAdd))
AddInst->copyIRFlags(Add); // Copy nsw/nuw from original add
```
This logic blindly copies the `nsw` and `nuw` flags without verifying if the intermediate distributed operations are guaranteed not to wrap. As demonstrated by the test case, the intermediate products (`A * C` or `B * C`) can overflow even if the original factored expression `(A + B) * C` does not. By copying the `nsw` flag, the pass introduces undefined behavior (poison) for inputs that were previously valid. The fix should either drop the flags or perform a rigorous analysis to prove that the distributed operations cannot wrap before copying the flags.
Additionally, the patch introduces a secondary bug by using `dbgs() << "DISTRIBUTING: ..."` without wrapping it in `LLVM_DEBUG(...)`. This causes the pass to unconditionally print debug information to standard error during normal compilation, which breaks tools and tests that expect clean output.
https://github.com/llvm/llvm-project/pull/178201
More information about the llvm-commits
mailing list