<table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Issue</th>
<td>
<a href=https://github.com/llvm/llvm-project/issues/146957>146957</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>
Rust compilation error on some targets
</td>
</tr>
<tr>
<th>Labels</th>
<td>
new issue
</td>
</tr>
<tr>
<th>Assignees</th>
<td>
</td>
</tr>
<tr>
<th>Reporter</th>
<td>
SyxtonPrime
</td>
</tr>
</table>
<pre>
Cross post from: https://github.com/rust-lang/rust/issues/143399
I'm not sure whether this is a Rust or an LLVM issue, but I think there is a decent chance it is an LLVM issue, so I wanted to post it here too.
Recently I have been running into some strange errors which seem to occur only on some architectures. See
https://github.com/Plonky3/Plonky3/issues/729
https://github.com/Plonky3/Plonky3/issues/905
for more info about the situations where these arose. I managed to make a minimal example in Godbolt, so I thought it made sense to post it here.
### Code
Probably easier to understand in the following Godbolt link: https://godbolt.org/z/rrrj3eobd
Basically, we define a simple function which adds integers mod a prime `P`, and then use it on vectors of length `4`. If you check `(x + y) + z = x + (y + z)` manually in `main`, everything works as expected and the two sides agree. If, however, we define a function which computes `(x + y) + z` and `x + (y + z)` and then checks that they are equal, the check fails. I agree this seems totally bizarre; please check out the Godbolt.
```rust
use std::array;
/// The Baby Bear prime: 2^31 - 2^27 + 1.
const P: u32 = 2013265921;
// To help read the assembly, note: 2^32 - P = 2281701375
/// Addition modulo P.
///
/// Inputs are assumed to be < P.
/// Assuming this, outputs will also be < P.
#[inline(always)]
fn add(lhs: u32, rhs: u32) -> u32 {
let mut sum = lhs + rhs; // Never overflows as inputs are < P < 2^31.
let (corr_sum, over) = sum.overflowing_sub(P);
// over is false if sum >= P and true if sum < P.
if !over {
sum = corr_sum;
}
sum
}
/// Addition modulo P in a degree 4 extension.
///
/// Identical to 4 additions in parallel.
#[unsafe(no_mangle)]
pub fn add_bb_deg_4_ext(lhs: [u32; 4], rhs: [u32; 4]) -> [u32; 4] {
array::from_fn(|i| add(lhs[i], rhs[i]))
}
/// Compute (lhs + mid) + rhs
#[unsafe(no_mangle)]
pub fn add_bb_assoc_l(lhs: [u32; 4], mid: [u32; 4], rhs: [u32; 4]) -> [u32; 4] {
let lhs = add_bb_deg_4_ext(lhs, mid);
add_bb_deg_4_ext(lhs, rhs)
}
/// Compute lhs + (mid + rhs)
#[unsafe(no_mangle)]
pub fn add_bb_assoc_r(lhs: [u32; 4], mid: [u32; 4], rhs: [u32; 4]) -> [u32; 4] {
let rhs = add_bb_deg_4_ext(mid, rhs);
add_bb_deg_4_ext(lhs, rhs)
}
/// Check that (lhs + mid) + rhs = lhs + (mid + rhs)
#[unsafe(no_mangle)]
pub fn check_assoc(lhs: [u32; 4], mid: [u32; 4], rhs: [u32; 4]) {
let assoc_l = add_bb_assoc_l(lhs, mid, rhs);
let assoc_r = add_bb_assoc_r(lhs, mid, rhs);
assert_eq!(assoc_l, assoc_r);
}
fn main() {
let lhs = [252551971, 694974649, 213757600, 1325013984];
let mid = [506156623, 97664653, 1234719014, 1349792299];
let rhs = [1626423134, 1338438783, 786682629, 1311208151];
// Check all elements are < P
assert!(lhs.iter().all(|&x| x < P));
assert!(mid.iter().all(|&x| x < P));
assert!(rhs.iter().all(|&x| x < P));
// Let's manually check that
// (lhs + mid) + rhs = lhs + (mid + rhs)
let assoc_l = add_bb_assoc_l(lhs, mid, rhs);
let assoc_r = add_bb_assoc_r(lhs, mid, rhs);
// Check that (x + y) + z = x + (y + z)
println!("{assoc_l:?}");
println!("{assoc_r:?}");
assert_eq!(assoc_l, assoc_r);
println!("The two assoc's are equal");
// Everything up until here runs.
// Now let's check the same thing using our
// function check_assoc
check_assoc(lhs, mid, rhs);
// This fails with the compiler flags:
// -O -C target-cpu=znver4 -C opt-level=3
// among others.
}
```
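To help tell which side of the comparison is wrong, it may be useful to have an independent reference that does not go through `add` at all. The following is only a sketch: `add3_ref` and `add3_ref_vec` are hypothetical helper names that are not part of the reproducer above, and they assume (as the reproducer does) that all inputs are `< P`.

```rust
/// The Baby Bear prime, same value as in the reproducer: 2^31 - 2^27 + 1.
const P: u32 = 2013265921;

/// Hypothetical reference helper (not in the original reproducer):
/// computes (a + b + c) mod P in u64, so no intermediate sum can wrap.
fn add3_ref(a: u32, b: u32, c: u32) -> u32 {
    ((a as u64 + b as u64 + c as u64) % P as u64) as u32
}

/// Lane-wise reference for three vectors of length 4.
/// Both (lhs + mid) + rhs and lhs + (mid + rhs) should equal this.
fn add3_ref_vec(lhs: [u32; 4], mid: [u32; 4], rhs: [u32; 4]) -> [u32; 4] {
    std::array::from_fn(|i| add3_ref(lhs[i], mid[i], rhs[i]))
}
```

Comparing the output of `add_bb_assoc_l` and `add_bb_assoc_r` against `add3_ref_vec(lhs, mid, rhs)` inside `check_assoc` should show which lanes, if any, differ from the expected sum. If pasted into the same file as the reproducer, the duplicate `const P` needs to be dropped.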
### Meta
The error occurs when compiling with the current stable rustc 1.88.0, as well as beta and nightly. It seems to have been introduced in the move to `LLVM 20`, as these errors do not appear on `nightly-2025-02-17` but things begin failing on `nightly-2025-02-18`.
The error occurs when using the compiler flags: `-O -C target-cpu=znver4 -C opt-level=3`.
It does not occur with `opt-level=0, 1, 2`. It does, however, occur with some other target CPUs, in particular `mic_avx512` and `znver5`, but not with others such as `skylake`, `skylake_avx512`, `alderlake`, `raptorlake`. (If you want a complete list, I can go through and check them all.)
### Error output
See the Godbolt link for more details. Essentially, something seems to go wrong in the vectorization code, but it's hard to say exactly what. As far as we can tell, the compiled code for `check_assoc` looks reasonable. We are pretty lost as to what is going on.
</pre>