<table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Issue</th>
<td>
<a href=https://github.com/llvm/llvm-project/issues/129441>129441</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>
wasm: `__builtin_reduce_and` does not optimize well
</td>
</tr>
<tr>
<th>Labels</th>
<td>
new issue
</td>
</tr>
<tr>
<th>Assignees</th>
<td>
</td>
</tr>
<tr>
<th>Reporter</th>
<td>
folkertdev
</td>
</tr>
</table>
<pre>
Given this C code (https://godbolt.org/z/YMo1qqccT):
```c
#include <stdbool.h>
#include <wasm_simd128.h>
bool foo(v128_t a) { return wasm_i8x16_all_true(a); }
bool bar(v128_t a) {
v128_t zero = wasm_i8x16_splat(0);
return __builtin_reduce_and(wasm_i8x16_ne(a, zero));
}
bool baz(v128_t a) {
v128_t zero = wasm_i8x16_splat(0);
return __builtin_reduce_and((a != zero));
}
```
I'd expect all of these to optimize to
```asm
foo:
local.get 0
i8x16.all_true
end_function
```
or some variation of it. However, the other variants optimize much worse:
```asm
bar:
local.get 0
v128.const 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
i8x16.ne
local.tee 0
local.get 0
local.get 0
i8x16.shuffle 8, 9, 10, 11, 12, 13, 14, 15, 0, 1, 2, 3, 0, 1, 2, 3
v128.and
local.tee 0
local.get 0
local.get 0
i8x16.shuffle 4, 5, 6, 7, 0, 1, 2, 3, 0, 1, 2, 3, 0, 1, 2, 3
v128.and
i32x4.extract_lane 0
i32.const 0
i32.ne
end_function
baz:
local.get 0
v128.const 0, 0, 0, 0
i32x4.eq
v128.any_true
i32.const -1
i32.xor
i32.const 1
i32.and
end_function
```
Binary size is especially important for wasm, and it looks like `__builtin_reduce_and` just does not optimize well (I suspect the same is true for `__builtin_reduce_or`).
s390x has the same limitation (https://github.com/llvm/llvm-project/issues/129434), so maybe some work can be shared between backends?
This came up while working on the Rust standard library, which would rather use the generic implementation of operations than a target-specific one.
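For reference, the equivalence I'd hope the backend recognizes for the byte-wise `bar` form can be written as a plain scalar model. This is only a sketch of the intended semantics, not of how LLVM lowers anything. The names `LANES`, `all_true_model` and `reduce_and_ne_model` are made up for illustration. Note that the `baz` output above compares 32-bit lanes (`i32x4.eq`), so this model only covers the 8-bit-lane case.
```c
#include <stdbool.h>
#include <stdint.h>

#define LANES 16 /* hypothetical name: number of 8-bit lanes in a 128-bit vector */

/* Scalar model of i8x16.all_true: true iff every 8-bit lane is nonzero. */
static bool all_true_model(const uint8_t v[LANES]) {
    for (int i = 0; i < LANES; i++) {
        if (v[i] == 0) {
            return false;
        }
    }
    return true;
}

/* Scalar model of reduce_and(ne(v, 0)) on 8-bit lanes:
 * each lane compare yields an all-ones (0xFF) or all-zeros (0x00) mask,
 * and the masks are AND-ed together across all lanes. */
static bool reduce_and_ne_model(const uint8_t v[LANES]) {
    uint8_t acc = 0xFF;
    for (int i = 0; i < LANES; i++) {
        uint8_t mask = (v[i] != 0) ? 0xFF : 0x00;
        acc &= mask;
    }
    return acc != 0;
}
```
Since every mask is either 0xFF or 0x00, the AND over all lanes is nonzero exactly when no lane was zero, which is what `i8x16.all_true` computes directly.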
</pre>