[libclc] [libclc] Optimize isfpclass-like CLC builtins (PR #124145)
Fraser Cormack via cfe-commits
cfe-commits at lists.llvm.org
Thu Jan 23 22:56:30 PST 2025
frasercrmck wrote:
> How does using isfpclass avoid scalarization here? I think it's somewhat preferable to use the named operations here; they are subtly different since they canonicalize the input, unlike is.fpclass.
The builtins we were using before, like `__builtin_isnan`, don't take vector types so we were forced to scalarize.
I actually started looking into adding `__builtin_elementwise_isnan` etc. to clang before realizing that `__builtin_isfpclass(x, 0x3)` accepts vector types and generates the same code as `__builtin_isnan(x)` does for scalar types (and essentially the same for vectors). I don't see any input canonicalization going on before this change.
``` diff
in function _Z5isnanDv2_f:
in block %entry:
> %0 = fcmp uno <2 x float> %a, zeroinitializer
> %sext.i = sext <2 x i1> %0 to <2 x i32>
> ret <2 x i32> %sext.i
< %0 = extractelement <2 x float> %a, i64 0
< %1 = fcmp uno float %0, 0.000000e+00
< %2 = zext i1 %1 to i32
< %vecinit.i = insertelement <2 x i32> poison, i32 %2, i64 0
< %3 = extractelement <2 x float> %a, i64 1
< %4 = fcmp uno float %3, 0.000000e+00
< %5 = zext i1 %4 to i32
< %vecinit2.i = insertelement <2 x i32> %vecinit.i, i32 %5, i64 1
< %cmp.i = icmp ne <2 x i32> %vecinit2.i, zeroinitializer
< %sext.i = sext <2 x i1> %cmp.i to <2 x i32>
< ret <2 x i32> %sext.i
```
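For readers unfamiliar with the `0x3` mask: in the `llvm.is.fpclass` class mask, bit 0 selects signaling NaN and bit 1 selects quiet NaN, so `0x3` means "any NaN". The following is a rough, portable sketch of that test and of the per-lane OpenCL result convention (-1 for true, 0 for false). It is not the libclc implementation; the names `nan_class_bits`, `isnan2` and `demo_isnan2` are hypothetical, and it assumes IEEE-754 binary32 floats where the top mantissa bit marks a quiet NaN.

```c
#include <assert.h>
#include <math.h>
#include <stdint.h>
#include <string.h>

/* NaN bits of the llvm.is.fpclass class mask; the enum names are illustrative. */
enum { FC_SNAN = 0x1, FC_QNAN = 0x2, FC_NAN = FC_SNAN | FC_QNAN /* 0x3 */ };

/* Hypothetical helper: returns the NaN class bits of x (0 if x is not a NaN)
 * by inspecting its encoding. Assumes IEEE-754 binary32 and the common
 * convention that the most significant mantissa bit means "quiet". */
static int nan_class_bits(float x) {
    uint32_t bits;
    memcpy(&bits, &x, sizeof bits);          /* well-defined type pun */
    uint32_t exponent = (bits >> 23) & 0xFFu;
    uint32_t mantissa = bits & 0x7FFFFFu;
    if (exponent != 0xFFu || mantissa == 0)
        return 0;                            /* finite or infinity */
    return (mantissa & 0x400000u) ? FC_QNAN : FC_SNAN;
}

/* Scalar emulation of a two-lane isnan with OpenCL vector semantics:
 * each output lane is -1 (all bits set) if the input lane is a NaN, else 0. */
static void isnan2(const float in[2], int32_t out[2]) {
    for (int i = 0; i < 2; ++i)
        out[i] = (nan_class_bits(in[i]) & FC_NAN) ? -1 : 0;
}

/* Small self-check showing the expected lane values. */
static void demo_isnan2(void) {
    const float a[2] = { NAN, 1.0f };
    const float b[2] = { INFINITY, -NAN };
    int32_t r[2];

    isnan2(a, r);
    assert(r[0] == -1 && r[1] == 0);

    isnan2(b, r);
    assert(r[0] == 0 && r[1] == -1);
}
```

The loop here is only a stand-in for what the diff above shows the compiler doing in one step: a single vector `fcmp uno` followed by a `sext` to `<2 x i32>`.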
https://github.com/llvm/llvm-project/pull/124145