[PATCH] D145634: [X86] Support llvm.{min,max}imum.f{16,32,64}

Mon Apr 24 19:37:34 PDT 2023

e-kud marked an inline comment as done.
e-kud added a comment.

In D145634#4291019 <https://reviews.llvm.org/D145634#4291019>, @skatkov wrote:

> I have a side question (not delaying landing this one)
>
> It looks like this change lowers the vectorized form of the intrinsic in a non-optimal way.
> So I wonder whether there are any plans to improve it as a follow-up?

Yes, for sure. I tried to find a way to implement a vectorized version for SSE, but nothing came to mind. We need at least SSE2 for `PCMPEQ{W,D}`, because all fp comparison instructions treat `-0.0` and `0.0` as equal, so telling the two zeros apart takes an integer comparison. It seems that pentium3 will suffer from a non-optimal fmaximum/fminimum...
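
To illustrate the signed-zero point, here is a minimal scalar sketch in C++. It is not the lowering from this patch. The helper names `bits_of`, `fmaximum_ref` and `demo` are hypothetical, and the code assumes `float` is IEEE-754 binary32, as it is on x86. It shows that an fp compare reports `-0.0 == 0.0`, so picking the right zero for `llvm.maximum` needs a look at the bit pattern, which is what an integer compare such as `PCMPEQD` provides in the vector case.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>

// Reinterpret the bits of a float as a 32-bit unsigned integer
// (assumes IEEE-754 binary32, which holds on x86).
inline std::uint32_t bits_of(float x) {
    std::uint32_t u;
    std::memcpy(&u, &x, sizeof u);
    return u;
}

// Hypothetical scalar reference for llvm.maximum.f32 semantics:
// a NaN operand propagates, and +0.0 is treated as greater than -0.0.
inline float fmaximum_ref(float a, float b) {
    if (std::isnan(a)) return a;
    if (std::isnan(b)) return b;
    if (a == b) {
        // Equal under fp compare: either the same value or zeros that may
        // differ in sign. Only the bit pattern can tell the zeros apart.
        return bits_of(a) == 0x80000000u ? b : a;
    }
    return a > b ? a : b;
}

// Small self-check of the claims above; call it from main() if desired.
inline void demo() {
    assert(-0.0f == 0.0f);                             // fp compare: equal
    assert(bits_of(-0.0f) != bits_of(0.0f));           // integer compare: different
    assert(!std::signbit(fmaximum_ref(-0.0f, 0.0f)));  // result is +0.0
}
```

An `fminimum` reference would mirror this, returning the `-0.0` operand when the two zeros differ in sign.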

================
Comment at: llvm/test/CodeGen/X86/half.ll:957

-define dso_local void @brcond(half %0) {
+define dso_local void @brcond(half %0) #0 {
 ; CHECK-LIBCALL-LABEL: brcond:
----------------
RKSimon wrote:
> pre-commit the nounwind change to keep it separate from this patch - you shouldn't really need dso_local either
I haven't got commit access yet; here it is: https://reviews.llvm.org/D149114

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D145634/new/

https://reviews.llvm.org/D145634