[all-commits] [llvm/llvm-project] 839f1e: [X86][SDAG] Improve the lowering of `s|uitofp i8|i...

Thu Nov 2 13:25:51 PDT 2023

  Branch: refs/heads/main
  Home:   https://github.com/llvm/llvm-project
  Commit: 839f1e40b18f9cb08ebd1d19233f79cf1c5a4309
      https://github.com/llvm/llvm-project/commit/839f1e40b18f9cb08ebd1d19233f79cf1c5a4309
  Author: qcolombet <quentin.colombet at gmail.com>
  Date:   2023-11-02 (Thu, 02 Nov 2023)

  Changed paths:
    M llvm/lib/Target/X86/X86ISelLowering.cpp
    M llvm/test/CodeGen/X86/fold-int-pow2-with-fmul-or-fdiv.ll
    A llvm/test/CodeGen/X86/select-narrow-int-to-fp.ll

  Log Message:
  -----------
  [X86][SDAG] Improve the lowering of `s|uitofp i8|i16 to half` (#70834)

Prior to this patch, vector `s|uitofp` conversions from narrow types
(`<= i16`) were scalarized when the hardware didn't support fp16
conversions natively. This patch fixes that by avoiding `i16` as an
intermediate type when there is no hardware support for converting this
type to half. In other words, when the target doesn't support
`avx512fp16`, we avoid using intermediate `i16` vectors for `s|uitofp`
conversions.

Instead, we extend the narrow type to `i32`, which is then converted to
`float` and truncated to `half`.
Put differently, we go from:
```
s|uitofp iNarrow %src to half
```
to:
```
%tmp = s|zext iNarrow %src to i32
%tmpfp = s|uitofp i32 %tmp to float
fptrunc float %tmpfp to half
```
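
One property worth checking, which the commit message does not spell out: routing through `i32` and `float` should not add a second rounding, because every 16-bit integer is exactly representable in `float`, so the only rounding happens in the final `fptrunc`. Below is a minimal Python sketch of that check. It is not part of the patch; the helper names are hypothetical, and Python's `struct` formats `'e'` and `'f'` stand in for `half` and `float`. Unsigned `i16` inputs of 65520 and above are left out because they exceed the finite `half` range and `struct` raises `OverflowError` for them.

```python
import struct


def round_to(fmt: str, x: float) -> float:
    """Round x to the nearest value of an IEEE format ('e' = half, 'f' = float)."""
    return struct.unpack("<" + fmt, struct.pack("<" + fmt, x))[0]


def direct(n: int) -> float:
    # Model of `s|uitofp iNarrow %src to half`: one rounding of the exact integer.
    return round_to("e", float(n))


def via_i32_float(n: int) -> float:
    tmp = n                            # s|zext iNarrow to i32 preserves the value
    tmpfp = round_to("f", float(tmp))  # s|uitofp i32 to float
    return round_to("e", tmpfp)        # fptrunc float to half


def find_mismatches() -> list:
    # Covers every signed i8/i16 input and every unsigned i8 input.
    values = range(-32768, 32768)
    return [n for n in values if direct(n) != via_i32_float(n)]


if __name__ == "__main__":
    print("mismatches:", find_mismatches())
```

An empty mismatch list means the three-step sequence agrees with a direct conversion for every input in that range.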

Note that this patch:
- Doesn't change the actual lowering of i32 to half. I.e., the `float`
intermediate step and the final truncation are what already existed for
converting this input type to half.
- Changes only the intermediate type for the lowering of `s|uitofp`.
I.e., the first `s|zext` now extends to i32 instead of i16.

Remark: The vector and scalar lowerings of `s|uitofp` don't use the same
code path. Not super happy about that, but I'm not planning to fix that,
at least in this PR.

This fixes https://github.com/llvm/llvm-project/issues/67080