[all-commits] [llvm/llvm-project] 70e78b: AMDGPU: Custom lower fptrunc vectors for f32 -> f1...

Fri Jun 6 15:15:45 PDT 2025

  Branch: refs/heads/main
  Home:   https://github.com/llvm/llvm-project
  Commit: 70e78be7dc3e060457d121e4ef9ee2745bb6c41e
      https://github.com/llvm/llvm-project/commit/70e78be7dc3e060457d121e4ef9ee2745bb6c41e
  Author: Changpeng Fang <changpeng.fang at amd.com>
  Date:   2025-06-06 (Fri, 06 Jun 2025)

  Changed paths:
    M llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp
    M llvm/lib/Target/AMDGPU/SIISelLowering.cpp
    M llvm/lib/Target/AMDGPU/SIISelLowering.h
    M llvm/test/CodeGen/AMDGPU/fptrunc.v2f16.no.fast.math.ll

  Log Message:
  -----------
  AMDGPU: Custom lower fptrunc vectors for f32 -> f16 (#141883)

The latest ASICs support the v_cvt_pk_f16_f32 instruction. However, the
current implementation of vector fptrunc lowering fully scalarizes the
vectors, and the scalar conversions may not always be combined into the
packed instruction.
We made v2f32 -> v2f16 legal in
https://github.com/llvm/llvm-project/pull/139956. This work is an
extension to handle wider vectors. Instead of fully scalarizing, we
split the vector into packs (v2f32 -> v2f16) to ensure the packed
conversion can always be generated.
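
The splitting strategy described in the log message can be illustrated with a small standalone sketch. This is not the code from the commit and uses no LLVM APIs: the `Pack` struct and the `splitIntoPacks` function are hypothetical names introduced here for illustration. Emitting a single scalar chunk for an odd trailing element is also an assumption of the sketch, since the log message does not say how odd-width vectors are handled.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical description of one conversion chunk; not an LLVM type.
struct Pack {
  std::size_t Start; // index of the first source element in the chunk
  std::size_t Width; // 2 for a packed pair, 1 for a leftover scalar
};

// Partition NumElts f32 lanes into two-element packs, so that each pack
// could map onto one packed v2f32 -> v2f16 conversion.
// Assumption: an odd trailing lane becomes a single-element chunk.
std::vector<Pack> splitIntoPacks(std::size_t NumElts) {
  std::vector<Pack> Packs;
  std::size_t I = 0;
  // Take lanes two at a time while a full pair remains.
  for (; I + 1 < NumElts; I += 2)
    Packs.push_back({I, 2});
  // Any remaining lane is handled on its own.
  if (I < NumElts)
    Packs.push_back({I, 1});
  return Packs;
}
```

Under these assumptions, a v8f32 source yields four packs starting at lanes 0, 2, 4 and 6, whereas full scalarization would yield eight separate conversions that a later combine would have to pair up again.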
