[all-commits] [llvm/llvm-project] 70e78b: AMDGPU: Custom lower fptrunc vectors for f32 -> f1...
Changpeng Fang via All-commits
all-commits at lists.llvm.org
Fri Jun 6 15:15:45 PDT 2025
Branch: refs/heads/main
Home: https://github.com/llvm/llvm-project
Commit: 70e78be7dc3e060457d121e4ef9ee2745bb6c41e
https://github.com/llvm/llvm-project/commit/70e78be7dc3e060457d121e4ef9ee2745bb6c41e
Author: Changpeng Fang <changpeng.fang at amd.com>
Date: 2025-06-06 (Fri, 06 Jun 2025)
Changed paths:
M llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp
M llvm/lib/Target/AMDGPU/SIISelLowering.cpp
M llvm/lib/Target/AMDGPU/SIISelLowering.h
M llvm/test/CodeGen/AMDGPU/fptrunc.v2f16.no.fast.math.ll
Log Message:
-----------
AMDGPU: Custom lower fptrunc vectors for f32 -> f16 (#141883)
The latest ASICs support the v_cvt_pk_f16_f32 instruction. However, the current
implementation of vector fptrunc lowering fully scalarizes the vectors,
and the scalar conversions may not always be combined into the
packed instruction.
We made v2f32 -> v2f16 legal in
https://github.com/llvm/llvm-project/pull/139956. This work is an
extension to handle wider vectors. Instead of fully scalarizing, we
split the vector into packs (v2f32 -> v2f16) to ensure the packed
conversion can always be generated.
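
As a rough illustration of the splitting idea only, the standalone sketch below partitions an element count into two-element packs, each standing in for one v2f32 -> v2f16 conversion. It is not code from this commit and does not use any LLVM API: the `Pack` struct and `splitIntoPacks` function are hypothetical names, and emitting a single leftover element for odd counts is an assumption of the sketch, not a statement about how the actual lowering treats odd-width vectors.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical type for this sketch: one chunk of a wider vector.
struct Pack {
  std::size_t Offset; // index of the first element covered by the chunk
  std::size_t Width;  // 2 for a packed pair, 1 for a leftover element
};

// Partition NumElts elements into two-element packs. Each pack models one
// v2f32 -> v2f16 conversion. A trailing single element for odd counts is an
// assumption made here for completeness.
std::vector<Pack> splitIntoPacks(std::size_t NumElts) {
  std::vector<Pack> Packs;
  std::size_t I = 0;
  for (; I + 1 < NumElts; I += 2)
    Packs.push_back({I, 2});
  if (I < NumElts)
    Packs.push_back({I, 1});

  // Sanity check: the chunks cover every element exactly once.
  std::size_t Covered = 0;
  for (const Pack &P : Packs)
    Covered += P.Width;
  assert(Covered == NumElts);
  return Packs;
}
```

Under these assumptions, a v8f32 source yields four packs at offsets 0, 2, 4 and 6, so every element pair maps to one packed conversion instead of depending on a later combine of scalar conversions.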