[all-commits] [llvm/llvm-project] cd1e6a: [SROA] Propagate no-signed-zeros(nsz) fast-math fl...
Yashwant Singh via All-commits
all-commits at lists.llvm.org
Mon Jul 1 23:30:00 PDT 2024
Branch: refs/heads/main
Home: https://github.com/llvm/llvm-project
Commit: cd1e6a587be6352f63f180b1ff5e0a348a8da444
https://github.com/llvm/llvm-project/commit/cd1e6a587be6352f63f180b1ff5e0a348a8da444
Author: Yashwant Singh <yashwants at nvidia.com>
Date: 2024-07-02 (Tue, 02 Jul 2024)
Changed paths:
M llvm/lib/Transforms/Utils/PromoteMemoryToRegister.cpp
A llvm/test/Transforms/PhaseOrdering/generate-fabs.ll
A llvm/test/Transforms/SROA/propagate-fast-math-flags-on-phi.ll
Log Message:
-----------
[SROA] Propagate no-signed-zeros(nsz) fast-math flag on the phi node using function attribute (#83381)
It is expected that the sequence `return X > 0.0 ? X : -X`, compiled with
-Ofast, produces the fabs intrinsic. However, LLVM is currently unable
to do so.
The above sequence goes through the following transformations during the
pass pipeline:
1) The SROA pass generates the phi node. Here, it does not infer the
fast-math flags on the phi node, unlike what the clang frontend typically does.
2) The phi node eventually gets translated into a select instruction.
Because the no-signed-zeros (nsz) fast-math flag is missing on the select
instruction, the InstCombine pass fails to fold the sequence into the fabs
intrinsic.
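As a rough illustration of why the nsz flag matters for this fold, the sketch below compares the source pattern with `std::fabs` under strict IEEE-754 semantics (that is, compiled without any fast-math option). The helper names `abs_via_select` and `matches_fabs` are hypothetical and are not part of the patch.

```cpp
#include <cmath>

// The source pattern from the log message, written as a standalone function.
// Hypothetical helper name; not taken from the patch.
double abs_via_select(double x) {
    return x > 0.0 ? x : -x;
}

// Returns true when the select form and std::fabs agree in both value and
// sign bit for the given input. Assumes strict IEEE-754 semantics (no
// fast-math flags at compile time).
bool matches_fabs(double x) {
    double a = abs_via_select(x);
    double b = std::fabs(x);
    return a == b && std::signbit(a) == std::signbit(b);
}
```

For ordinary nonzero inputs and for -0.0 the two forms agree. For +0.0 the comparison `x > 0.0` is false, so the select form returns -0.0 while `std::fabs` returns +0.0. The rewrite into fabs is therefore only valid when the sign of zero may be ignored, which is what nsz expresses.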
This patch, as a part of SROA, tries to propagate the nsz fast-math flag to
the phi node using the function attribute, which enables this folding.
Closes #51601
Co-authored-by: Sushant Gokhale <sgokhale at nvidia.com>