[all-commits] [llvm/llvm-project] cd1e6a: [SROA] Propagate no-signed-zeros(nsz) fast-math fl...

Yashwant Singh via All-commits all-commits at lists.llvm.org
Mon Jul 1 23:30:00 PDT 2024


  Branch: refs/heads/main
  Home:   https://github.com/llvm/llvm-project
  Commit: cd1e6a587be6352f63f180b1ff5e0a348a8da444
      https://github.com/llvm/llvm-project/commit/cd1e6a587be6352f63f180b1ff5e0a348a8da444
  Author: Yashwant Singh <yashwants at nvidia.com>
  Date:   2024-07-02 (Tue, 02 Jul 2024)

  Changed paths:
    M llvm/lib/Transforms/Utils/PromoteMemoryToRegister.cpp
    A llvm/test/Transforms/PhaseOrdering/generate-fabs.ll
    A llvm/test/Transforms/SROA/propagate-fast-math-flags-on-phi.ll

  Log Message:
  -----------
  [SROA] Propagate no-signed-zeros(nsz) fast-math flag on the phi node using function attribute (#83381)

It is expected that the sequence `return X > 0.0 ? X : -X`, compiled with
-Ofast, produces the fabs intrinsic. However, LLVM is currently unable
to do so.

The above sequence goes through the following transformations in the
pass pipeline:
1) The SROA pass generates the phi node. Here, it does not infer the
fast-math flags on the phi node, unlike what the clang frontend
typically does.
2) The phi node is eventually translated into a select instruction.
Because the no-signed-zeros (nsz) fast-math flag is missing on the
select instruction, the InstCombine pass fails to fold the sequence
into the fabs intrinsic.
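
Why nsz matters for this fold can be seen with a small C++ sketch. It is
an illustration only, not code from the patch: the helper names
`select_abs`, `intrinsic_abs` and `sign_differs` are hypothetical, and the
sketch assumes strict IEEE-754 semantics (compiled without -Ofast or
-ffast-math), since fast-math is exactly what permits ignoring the
difference.

```cpp
#include <cmath>

// The source-level pattern from the commit message (hypothetical name).
double select_abs(double x) { return x > 0.0 ? x : -x; }

// The result the fold is expected to produce (hypothetical name).
double intrinsic_abs(double x) { return std::fabs(x); }

// True when the two forms disagree on the sign bit of the result.
// For x = +0.0 the comparison `x > 0.0` is false, so select_abs returns
// -x = -0.0, while fabs(+0.0) is +0.0. The values compare equal but
// their sign bits differ.
bool sign_differs(double x) {
    return std::signbit(select_abs(x)) != std::signbit(intrinsic_abs(x));
}
```

Under these assumptions the two forms differ only for an input of +0.0,
so the rewrite to fabs is valid only when the sign of a zero result may
be ignored, which is what the nsz flag states.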

This patch, as part of SROA, tries to propagate the nsz fast-math flag
to the phi node based on the function attribute, which enables this
folding.

Closes #51601

Co-authored-by: Sushant Gokhale <sgokhale at nvidia.com>


