[llvm-dev] fcmp missed optimization opportunities?
Mueller-Roemer, Johannes Sebastian via llvm-dev
llvm-dev at lists.llvm.org
Mon Feb 22 06:44:01 PST 2016
Do the fast-math flags (especially nnan) currently have any effect on how fcmp instructions are optimized?
This small module:
target datalayout = "e-m:w-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-pc-windows-msvc"
define float @test(float %a, float %b) {
entry:
  %0 = fcmp nnan olt float 0.000000e+00, %a
  %1 = fcmp nnan olt float %a, %b
  %2 = and i1 %0, %1
  br i1 %2, label %entry.t0, label %entry.f0

entry.t0: ; preds = %entry
  ret float %a

entry.f0: ; preds = %entry
  %3 = fcmp nnan oge float 0.000000e+00, %a
  br i1 %3, label %entry.t1, label %entry.f1

entry.t1: ; preds = %entry.f0
  ret float 0.000000e+00

entry.f1: ; preds = %entry.f0
  ret float %b
}
Currently optimizes to this:
target datalayout = "e-m:w-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-pc-windows-msvc"
; Function Attrs: norecurse nounwind readnone
define float @test(float %a, float %b) #0 {
entry:
  %0 = fcmp nnan ogt float %a, 0.000000e+00
  %1 = fcmp nnan olt float %a, %b
  %2 = and i1 %0, %1
  %3 = fcmp nnan ole float %a, 0.000000e+00
  %.b = select i1 %3, float 0.000000e+00, float %b
  %merge = select i1 %2, float %a, float %.b
  ret float %merge
}
attributes #0 = { norecurse nounwind readnone }
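If NaNs cannot occur, %3 = fcmp nnan ole float %a, 0.000000e+00 (or ule, which should be identical in the absence of NaNs) should simply be the negation of %0 = fcmp nnan ogt float %a, 0.000000e+00, which has already been computed, so the second comparison looks redundant. Or am I missing something?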
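
To make the claim concrete, here is a small Python sketch of the comparison semantics. It is not LLVM code: the helpers ogt, ole, ule and negation_holds are hypothetical stand-ins for the fcmp predicates, and it assumes that Python float comparisons behave like ordered comparisons (any comparison involving NaN is false). It only checks the logical identity on a handful of sample values; it says nothing about what the optimizer actually does.

```python
import math

def ogt(a, b):
    """Ordered greater-than: false if either operand is NaN."""
    return a > b

def ole(a, b):
    """Ordered less-or-equal: false if either operand is NaN."""
    return a <= b

def ule(a, b):
    """Unordered-or-less-or-equal: true if either operand is NaN."""
    return math.isnan(a) or math.isnan(b) or a <= b

# Sample inputs without NaN (an illustrative, not exhaustive, selection).
NON_NAN = [-math.inf, -1.5, -0.0, 0.0, 1e-30, 2.0, math.inf]

def negation_holds(a):
    """True if ole(a, 0) equals the negation of ogt(a, 0)."""
    return ole(a, 0.0) == (not ogt(a, 0.0))

for a in NON_NAN:
    # Without NaN, ole is the negation of ogt, and ule agrees with ole.
    assert negation_holds(a)
    assert ule(a, 0.0) == ole(a, 0.0)

# With NaN, both ordered comparisons are false, so the identity breaks for ole.
assert not negation_holds(math.nan)
# ule still equals the negation of ogt, even for NaN.
assert ule(math.nan, 0.0) == (not ogt(math.nan, 0.0))
```

In this model, ule is the negation of ogt for every input, while ole is the negation of ogt only when NaN is excluded, which is exactly the guarantee nnan is meant to provide.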
--
Johannes S. Mueller-Roemer, MSc
Wiss. Mitarbeiter - Interactive Engineering Technologies (IET)
Fraunhofer-Institut für Graphische Datenverarbeitung IGD
Fraunhoferstr. 5 | 64283 Darmstadt | Germany
Tel +49 6151 155-606 | Fax +49 6151 155-139
johannes.mueller-roemer at igd.fraunhofer.de | www.igd.fraunhofer.de