[llvm-dev] CFG simplification question, and preservation of branching in the original code
Danila Malyutin via llvm-dev
llvm-dev at lists.llvm.org
Mon Sep 23 07:40:43 PDT 2019
Hi Joan,
One knob you might want to adjust is TargetTransformInfo::getUserCost, which ComputeSpeculationCost in SimplifyCFG uses to decide whether an instruction is cheap enough to speculate.
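As a rough illustration of why that cost matters, here is a toy model in C of a budget-based speculation check. It is only a sketch and not LLVM's code: should_speculate and the COST_* names are hypothetical, and the values are assumed cost buckets chosen for the example.

```c
#include <stddef.h>

/* Hypothetical cost buckets, assumed for this sketch only. */
enum { COST_FREE = 0, COST_BASIC = 1, COST_EXPENSIVE = 4 };

/* Toy model: walk the instructions of a conditional block and charge each
   one against a budget. Returns 1 if all of them fit (the block could be
   speculated and the branch folded), 0 if the branch should be kept. */
int should_speculate(const unsigned *costs, size_t n, unsigned budget)
{
    unsigned remaining = budget;
    for (size_t i = 0; i < n; ++i) {
        if (costs[i] > remaining)
            return 0;           /* too expensive: keep the branch */
        remaining -= costs[i];
    }
    return 1;
}
```

Under this model, a target that reports 32-bit operations as expensive would make the check fail sooner, so more branches would survive.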
--
Danila
From: llvm-dev [mailto:llvm-dev-bounces at lists.llvm.org] On Behalf Of Joan Lluch via llvm-dev
Sent: Saturday, September 21, 2019 15:06
To: llvm-dev <llvm-dev at lists.llvm.org>
Subject: [llvm-dev] CFG simplification question, and preservation of branching in the original code
Hi all,
For my custom architecture, I want to relax the CFG simplification pass, and other passes replacing conditional branches.
I found that the replacement of conditional jumps by “select” or other instructions is often too aggressive, and this causes inefficient code for my target, because in most cases a branch would just be cheaper.
For example, consider the following C code:
long test (long a, long b)
{
  int neg = 0;
  long res;
  if (a < 0)
  {
    a = -a;
    neg = 1;
  }
  res = a*b;
  if (neg)
    res = -res;
  return res;
}
This code can obviously be simplified in C, but please just consider it as an example to show the point.
The code above gets compiled like this (-Oz flag):
; Function Attrs: minsize norecurse nounwind optsize readnone
define dso_local i32 @test(i32 %a, i32 %b) local_unnamed_addr #0 {
entry:
  %cmp = icmp slt i32 %a, 0
  %sub = sub nsw i32 0, %a
  %a.addr.0 = select i1 %cmp, i32 %sub, i32 %a
  %mul = mul nsw i32 %a.addr.0, %b
  %sub2 = sub nsw i32 0, %mul
  %res.0 = select i1 %cmp, i32 %sub2, i32 %mul
  ret i32 %res.0
}
All branching was removed and replaced by ‘select’ instructions. Unfortunately, 32-bit operations are expensive on my architecture, and in most cases it would be desirable to just keep the original branches, which are relatively cheap. The case above could be converted back to branching by the backend, but in the general case this is not always practical, and it misses other optimisation opportunities.
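To make the difference concrete, the two shapes can be written side by side in C. This is only an illustrative sketch: the names test_branch and test_select are made up here, int32_t stands in for the 32-bit long of the target, and inputs are assumed not to overflow (matching the nsw flags in the IR).

```c
#include <stdint.h>

/* Branching form: the negations only execute on the paths that need them. */
int32_t test_branch(int32_t a, int32_t b)
{
    int neg = 0;
    if (a < 0) {
        a = -a;
        neg = 1;
    }
    int32_t res = a * b;
    if (neg)
        res = -res;
    return res;
}

/* Select form, mirroring the IR above: both 32-bit negations are always
   computed, and a conditional then picks which value to keep. */
int32_t test_select(int32_t a, int32_t b)
{
    int cmp = a < 0;
    int32_t sub = -a;                 /* always computed */
    int32_t a0 = cmp ? sub : a;
    int32_t mul = a0 * b;
    int32_t sub2 = -mul;              /* always computed */
    return cmp ? sub2 : mul;
}
```

Both functions return the same value, but the second one always pays for two extra 32-bit negations, which is the cost I would like to avoid.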
I tried to set ‘phi-node-folding-threshold’ to 1 or even 0, and this definitely improves the situation in many cases, but Clang still creates instances of ‘select’ instructions, which are detrimental to my target. I am unsure where they are created, as I believe that the simplifycfg pass no longer creates them.
So the question is: Are there any other hooks in Clang, or custom code that I can implement, to limit the creation of ‘select’ instructions and instead preserve the branches in the original C code?
Thanks,
John