[llvm-dev] Instrumented BB in PGO
Toshio Suganuma via llvm-dev
llvm-dev at lists.llvm.org
Mon Mar 21 22:24:07 PDT 2016
Hi David,
Thank you.
I just filed bug report 27024 (PGO instrumentation profile data is
not reflected in correct basic blocks).
Thank you,
--Toshio
From: Xinliang David Li <xinliangli at gmail.com>
To: Toshio Suganuma/Japan/IBM at IBMJP
Cc: llvm-dev <llvm-dev at lists.llvm.org>, Rong Xu <xur at google.com>
Date: 2016/03/22 12:04
Subject: Re: [llvm-dev] Instrumented BB in PGO
On Mon, Mar 21, 2016 at 7:19 PM, Toshio Suganuma via llvm-dev <
llvm-dev at lists.llvm.org> wrote:
Hello,
I have a question regarding PGO instrumented BBs (I use IR-level
instrumentation).
It seems that in some cases the instrumented BBs do not match between the
two compilations for profile-gen and profile-use. Here is an example from
SPEC CPU2006 lbm (a simple case consisting of just two modules).
In the first compilation, we have 5 instrumentation points for the main
function as follows:
$ opt -pgo-instr-gen -instrprof _all_combined.bc -o _all_combined_inst.bc
-debug-only=pgo-instrumentation
Dump Function main Hash: 61483163021 after CFGMST
Number of Basic Blocks: 10
BB: FakeNode Index=0
BB: if.then Index=5
BB: for.body Index=4
BB: for.body.lr.ph Index=3
BB: entry Index=1
BB: for.inc Index=8
BB: if.then5 Index=7
BB: if.end Index=6
BB: for.end Index=2
BB: for.end.loopexit Index=9
Number of Edges: 14 (*: Instrument, C: CriticalEdge, -: Removed)
Edge 0: 8-->4 c W=247031
Edge 1: 6-->8 c W=159375
Edge 2: 4-->6 *c W=127500
Edge 3: 1-->2 c W=4500
Edge 4: 4-->5 W=127
Edge 5: 5-->6 * W=127
Edge 6: 6-->7 W=95
Edge 7: 7-->8 * W=95
Edge 8: 0-->1 W=12
Edge 9: 2-->0 * W=12
Edge 10: 3-->4 W=8
Edge 11: 9-->2 W=8
Edge 12: 1-->3 W=7
Edge 13: 8-->9 * W=7
Split critical edge: 4 --> 6
Adding Instrumentation in BB Name=for.body.if.end_crit_edge
Adding Instrumentation in BB Name=if.then
Adding Instrumentation in BB Name=if.then5
Adding Instrumentation in BB Name=for.end
Adding Instrumentation in BB Name=for.end.loopexit
After a training run, we get profile data for the main function as
follows, but these count values are put into incorrect BBs in the second
compilation.
Block counts: [0, 300, 4, 1, 1]
$ opt -analyze -pgo-instr-use _all_combined.bc
-debug-only=pgo-instrumentation
Dump Function main Hash: 61483163021 after CFGMST
Number of Basic Blocks: 10
BB: FakeNode Index=0
BB: for.body.lr.ph Index=3
BB: if.end Index=6
BB: entry Index=1
BB: if.then Index=5
BB: for.body Index=4
BB: for.end.loopexit Index=9
BB: for.inc Index=8
BB: if.then5 Index=7
BB: for.end Index=2
Number of Edges: 14 (*: Instrument, C: CriticalEdge, -: Removed)
Edge 0: 8-->4 c W=247031
Edge 1: 6-->8 c W=159375
Edge 2: 4-->6 *c W=127500
Edge 3: 1-->2 c W=127058
Edge 4: 0-->1 W=135
Edge 5: 2-->0 * W=135
Edge 6: 4-->5 W=127
Edge 7: 5-->6 * W=127
Edge 8: 6-->7 W=95
Edge 9: 7-->8 * W=95
Edge 10: 3-->4 W=8
Edge 11: 9-->2 W=8
Edge 12: 1-->3 W=7
Edge 13: 8-->9 * W=7
5 counts
0: 0
1: 300
2: 4
3: 1
4: 1
SUM = 306
Split critical edge: 4 --> 6
Setting BB Name=for.body.if.end_crit_edge with CountValue=0
Setting BB Name=for.end with CountValue=300
Setting BB Name=if.then with CountValue=4
Setting BB Name=if.then5 with CountValue=1
Setting BB Name=for.end.loopexit with CountValue=1
The CountValue 300 should go to BB if.then (Index 5), not for.end
(Index 2). Because of this incorrect assignment, the entry count of
the main function is set to 300 instead of 1 (after populating the count
values).
The reason for this problem is that CFGMST edges are ordered in a
different way due to different weight values (edges 0 --> 1 and 2 --> 0
get W=12 in the first compilation, while they get W=135 in the second
compilation). The weight values are computed based on block frequency
info and branch probability info, but somehow they produce different
values between the two compilations.
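To make the mismatch concrete, here is a minimal sketch of the effect. It assumes the pass visits edges in descending weight order (ties kept in their original order) and gives a counter to every edge that would close a cycle in a union-find spanning tree. That is a simplified model rather than LLVM's actual code, and the names instrumented_edges, gen_edges, use_edges, recorded and applied are hypothetical. The edge weights and counter values are copied from the two dumps above.

```python
# Simplified model of the edge selection, NOT LLVM's actual code.
# Edge weights and counter values are copied from the debug output above.

def instrumented_edges(edges):
    """Return the edges left out of the spanning tree, in visiting order."""
    parent = {}

    def find(node):
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]  # path halving
            node = parent[node]
        return node

    picked = []
    # sorted() is stable, so equal-weight edges keep their input order.
    for src, dst, _weight in sorted(edges, key=lambda e: -e[2]):
        a, b = find(src), find(dst)
        if a == b:
            picked.append((src, dst))  # would close a cycle: gets a counter
        else:
            parent[a] = b              # joins two groups: tree edge
    return picked

# (src, dst, weight) from the profile-gen dump.
gen_edges = [
    (8, 4, 247031), (6, 8, 159375), (4, 6, 127500), (1, 2, 4500),
    (4, 5, 127), (5, 6, 127), (6, 7, 95), (7, 8, 95),
    (0, 1, 12), (2, 0, 12), (3, 4, 8), (9, 2, 8), (1, 3, 7), (8, 9, 7),
]

# The profile-use dump differs only in these three weights.
use_weights = {(1, 2): 127058, (0, 1): 135, (2, 0): 135}
use_edges = [(s, d, use_weights.get((s, d), w)) for s, d, w in gen_edges]

counts = [0, 300, 4, 1, 1]  # "Block counts" from the training run

gen_order = instrumented_edges(gen_edges)
use_order = instrumented_edges(use_edges)

recorded = dict(zip(gen_order, counts))  # what each counter really measured
applied = dict(zip(use_order, counts))   # what profile-use believes

for edge in gen_order:
    print(edge, "recorded", recorded[edge], "applied", applied[edge])
```

Under this model the same five edges are selected in both runs, but edge 2 --> 0 moves from fourth to second position. Since counters are matched by position, the value 300 recorded for 5 --> 6 (if.then) ends up on 2 --> 0 (for.end), which is the mismatch shown in the "Setting BB" lines.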
Different BFI produced for otherwise identical compilation is a bug we
should fix (can cause other problems too). Can you file a bug about it?
thanks,
David
How can we assume that CFGMST is constructed in the same way between the
two compilations so that we can always set profile results into correct
basic blocks?
Thank you,
--Toshio