[llvm-dev] Instrumented BB in PGO

Xinliang David Li via llvm-dev llvm-dev at lists.llvm.org
Mon Mar 21 22:27:48 PDT 2016


Thank you. I have assigned the bug to xur at .

David

On Mon, Mar 21, 2016 at 10:24 PM, Toshio Suganuma <SUGANUMA at jp.ibm.com>
wrote:

> Hi David,
>
> Thank you.
> I just submitted a bug report 27024 (PGO instrumentation profile data is
> not reflected in correct basic blocks).
>
> Thank you,
> --Toshio
>
>
> From: Xinliang David Li <xinliangli at gmail.com>
> To: Toshio Suganuma/Japan/IBM at IBMJP
> Cc: llvm-dev <llvm-dev at lists.llvm.org>, Rong Xu <xur at google.com>
> Date: 2016/03/22 12:04
> Subject: Re: [llvm-dev] Instrumented BB in PGO
> ------------------------------
>
>
>
>
>
> On Mon, Mar 21, 2016 at 7:19 PM, Toshio Suganuma via llvm-dev
> <llvm-dev at lists.llvm.org> wrote:
>
>    Hello,
>
>    I have a question regarding PGO instrumented BBs (I use IR-level
>    instrumentation).
>
>    It seems that, in some cases, the instrumented BBs do not match
>    between the profile-gen and profile-use compilations. Here is an
>    example from SPEC CPU2006 lbm (a simple case consisting of just two
>    modules).
>    In the first compilation, we have 5 instrumentation points for the
>    main function as follows:
>
>    $ opt -pgo-instr-gen -instrprof _all_combined.bc -o
>    _all_combined_inst.bc -debug-only=pgo-instrumentation
>    Dump Function main Hash: 61483163021 after CFGMST
>    Number of Basic Blocks: 10
>    BB: FakeNode Index=0
>    BB: if.then Index=5
>    BB: for.body Index=4
>    BB: for.body.lr.ph Index=3
>    BB: entry Index=1
>    BB: for.inc Index=8
>    BB: if.then5 Index=7
>    BB: if.end Index=6
>    BB: for.end Index=2
>    BB: for.end.loopexit Index=9
>    Number of Edges: 14 (*: Instrument, C: CriticalEdge, -: Removed)
>    Edge 0: 8-->4 c W=247031
>    Edge 1: 6-->8 c W=159375
>    Edge 2: 4-->6 *c W=127500
>    Edge 3: 1-->2 c W=4500
>    Edge 4: 4-->5 W=127
>    Edge 5: 5-->6 * W=127
>    Edge 6: 6-->7 W=95
>    Edge 7: 7-->8 * W=95
>    Edge 8: 0-->1 W=12
>    Edge 9: 2-->0 * W=12
>    Edge 10: 3-->4 W=8
>    Edge 11: 9-->2 W=8
>    Edge 12: 1-->3 W=7
>    Edge 13: 8-->9 * W=7
>    Split critical edge: 4 --> 6
>    Adding Instrumentation in BB Name=for.body.if.end_crit_edge
>    Adding Instrumentation in BB Name=if.then
>    Adding Instrumentation in BB Name=if.then5
>    Adding Instrumentation in BB Name=for.end
>    Adding Instrumentation in BB Name=for.end.loopexit
>
>    After a training run, we get profile data for the main function as
>    follows, but these count values are put into incorrect BBs in the second
>    compilation.
>    Block counts: [0, 300, 4, 1, 1]
>
>    $ opt -analyze -pgo-instr-use _all_combined.bc
>    -debug-only=pgo-instrumentation
>    Dump Function main Hash: 61483163021 after CFGMST
>    Number of Basic Blocks: 10
>    BB: FakeNode Index=0
>    BB: for.body.lr.ph Index=3
>    BB: if.end Index=6
>    BB: entry Index=1
>    BB: if.then Index=5
>    BB: for.body Index=4
>    BB: for.end.loopexit Index=9
>    BB: for.inc Index=8
>    BB: if.then5 Index=7
>    BB: for.end Index=2
>    Number of Edges: 14 (*: Instrument, C: CriticalEdge, -: Removed)
>    Edge 0: 8-->4 c W=247031
>    Edge 1: 6-->8 c W=159375
>    Edge 2: 4-->6 *c W=127500
>    Edge 3: 1-->2 c W=127058
>    Edge 4: 0-->1 W=135
>    Edge 5: 2-->0 * W=135
>    Edge 6: 4-->5 W=127
>    Edge 7: 5-->6 * W=127
>    Edge 8: 6-->7 W=95
>    Edge 9: 7-->8 * W=95
>    Edge 10: 3-->4 W=8
>    Edge 11: 9-->2 W=8
>    Edge 12: 1-->3 W=7
>    Edge 13: 8-->9 * W=7
>    5 counts
>    0: 0
>    1: 300
>    2: 4
>    3: 1
>    4: 1
>    SUM = 306
>    Split critical edge: 4 --> 6
>    Setting BB Name=for.body.if.end_crit_edge with CountValue=0
>    Setting BB Name=for.end with CountValue=300
>    Setting BB Name=if.then with CountValue=4
>    Setting BB Name=if.then5 with CountValue=1
>    Setting BB Name=for.end.loopexit with CountValue=1
>
>    The CountValue 300 should go to BB if.then (Index 5), not to for.end
>    (Index 2). Because of this incorrect assignment, the entry count of
>    the main function is set to 300 instead of 1 (after the count values
>    are populated).
>    The cause of this problem is that the CFGMST edges are ordered
>    differently because their weights differ (edges 0 --> 1 and 2 --> 0 get
>    W=12 in the first compilation, but W=135 in the second). The weights
>    are computed from block frequency info and branch probability info, but
>    somehow they come out different between the two compilations.
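
To make the ordering problem described above concrete, here is a minimal Python sketch. It is a simplified model, not LLVM's actual CFGMST code: it assumes edges are visited heaviest first in a stable order, that an edge closing a cycle in the spanning tree gets a counter, and that counters are numbered in visiting order. The edge list and weights are copied from the two dumps; the names `instrumented_edges` and `with_weights` are hypothetical.

```python
# Simplified model of MST-based edge instrumentation (NOT LLVM's real code).
# Edges are (src, dst, weight); block indices follow the debug dumps.
BASE_EDGES = [
    (8, 4, 247031), (6, 8, 159375), (4, 6, 127500), (1, 2, 4500),
    (4, 5, 127), (5, 6, 127), (6, 7, 95), (7, 8, 95),
    (0, 1, 12), (2, 0, 12), (3, 4, 8), (9, 2, 8), (1, 3, 7), (8, 9, 7),
]


def instrumented_edges(edges):
    """Return the non-tree edges, in the order they are visited."""
    parent = {}

    def find(x):
        # Union-find lookup with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    result = []
    # Stable sort, heaviest first: hot edges join the tree, so the
    # counters end up on the colder edges that close a cycle.
    for src, dst, _ in sorted(edges, key=lambda e: -e[2]):
        a, b = find(src), find(dst)
        if a == b:
            result.append((src, dst))  # closes a cycle: needs a counter
        else:
            parent[a] = b              # tree edge: no counter needed
    return result


def with_weights(edges, overrides):
    """Copy the edge list, replacing the weights of selected edges."""
    return [(s, d, overrides.get((s, d), w)) for s, d, w in edges]


# Weights that differ in the profile-use dump.
USE_EDGES = with_weights(
    BASE_EDGES, {(1, 2): 127058, (0, 1): 135, (2, 0): 135}
)

counts = [0, 300, 4, 1, 1]  # "Block counts" from the training run
gen_map = dict(zip(instrumented_edges(BASE_EDGES), counts))
use_map = dict(zip(instrumented_edges(USE_EDGES), counts))

# FakeNode (0) has one in-edge (2->0) and one out-edge (0->1), so the
# function entry count equals the counter attached to edge 2->0.
print("profile-gen order:", gen_map, "entry =", gen_map[(2, 0)])
print("profile-use order:", use_map, "entry =", use_map[(2, 0)])
```

Under this model the counter holding 300 lands on edge 5->6 (if.then) with the first set of weights and on edge 2->0 (for.end) with the second, which reproduces the mismatch seen in the dumps and the entry count of 300 instead of 1.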
>
>
>
> Different BFI produced for otherwise identical compilations is a bug we
> should fix (it can cause other problems too). Can you file a bug about it?
>
> thanks,
>
> David
>
>
>
>    How can we assume that CFGMST is constructed in the same way between
>    the two compilations so that we can always set profile results into correct
>    basic blocks?
>
>    Thank you,
>    --Toshio
>