[llvm-dev] Instrumented BB in PGO

Toshio Suganuma via llvm-dev llvm-dev at lists.llvm.org
Mon Mar 21 22:24:07 PDT 2016


Hi David,

Thank you.
I just submitted bug report 27024 ("PGO instrumentation profile data is
not reflected in correct basic blocks").

Thank you,
--Toshio



From:	Xinliang David Li <xinliangli at gmail.com>
To:	Toshio Suganuma/Japan/IBM at IBMJP
Cc:	llvm-dev <llvm-dev at lists.llvm.org>, Rong Xu <xur at google.com>
Date:	2016/03/22 12:04
Subject:	Re: [llvm-dev] Instrumented BB in PGO





On Mon, Mar 21, 2016 at 7:19 PM, Toshio Suganuma via llvm-dev <
llvm-dev at lists.llvm.org> wrote:
  Hello,

  I have a question regarding PGO instrumented BBs (I use IR-level
  instrumentation).

  It seems that in some cases the instrumented BBs do not match between the
  two compilations (profile-gen and profile-use). Here is an example from
  SPECcpu 2006 lbm (a simple case consisting of just two modules).
  In the first compilation, we have 5 instrumentation points for the main
  function as follows:

  $ opt -pgo-instr-gen -instrprof _all_combined.bc -o _all_combined_inst.bc
  -debug-only=pgo-instrumentation
  Dump Function main Hash: 61483163021 after CFGMST
  Number of Basic Blocks: 10
  BB: FakeNode Index=0
  BB: if.then Index=5
  BB: for.body Index=4
  BB: for.body.lr.ph Index=3
  BB: entry Index=1
  BB: for.inc Index=8
  BB: if.then5 Index=7
  BB: if.end Index=6
  BB: for.end Index=2
  BB: for.end.loopexit Index=9
  Number of Edges: 14 (*: Instrument, C: CriticalEdge, -: Removed)
  Edge 0: 8-->4 c W=247031
  Edge 1: 6-->8 c W=159375
  Edge 2: 4-->6 *c W=127500
  Edge 3: 1-->2 c W=4500
  Edge 4: 4-->5 W=127
  Edge 5: 5-->6 * W=127
  Edge 6: 6-->7 W=95
  Edge 7: 7-->8 * W=95
  Edge 8: 0-->1 W=12
  Edge 9: 2-->0 * W=12
  Edge 10: 3-->4 W=8
  Edge 11: 9-->2 W=8
  Edge 12: 1-->3 W=7
  Edge 13: 8-->9 * W=7
  Split critical edge: 4 --> 6
  Adding Instrumentation in BB Name=for.body.if.end_crit_edge
  Adding Instrumentation in BB Name=if.then
  Adding Instrumentation in BB Name=if.then5
  Adding Instrumentation in BB Name=for.end
  Adding Instrumentation in BB Name=for.end.loopexit

  After a training run, we get the following profile data for the main
  function, but these count values are assigned to the wrong BBs in the
  second compilation:
  Block counts: [0, 300, 4, 1, 1]

  $ opt -analyze -pgo-instr-use _all_combined.bc
  -debug-only=pgo-instrumentation
  Dump Function main Hash: 61483163021 after CFGMST
  Number of Basic Blocks: 10
  BB: FakeNode Index=0
  BB: for.body.lr.ph Index=3
  BB: if.end Index=6
  BB: entry Index=1
  BB: if.then Index=5
  BB: for.body Index=4
  BB: for.end.loopexit Index=9
  BB: for.inc Index=8
  BB: if.then5 Index=7
  BB: for.end Index=2
  Number of Edges: 14 (*: Instrument, C: CriticalEdge, -: Removed)
  Edge 0: 8-->4 c W=247031
  Edge 1: 6-->8 c W=159375
  Edge 2: 4-->6 *c W=127500
  Edge 3: 1-->2 c W=127058
  Edge 4: 0-->1 W=135
  Edge 5: 2-->0 * W=135
  Edge 6: 4-->5 W=127
  Edge 7: 5-->6 * W=127
  Edge 8: 6-->7 W=95
  Edge 9: 7-->8 * W=95
  Edge 10: 3-->4 W=8
  Edge 11: 9-->2 W=8
  Edge 12: 1-->3 W=7
  Edge 13: 8-->9 * W=7
  5 counts
  0: 0
  1: 300
  2: 4
  3: 1
  4: 1
  SUM = 306
  Split critical edge: 4 --> 6
  Setting BB Name=for.body.if.end_crit_edge with CountValue=0
  Setting BB Name=for.end with CountValue=300
  Setting BB Name=if.then with CountValue=4
  Setting BB Name=if.then5 with CountValue=1
  Setting BB Name=for.end.loopexit with CountValue=1

  The CountValue 300 should go to BB=if.then (Index 5), not for.end
  (Index 2). Because of this incorrect assignment, the entry count of the
  main function is set to 300 instead of 1 (after populating the count
  values).
  The reason for this problem is that CFGMST edges are ordered in a
  different way due to different weight values (edges 0 --> 1 and 2 --> 0
  get W=12 in the first compilation, while they get W=135 in the second
  compilation). The weight values are computed based on block frequency
  info and branch probability info, but somehow they produce different
  values between the two compilations.
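
  As a rough illustration of the effect described above, here is a small
  Python sketch. It is a simplified model, not LLVM's CFGMST code: it
  assumes the edges are stable-sorted by descending weight, that a
  union-find pass keeps each edge joining two components in the spanning
  tree, and that the remaining edges get counters in visit order. The edge
  list and weights are copied from the two dumps; the names GEN_EDGES,
  USE_EDGES, instrumented_edges, and propagate are made up for this
  example.

```python
# Simplified model only; this is not LLVM's CFGMST implementation.
# Edges are (src, dst, weight) using the BB indices from the dumps,
# listed in the order printed by the profile-gen dump.
GEN_EDGES = [
    (8, 4, 247031), (6, 8, 159375), (4, 6, 127500), (1, 2, 4500),
    (4, 5, 127), (5, 6, 127), (6, 7, 95), (7, 8, 95),
    (0, 1, 12), (2, 0, 12), (3, 4, 8), (9, 2, 8), (1, 3, 7), (8, 9, 7),
]

# The three weights that differ in the profile-use dump.
USE_OVERRIDES = {(1, 2): 127058, (0, 1): 135, (2, 0): 135}
USE_EDGES = [(s, d, USE_OVERRIDES.get((s, d), w)) for s, d, w in GEN_EDGES]


def instrumented_edges(edges, num_nodes=10):
    """Return the non-tree edges in the order a weight-sorted pass visits them."""
    parent = list(range(num_nodes))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    result = []
    # sorted() is stable, so equal-weight edges keep their input order.
    for src, dst, _ in sorted(edges, key=lambda e: -e[2]):
        a, b = find(src), find(dst)
        if a == b:
            result.append((src, dst))  # closes a cycle: gets a counter
        else:
            parent[a] = b              # joins two components: tree edge
    return result


def propagate(edges, known, num_nodes=10):
    """Fill in tree-edge counts from flow conservation (in-flow == out-flow)."""
    count = dict(known)
    pairs = [(s, d) for s, d, _ in edges]
    changed = True
    while changed:
        changed = False
        for n in range(num_nodes):
            ins = [e for e in pairs if e[1] == n]
            outs = [e for e in pairs if e[0] == n]
            unknown = [e for e in ins + outs if e not in count]
            if len(unknown) != 1:
                continue
            total_in = sum(count[e] for e in ins if e in count)
            total_out = sum(count[e] for e in outs if e in count)
            e = unknown[0]
            # The single missing edge must balance the node.
            count[e] = total_out - total_in if e in ins else total_in - total_out
            changed = True
    return count


counts = [0, 300, 4, 1, 1]  # "Block counts" from the training run
gen_map = dict(zip(instrumented_edges(GEN_EDGES), counts))
use_map = dict(zip(instrumented_edges(USE_EDGES), counts))

print("profile-gen counters:", gen_map)
print("profile-use counters:", use_map)
# Edge 0-->1 (FakeNode to entry) carries the function entry count.
print("entry count, gen ordering:", propagate(GEN_EDGES, gen_map)[(0, 1)])
print("entry count, use ordering:", propagate(USE_EDGES, use_map)[(0, 1)])
```

  Under these assumptions, the profile-gen ordering attaches the count 300
  to edge 5-->6 and gives an entry count of 1, while the profile-use
  ordering attaches it to edge 2-->0 and gives an entry count of 300,
  which is consistent with the debug output above.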



Different BFI produced for otherwise identical compilations is a bug we
should fix (it can cause other problems too). Can you file a bug about it?

thanks,

David


  How can we assume that CFGMST is constructed in the same way between the
  two compilations so that we can always set profile results into correct
  basic blocks?

  Thank you,
  --Toshio

  _______________________________________________
  LLVM Developers mailing list
  llvm-dev at lists.llvm.org
  http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

