[llvm-dev] The state of IRPGO (3 remaining work items)

Sean Silva via llvm-dev llvm-dev at lists.llvm.org
Fri Jun 3 18:06:02 PDT 2016


On Thu, Jun 2, 2016 at 6:41 PM, Xinliang David Li <davidxl at google.com>
wrote:

>
>
> On Thu, Jun 2, 2016 at 5:30 PM, Sean Silva <chisophugis at gmail.com> wrote:
>
>>
>>
>> On Thu, Jun 2, 2016 at 2:51 PM, Frédéric Riss <friss at apple.com> wrote:
>>
>>>
>>> On Jun 2, 2016, at 12:10 AM, Sean Silva <chisophugis at gmail.com> wrote:
>>>
>>>
>>>
>>> On Wed, Jun 1, 2016 at 5:46 PM, Frédéric Riss <friss at apple.com> wrote:
>>>
>>>>
>>>> On Jun 1, 2016, at 1:46 PM, Sean Silva <chisophugis at gmail.com> wrote:
>>>>
>>>>
>>>>
>>>> On Tue, May 31, 2016 at 6:02 PM, Frédéric Riss <friss at apple.com> wrote:
>>>>
>>>>>
>>>>> On May 24, 2016, at 5:21 PM, Sean Silva <chisophugis at gmail.com> wrote:
>>>>>
>>>>>
>>>>>
>>>>> On Tue, May 24, 2016 at 3:41 PM, Vedant Kumar <vsk at apple.com> wrote:
>>>>>
>>>>>>
>>>>>> > On May 23, 2016, at 8:56 PM, Xinliang David Li <davidxl at google.com>
>>>>>> wrote:
>>>>>> >
>>>>>> > On Mon, May 23, 2016 at 8:23 PM, Sean Silva <chisophugis at gmail.com>
>>>>>> wrote:
>>>>>> > Jake and I have been integrating IRPGO on PS4, and we've identified
>>>>>> 3 remaining work items.
>>>>>> >
>>>>>> > Sean, thanks for the write up. It matches very well with what we
>>>>>> think as well.
>>>>>>
>>>>>> + 1
>>>>>>
>>>>>>
>>>>>> > - Driver changes
>>>>>> >
>>>>>> > We'd like to make IRPGO the default on PS4. We also think that it
>>>>>> would be beneficial to make IRPGO the default PGO on all platforms
>>>>>> (coverage would continue to use FE instr as it does currently, of course).
>>>>>> In previous conversations (e.g. http://reviews.llvm.org/D15829) it
>>>>>> has come up that Apple have requirements that would prevent them from
>>>>>> moving to IRPGO as the default PGO, at least without a deprecation period
>>>>>> of one or two releases.
>>>>>>
>>>>>> Sean pointed out the problematic scenario in D15829 (in plan "C"):
>>>>>>
>>>>>> ```
>>>>>> All existing user workflows continue to work, except for workflows
>>>>>> that attempt to llvm-profdata merge some old frontend profile data (e.g.
>>>>>> they have checked-in to version control and represents some special
>>>>>> workload) with the profile data from new binaries.
>>>>>> ```
>>>>>>
>>>>>> We can address this issue by (1) making sure llvm-profdata emits a
>>>>>> helpful warning when merging an FE-based profile with an IR-based one, and
>>>>>> (2) keeping an option to use FE instrumentation for PGO. Having (2) helps
>>>>>> people who can't (or don't want) to switch to IR PGO.
>>>>>>
>>>>>>
>>>>>> > I'd like to get consensus on a path forward.
>>>>>> > As a point of discussion, how about we make IRPGO the default on
>>>>>> all platforms except Apple platforms.
>>>>>>
>>>>>> I'd really rather not introduce this inconsistency. I'm worried that
>>>>>> it might lead to Darwin becoming a second-tier platform for PGO.
>>>>>>
>>>>>> Fred (CC'd) is following up with some of our internal users to check
>>>>>> if we can change the default behavior of -fprofile-instr-generate. He
>>>>>> should be able to chime in on this soon.
>>>>>>
>>>>>
>>>>> Sorry it took me so long.
>>>>>
>>>>
>>>> Hi Fred,
>>>>
>>>> My understanding is that you were specifically investigating whether
>>>> Apple needed compatibility for merging indexed profiles. Is that
>>>> compatibility needed? The only compelling argument I have heard to continue
>>>> to expose FEPGO is that Apple may have a compatibility requirement for
>>>> merging indexed profiles from previous compiler versions.
>>>>
>>>>
>>>> Sorry no, my comment had nothing to do with merging profiles. I
>>>> understand that this will break, and it might very well be an issue for us,
>>>> but I think there is a more fundamental issue with the proposed plan. As
>>>> you bring it up though, this is a user visible breakage that shouldn’t be
>>>> disregarded completely.
>>>>
>>>
>>> Merging with existing indexed profiles is the only user-visible breakage
>>> AFAIK (this was discussed at length in http://reviews.llvm.org/D15829 and
>>> the corresponding email thread). Please provide concrete examples where
>>> things would break.
>>>
>>>
>>> I feel like we are talking past each other. It is a fact that “merging
>>> existing indexed profiles” will break. You consider it to be a reasonable
>>> breakage, I’m just saying that we shouldn't ignore this completely. It is a
>>> known breakage so it should be considered when making the decision about
>>> the new flags’ behavior.
>>> We advise users to commit their profiles to VCS, so this could actually
>>> be a real issue for them (Xcode won’t try to merge old/new profiles by
>>> itself though, but users could do this). For those kind of changes where
>>> the alternative is not a full drop-in replacement, I think deprecating the
>>> option and replacing it with a new one is a better methodology.
>>>
>>>
>>>
>>>>
>>>> Even if this is a requirement, then I still intend to make IRPGO the
>>>> default and only PGO going forward, at least on PS4. I think that doing the
>>>> same for all platforms in the upstream compiler probably makes sense as
>>>> well, since an internal Apple vendor compatibility requirement should not
>>>> penalize all users of the open source project.
>>>>
>>>>
>>>> Again, I’m not expressing an Apple requirement, just trying to discuss
>>>> the specifics of the proposed implementation. My goal is not to hinder
>>>> anything, and I want our platforms to be able to use IRPGO reliably if
>>>> users see the need for it.
>>>>
>>>
>>> What I'm saying is that besides reduced training overhead (and the
>>> inability to merge with older indexed profiles, which AFAIK is the only
>>> actual potential requirement that would need a deprecation period for
>>> FEPGO), IRPGO is basically "just a better PGO", so adding a frontend one
>>> (except as something purely during a deprecation period) is
>>> pointless. "just a better PGO" is what IRPGO is for my users. I don't want
>>> to have to have them deal with (and I don't want to support) FEPGO.
>>>
>>>
>>> Well we want to support FEPGO because it fits Xcode's workflow better
>>> (mainly it allows to do PGO + coverage at the same time).
>>>
>>> Anything that will cause the existing flag to continue to produce FEPGO
>>> on PS4 is not something that I'm really okay with. The reduced overhead of
>>> IRPGO is really important on PS4 (i.e. the difference between the
>>> instrumented game being playable or not). I really don't want to have to
>>> test the triple to determine the meaning of `-fprofile-instr-generate`
>>> (without `-fcoverage-mapping`).
>>>
>>>
>>> And I agree that IRPGO is better for your users, but the best workflow
>>> for your users is not necessarily the best workflow for every clang user. I
>>> totally agree with you that testing the triple to decide the meaning of the
>>> option is wrong to do upstream. The best default is not a platform choice,
>>> it’s a workflow decision.
>>>
>>>
>>>> I’ve discussed the change in behavior quiet extensively, and I after
>>>>> having changed my mind a couple times, I would argue in favor of keeping
>>>>> the current behavior for the existing flags. I think adding a new switch
>>>>> for IRPGO is a better option. The argument that weighted most on my opinion
>>>>> is the proposed interaction with -fcoverage-mapping, and it is not at all
>>>>> platform specific. With the proposed new behavior, turning coverage on and
>>>>> off in your build system will generate a binary with different performance
>>>>> characteristics and this feels really wrong.
>>>>>
>>>>
>>>> Bob already mentioned in the other thread that
>>>> `-fprofile-instr-generate -fcoverage-mapping` was sufficiently different
>>>> from `-fprofile-instr-generate` that `-fprofile-instr-generate
>>>> -fcoverage-mapping` was not an acceptable workaround that could be used for
>>>> enabling FEPGO during a transitionary period, so I'm not convinced that
>>>> your argument here makes sense.
>>>>
>>>>
>>>> I’m not sure what you’re referring to here, and I have a hard time
>>>> parsing the sentence. I suppose “was not an acceptable” should read “was an
>>>> acceptable”? I would be surprised that Bob ever agreed to completely
>>>> transition away from FEPGO. I didn’t even understand that getting rid of
>>>> FEPGO was on table as you seem to imply bellow.
>>>>
>>>
>>> No, it is written as intended. The backstory is in
>>> http://reviews.llvm.org/D15829 (and the corresponding email thread).
>>> The paragraph starting with "The coverage mapping adds considerable cost.”.
>>>
>>>
>>> Thanks for the pointer, it turns out this part of the thread never made
>>> it to my mailbox. What Bob is saying is that we need an option to turn on
>>> FEPGO without coverage, and this is basically exactly what I think we
>>> should do: add an explicit option to choose the instrumentation type, so
>>> that we can use FEPGO even without coverage in the new IRPGO world.
>>>
>>>
>>>> I also share David's opinion that this is not going to be an issue in
>>>> practice. I think it makes sense for PGO and coverage to have different
>>>> overheads. Coverage inherently has to trace all locations at source level,
>>>> while PGO has more freedom.
>>>>
>>>>
>>>> I’m sorry if I wasn’t clear, but I’m not talking about instrumentation
>>>> overhead, I’m talking about the performance of the binary generated using
>>>> the profiles. If we go the route of making the meaning of
>>>> -fprofile-instr-generate depend on whether -fcoverage-mapping gets passed,
>>>> then we change the kind of instrumentation and thus the input to the
>>>> optimizations behind the user’s back. I wouldn’t be surprised that using
>>>> profiles generated by FEPGO and IRPGO give you a final executable with
>>>> measurably different performance characteristics.
>>>>
>>>
>>> I think the point is that given the effort being put into IRPGO, the
>>> IRPGO version will always be a faster final executable.
>>>
>>>
>>> 2 things:
>>>  - Do you have strong evidence of this being true in all cases? I’m not
>>> looking forward to when I’ll transition Apple’s clang to IRPGO, because I
>>> know that we’ll have to spend time characterizing the performance of the
>>> compiler. It might be faster in all but a couple benchmarks, we still will
>>> need to investigate what happened there to see if we can fix it. I’m not
>>> saying it’s bad (I’m actually also looking forward to evaluate it, because
>>> it might give better results for us), but it’s definitely not a benign
>>> change that I would do without double checking the results. And as such, I
>>> also wouldn’t want the compiler to silently do this change behind my back.
>>>  - Even if it is 100% true, then users will see a degradation in
>>> performance if they add -fcoverage-mapping to their CFLAGS. I think this is
>>> just bad user experience. The user should be able to decide if he wants to
>>> be able to compromise some performance to be able to do PGO+coverage at the
>>> same time.
>>>
>>> Why provide a "worse" PGO option?
>>>
>>>
>>> Worse from your point of view. The resilience to compiler changes that
>>> FEPGO currently offers is a real feature even if it doesn’t have a place in
>>> Sony’s workflow.
>>>
>>>
>>>
>>>> If you’re tracking your performance, this can be really painful.
>>>> Recently we wasted days investigating performance regressions that were due
>>>> to buggy profiles. I strongly believe having an option seemingly unrelated
>>>> to PGO change this behavior is wrong and can cause actual pain for our end
>>>> users.
>>>>
>>>
>>> After a deprecation period we can force `-fprofile-instr-generate` and
>>> `-fcoverage-mapping` to be mutually exclusive if necessary. Does this solve
>>> your problem?
>>>
>>>
>>> We don’t want a “deprecation period”. Unless IRPGO evolves in something
>>> that can do precise coverage and PGO in the same run, FEPGO can’t be
>>> deprecated (as in slated for removal).
>>>
>>> Actually, I think it makes a lot of sense in some respects for
>>> `-fprofile-instr-generate` and `-fprofile-instr-generate
>>> -fcoverage-mapping` to be IRPGO and FEPGO/coverage. The difference from a
>>> user's perspective is basically "is the instrumentation inserted by the
>>> compiler constrained to have source-level coverage, or does the compiler
>>> not have this restriction". Although as I've said, I'm not a fan of
>>> supporting FEPGO in the long-term due to maintenance issues.
>>>
>>> Also note that things like the context-sensitivity obtained through
>>> pre-inlining (see Rong's original RFC) is simply not obtainable within a
>>> source-level instrumentation paradigm (even if we did something like the
>>> counter fusion discussed in "[llvm-dev] RFC: Pass to prune redundant
>>> profiling instrumentation" to reduce the overhead to that of IRPGO with
>>> pre-inlining). Thus FEPGO a.k.a. "coverage-level PGO" would nonetheless be
>>> at an inherent disadvantage.
>>>
>>>
>>>> Also, David's point about redundant work on FEPGO is a good one. We
>>>> don't want to continue maintaining two different PGO’s.
>>>>
>>>>
>>>> Are you implying that LLVM should drop FEPGO? It’s a totally sensible
>>>> thing to do to use your tests as training data for your profile generation.
>>>> It’s also a very valid thing to do to use your tests to do coverage. Xcode
>>>> does both of these things. I would see it a a big regression to not support
>>>> doing both at the same time (this would mean doubling compile+testing time
>>>> for users of both).
>>>>
>>>
>>> As David pointed out, training runs for PGO and coverage have different
>>> goals. I'm very skeptical of any testing that tries to do both at the same
>>> time, but this will continue to work (albeit without benefitting from any
>>> of the effort being put into IRPGO).
>>>
>>>
>>> For a high-end game, sure. Small apps are a different beast and it’s a
>>> reasonable thing to use your set of tests as training data. Again, all the
>>> workflows are not equivalent.
>>>
>>> As the instrumentation needs to stay there for coverage anyway, I expect
>>>> FEPGO to stay there and maintained (we care a lot about coverage). I’m not
>>>> saying that all the work going into IRPGO should be duplicated in FEPGO,
>>>> but what’s there and working should keep working.
>>>>
>>>
>>> For my users the reduced overhead of IRPGO is an important feature, and
>>> making it the default is important for that reason. Since most of the
>>> effort going into PGO is focused on IRPGO, this will lead to users using
>>> FEPGO ending up as a second-tier PGO, which Vedant said he specifically
>>> wanted to avoid. The only option to avoid this is for users to not be using
>>> FEPGO.
>>>
>>>
>>> I’d love to have everybody be able to enjoy the lower overhead, but if
>>> the answer to this is that they need to compile+train twice to get coverage
>>> and PGO, then the trade-off is wrong. I suppose you can agree that every
>>> user doesn’t have the same requirements as game developers. At least users
>>> should have the choice.
>>>
>>> Also, FEPGO has a lot of nice characteristics like resilience to IRGEN
>>>> changes. If you have archived profiles, then when you switch compilers your
>>>> performance shouldn’t degrade with FEPGO (modulo optimization bugs), while
>>>> it’s much more likely to degrade with IRPGO.
>>>>
>>>
>>> Note that this use case continues to work. I.e. we continue to apply
>>> existing frontend profiles correctly. including frontend profiles generated
>>> with -fcoverage-mapping, so that collecting coverage and PGO at the same
>>> time (although not advisable) still works. The only use case that breaks is
>>> merging existing indexed profiles, which is why we are specifically waiting
>>> for an answer on whether this is a requirement for you guys at Apple, which
>>> will determine what kind of deprecation period etc. will be needed before
>>> we can default it.
>>>
>>>
>>>> It overall looks like a much better option for people who do not need
>>>> the lower instrumentation overhead.
>>>>
>>>
>>> This is not just about lower instrumentation overhead. Things like the
>>> recently added static VP node allocation (which will e.g. make indirect
>>> callsite promotion for LTO'd kernels work) are other things are being
>>> missed out on.
>>>
>>>
>>>
>>>> I would actually make the IRPGO mode completely incompatible with the
>>>>> -fcoverage-mapping flag.
>>>>>
>>>>
>>>> I'm not sure what you mean by this. Nobody is proposing anything that
>>>> would make -fcoverage-mapping do anything related to IRPGO.
>>>>
>>>>
>>>> What I mean is that -f<whatever enables IRPGO> should error out when
>>>> passed at the same time as -fcoverage-mapping.
>>>>
>>>
>>> I think you're coming into this with the mindset that FEPGO will still
>>> be a possibility (outside of a build that is used for coverage mapping).
>>> I'm not convinced that we actually need to continue exposing that except as
>>> a weird thing in conjunction with coverage (and possibly for a deprecation
>>> period if users want to merge indexed profiles).
>>>
>>>
>>> FEPGO needs to be a possibility. And the instrumentation code needs to
>>> be there anyway for the coverage. Here’s a proposal:
>>>  - Add -fpgo-instr=[llvm|clang] to choose which kind of PGO
>>> instrumentation you want
>>>  - make -fpgo-instr=llvm incompatible with with -fcoverage-mapping
>>>  - deprecate -fprofile-instr-generate, and remove it after a couple
>>> releases
>>>
>>> (I really don’t care about the names)
>>>
>>> I think it is a better way forward to have -fprofile-instr-generate keep
>>> it’s current meaning and to have downstream toolchain providers have an
>>> internal patch to make it an alias to the one they prefer. Just because
>>> changing the meaning of options is a bad thing.
>>>
>>
>> Sure, I can deal with that. And actually this discussion has suggested
>> some good user-understandable names for the two different PGO's. I think
>> that "stable PGO" or "coverage-based PGO" or "source-level PGO" is a really
>> good name for FEPGO. "frontend" and "IR" aren't really meaningful to users.
>>
>> Based on what I have seen, FEPGO has the following reasons to keep it
>> around:
>> - it currently can be used together with coverage
>> - it has certain compatibility guarantees about profiles across compiler
>> versions
>>
>>
>>
>>> This also means that if the consensus is that -fprofile-instr-generate
>>> should really change its meaning to mean IRPGO, I’m open to having this
>>> internal patch on our side.
>>>
>>
>> Yeah, it sounds like someone is going to have to keep a "private patch"
>> to change the default. At that point doing a switch on the triple in
>> upstream seems preferable though :/
>>
>>
>> So I propose the following (which is equivalent to what you proposed, but
>> starting to put specific option names):
>> 1. Add -fprofile-instr-generate=stable and
>> -fprofile-instr-generate=unstable
>> a) Indexed profiles for -fprofile-instr-generate=stable are guaranteed to
>> be compatible across compiler releases (which will include existing
>> releases). Additionally, -fprofile-instr-generate=stable can be used in
>> combination with -fcoverage-mapping.
>> b) Indexed profiles for -fprofile-instr-generate=unstable are not
>> guaranteed to be compatible across compiler releases. It is an error to use
>> this in conjunction with -fcoverage-mapping
>> c) -fprofile-instr-generate defaults to one of the two in a way that
>> meets vendor needs, via private patches or an upstream switch on the triple
>> or whatever. It can eventually be deprecated or whatever.
>>
>> -fprofile-instr-generate=stable would basically correspond to FEPGO / FE
>> coverage instrumentation. -fprofile-instr-generate=unstable would basically
>> correspond to IRPGO.
>>
>> What do you think?
>>
>
>
> Sounds fine to me, though I am not a fan of using unstable in the option.
> I think a more meaningful way (that capture the essence of the difference)
> is the following naming:
> 1) FEPGO:  -fprofile-instr-generate=source or
> -fprofile-instr-generate=region
> 2) IR: -fprofile-instr-generate=cfg or -fprofile-instr-generate=graph
>
> Also since -fprofile-instr-generate= form is already used to specify raw
> profile path, we may need a different driver option.
>

Oops!


> Alternatives include
> 1) -fprofile-instrument=<...> -- this maps directly to the cc1 option we
> have today
> or
> 2) -fpgo-instr=<> -- suggested by Fred or
> 3) -fpgo-instr-method=<...>
>

I will probably use one of these in the patch.

-- Sean Silva


>
>
> Having the driver level option is the first good step forward.  In the
> near future, when performance of IRPGO is further tuned (e.g, better
> integration with inliner, more runtime perf win with -flto=thin, -flto
> etc), and the interests of the IRPGO is better aligned, we can revisit the
> default.
>
> thanks,
>
> David
>
>
>
>
>>
>> -- Sean Silva
>>
>>
>>>
>>> I also think a new option is cleaner (again I don’t care about the
>>> name), but if people feel that -fprofile-instr-generate=[llvm|clang] is
>>> easier, then this works for me too.
>>>
>>> Fred
>>>
>>>
>>> -- Sean Silva
>>>
>>>
>>>>
>>>> Thanks!
>>>> Fred
>>>>
>>>> -- Sean Silva
>>>>
>>>>
>>>>
>>>>>
>>>>> Fred
>>>>>
>>>>>
>>>>>
>>>>>>
>>>>>> At its core I don't think -fprofile-instr-generate *implies* FE-based
>>>>>> instrumentation. So, I'd like to see the driver do this (on all platforms):
>>>>>>
>>>>>>   * -fprofile-instr-generate: IR instrumentation
>>>>>>   * -fprofile-instr-generate=IR: IR instrumentation
>>>>>>   * -fprofile-instr-generate=FE: FE instrumentation
>>>>>>   * -fprofile-instr-generate -fcoverage-mapping: FE + coverage
>>>>>> instrumentation
>>>>>>
>>>>>> It's a bit ugly because the meaning of -fprofile-instr-generate
>>>>>> becomes context-sensitive. But, (1) it doesn't break existing common
>>>>>> workflows and (2) it makes it easier to ship IRPGO. The big caveat here is
>>>>>> that we'll need to wait a bit and see if our internal users are OK with
>>>>>> this.
>>>>>>
>>>>>
>>>>> Is there a reason to even have the possibility for FEPGO in the long
>>>>> run? From what I can tell, at most we would add a
>>>>> -fuse-the-old-pgo-because-i-want-to-merge-with-old-profiles option to
>>>>> hold people over until they can regenerate their profiles with the current
>>>>> compiler. We can add a flag to control what pre-instrumentation is done to
>>>>> retain the source-level robustness of FEPGO (e.g.
>>>>> -fpgo-no-simplify-before-instrumenting or something).
>>>>>
>>>>>
>>>>>> One alternative is to introduce a separate driver flag for IRPGO.
>>>>>> This might not work well for Sony's existing users. I'd be interested in
>>>>>> any feedback about this approach.
>>>>>>
>>>>>
>>>>> Personally, I would prefer to maintaining command line compatibility
>>>>> for PGO in Clang (i.e. users don't have to modify their build systems).
>>>>>
>>>>>
>>>>> -- Sean Silva
>>>>>
>>>>>
>>>>>>
>>>>>>
>>>>>> > I really don't like fragmenting things like this (e.g. if a
>>>>>> third-party tests "clang's" PGO they will get something different depending
>>>>>> on the platform), but I don't see another way given Apple's constraints.
>>>>>> >
>>>>>> > I'd like to see IRPGO to be the default as well, but the first
>>>>>> thing we need is a driver level option to make the switch (prof-gen) --
>>>>>> currently we rely on -Xclang option to switch between two modes, which is
>>>>>> less than ideal.
>>>>>> >
>>>>>> > If the concern from Apple is that the old profile still need to
>>>>>> work, then this is problem already solved. The reason is that
>>>>>> -fprofile-instr-use can automatically detect the type of the profile and
>>>>>> switch the mode.
>>>>>>
>>>>>> It's not just that. As Sean pointed out, we're concerned about old
>>>>>> profiles inter-operating poorly with new ones.
>>>>>>
>>>>>> thanks,
>>>>>> vedant
>>>>>>
>>>>>>
>>>>>> > - Pre-instrumentation passes
>>>>>> >
>>>>>> > Pre-instrumentation optimization has been critical for reducing the
>>>>>> overhead of PGO for the PS4 games we tested (as expected). However, in our
>>>>>> measurements (and we are glad to provide more info) the main benefit was
>>>>>> inlining (also as expected). A simple pass of inlining at threshold 100
>>>>>> appeared to give all the benefits. Even inlining at threshold 0 gave almost
>>>>>> all the benefits. For example, the passes initially proposed in
>>>>>> http://reviews.llvm.org/D15828did not improve over just inlining
>>>>>> with threshold 100.
>>>>>> >
>>>>>> > (due to PR27299 we also need to add simplifycfg after inlining to
>>>>>> clean up, but this doesn't affect the instrumentation overhead in our
>>>>>> measurements)
>>>>>> >
>>>>>> > Bottom line: for our use cases, inlining does all the work, but
>>>>>> we're not opposed to having more passes, which might be beneficial for
>>>>>> non-game workloads (which is most code).
>>>>>> >
>>>>>> >
>>>>>> >
>>>>>> > Yes, Rong is re-collecting performance data before submitting the
>>>>>> patch.
>>>>>> >
>>>>>> > - Warnings
>>>>>> >
>>>>>> > We identified 3 classes of issues which manifest as spammy warnings
>>>>>> when applying profile data with IRPGO (these affect FEPGO also I believe,
>>>>>> but we looked in depth at IRPGO):
>>>>>> >
>>>>>> > 1. The main concerning one is that getPGOFuncName mangles the
>>>>>> filename into the counter name. This causes us to get
>>>>>> instrprof_error::unknown_function when the pgo-use build is done in a
>>>>>> different build directory from the training build (which is a reasonable
>>>>>> thing to support). In this situation, PGO data is useless for all `static`
>>>>>> functions (and as a byproduct results in a huge volume of warnings).
>>>>>> >
>>>>>> > This can be enhanced with an user option to override the behavior.
>>>>>> Can you help filing a tracking bug?
>>>>>> >
>>>>>> >
>>>>>> > 2. In different TU's, pre-instr inlining might make different
>>>>>> inlining decisions (for example, different functions may be available for
>>>>>> inlining), causing hash mismatch errors (instrprof_error::hash_mismatch).
>>>>>> In building a large game, we only saw 8 instance of this, so it is not as
>>>>>> severe as 1, but would be good to fix.
>>>>>> >
>>>>>> >
>>>>>> > Rong has a patch addressing that -- will submit after cleanup pass
>>>>>> change is done.
>>>>>> >
>>>>>> >
>>>>>> > 3. A .cpp file may be compiled and put into an archive, but then
>>>>>> not selected by the linker and will therefore not result in a counter in
>>>>>> the profraw. When compiling this file with pgo-use,
>>>>>> instrprof_error::unknown_function will result and a warning will be emitted.
>>>>>> >
>>>>>> > yes -- this is a common problem to other compilers as well.
>>>>>> >
>>>>>> >
>>>>>> >
>>>>>> > Case 1 can be fixed using a function hash or other unique
>>>>>> identifier instead of a file path. David, in D20195 you mentioned that Rong
>>>>>> was working on a patch that would fix 2; we are looking forward to that.
>>>>>> >
>>>>>> >
>>>>>> > Right.
>>>>>> >
>>>>>> > For 3, I unfortunately do not know of any solution. I don't think
>>>>>> there is a way for us to make this warning reliable in the face of this
>>>>>> circumstance. So my conclusion is that instrprof_error::unknown_function at
>>>>>> least must be defaulted to off unfortunately.
>>>>>> >
>>>>>> > yes, this can be annoying. If the warnings can be buffered, then
>>>>>> the compiler can check if this is due to missing profile for the whole file
>>>>>> and can reduce the warnings into one single warning (source file has no
>>>>>> profile data).  Making it off by default sounds fine to me too if it is too
>>>>>> noisy.
>>>>>> >
>>>>>> > thanks,
>>>>>> >
>>>>>> > David
>>>>>> >
>>>>>> >
>>>>>> > -- Sean Silva
>>>>>>
>>>>>
>>>
>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20160603/d745790a/attachment-0001.html>


More information about the llvm-dev mailing list