[llvm-dev] The state of IRPGO (3 remaining work items)
Sean Silva via llvm-dev
llvm-dev at lists.llvm.org
Mon Jun 13 15:05:25 PDT 2016
Quick update. I've gotten derailed from posting a patch for this due to
focusing on higher priority PGO inlining work. No ETA.
-- Sean Silva
On Fri, Jun 3, 2016 at 6:06 PM, Sean Silva <chisophugis at gmail.com> wrote:
>
>
> On Thu, Jun 2, 2016 at 6:41 PM, Xinliang David Li <davidxl at google.com>
> wrote:
>
>>
>>
>> On Thu, Jun 2, 2016 at 5:30 PM, Sean Silva <chisophugis at gmail.com> wrote:
>>
>>>
>>>
>>> On Thu, Jun 2, 2016 at 2:51 PM, Frédéric Riss <friss at apple.com> wrote:
>>>
>>>>
>>>> On Jun 2, 2016, at 12:10 AM, Sean Silva <chisophugis at gmail.com> wrote:
>>>>
>>>>
>>>>
>>>> On Wed, Jun 1, 2016 at 5:46 PM, Frédéric Riss <friss at apple.com> wrote:
>>>>
>>>>>
>>>>> On Jun 1, 2016, at 1:46 PM, Sean Silva <chisophugis at gmail.com> wrote:
>>>>>
>>>>>
>>>>>
>>>>> On Tue, May 31, 2016 at 6:02 PM, Frédéric Riss <friss at apple.com>
>>>>> wrote:
>>>>>
>>>>>>
>>>>>> On May 24, 2016, at 5:21 PM, Sean Silva <chisophugis at gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Tue, May 24, 2016 at 3:41 PM, Vedant Kumar <vsk at apple.com> wrote:
>>>>>>
>>>>>>>
>>>>>>> > On May 23, 2016, at 8:56 PM, Xinliang David Li <davidxl at google.com>
>>>>>>> wrote:
>>>>>>> >
>>>>>>> > On Mon, May 23, 2016 at 8:23 PM, Sean Silva <chisophugis at gmail.com>
>>>>>>> wrote:
>>>>>>> > Jake and I have been integrating IRPGO on PS4, and we've
>>>>>>> identified 3 remaining work items.
>>>>>>> >
>>>>>>> > Sean, thanks for the write up. It matches very well with what we
>>>>>>> think as well.
>>>>>>>
>>>>>>> + 1
>>>>>>>
>>>>>>>
>>>>>>> > - Driver changes
>>>>>>> >
>>>>>>> > We'd like to make IRPGO the default on PS4. We also think that it
>>>>>>> would be beneficial to make IRPGO the default PGO on all platforms
>>>>>>> (coverage would continue to use FE instr as it does currently, of course).
>>>>>>> In previous conversations (e.g. http://reviews.llvm.org/D15829) it
>>>>>>> has come up that Apple have requirements that would prevent them from
>>>>>>> moving to IRPGO as the default PGO, at least without a deprecation period
>>>>>>> of one or two releases.
>>>>>>>
>>>>>>> Sean pointed out the problematic scenario in D15829 (in plan "C"):
>>>>>>>
>>>>>>> ```
>>>>>>> All existing user workflows continue to work, except for workflows
>>>>>>> that attempt to llvm-profdata merge some old frontend profile data (e.g.
>>>>>>> they have checked-in to version control and represents some special
>>>>>>> workload) with the profile data from new binaries.
>>>>>>> ```
>>>>>>>
>>>>>>> We can address this issue by (1) making sure llvm-profdata emits a
>>>>>>> helpful warning when merging an FE-based profile with an IR-based one, and
>>>>>>> (2) keeping an option to use FE instrumentation for PGO. Having (2) helps
>>>>>>> people who can't (or don't want) to switch to IR PGO.
>>>>>>>
>>>>>>>
>>>>>>> > I'd like to get consensus on a path forward.
>>>>>>> > As a point of discussion, how about we make IRPGO the default on
>>>>>>> all platforms except Apple platforms.
>>>>>>>
>>>>>>> I'd really rather not introduce this inconsistency. I'm worried that
>>>>>>> it might lead to Darwin becoming a second-tier platform for PGO.
>>>>>>>
>>>>>>> Fred (CC'd) is following up with some of our internal users to check
>>>>>>> if we can change the default behavior of -fprofile-instr-generate. He
>>>>>>> should be able to chime in on this soon.
>>>>>>>
>>>>>>
>>>>>> Sorry it took me so long.
>>>>>>
>>>>>
>>>>> Hi Fred,
>>>>>
>>>>> My understanding is that you were specifically investigating whether
>>>>> Apple needed compatibility for merging indexed profiles. Is that
>>>>> compatibility needed? The only compelling argument I have heard to continue
>>>>> to expose FEPGO is that Apple may have a compatibility requirement for
>>>>> merging indexed profiles from previous compiler versions.
>>>>>
>>>>>
>>>>> Sorry no, my comment had nothing to do with merging profiles. I
>>>>> understand that this will break, and it might very well be an issue for us,
>>>>> but I think there is a more fundamental issue with the proposed plan. As
>>>>> you bring it up though, this is a user visible breakage that shouldn’t be
>>>>> disregarded completely.
>>>>>
>>>>
>>>> Merging with existing indexed profiles is the only user-visible
>>>> breakage AFAIK (this was discussed at length in
>>>> http://reviews.llvm.org/D15829 and the corresponding email thread).
>>>> Please provide concrete examples where things would break.
>>>>
>>>>
>>>> I feel like we are talking past each other. It is a fact that “merging
>>>> existing indexed profiles” will break. You consider it to be a reasonable
>>>> breakage, I’m just saying that we shouldn't ignore this completely. It is a
>>>> known breakage so it should be considered when making the decision about
>>>> the new flags’ behavior.
>>>> We advise users to commit their profiles to VCS, so this could actually
>>>> be a real issue for them (Xcode won’t try to merge old/new profiles by
>>>> itself though, but users could do this). For those kind of changes where
>>>> the alternative is not a full drop-in replacement, I think deprecating the
>>>> option and replacing it with a new one is a better methodology.
>>>>
>>>>
>>>>
>>>>>
>>>>> Even if this is a requirement, then I still intend to make IRPGO the
>>>>> default and only PGO going forward, at least on PS4. I think that doing the
>>>>> same for all platforms in the upstream compiler probably makes sense as
>>>>> well, since an internal Apple vendor compatibility requirement should not
>>>>> penalize all users of the open source project.
>>>>>
>>>>>
>>>>> Again, I’m not expressing an Apple requirement, just trying to discuss
>>>>> the specifics of the proposed implementation. My goal is not to hinder
>>>>> anything, and I want our platforms to be able to use IRPGO reliably if
>>>>> users see the need for it.
>>>>>
>>>>
>>>> What I'm saying is that besides reduced training overhead (and the
>>>> inability to merge with older indexed profiles, which AFAIK is the only
>>>> actual potential requirement that would need a deprecation period for
>>>> FEPGO), IRPGO is basically "just a better PGO", so adding a frontend one
>>>> (except as something purely during a deprecation period) is
>>>> pointless. "just a better PGO" is what IRPGO is for my users. I don't want
>>>> to have to have them deal with (and I don't want to support) FEPGO.
>>>>
>>>>
>>>> Well we want to support FEPGO because it fits Xcode's workflow better
>>>> (mainly it allows to do PGO + coverage at the same time).
>>>>
>>>> Anything that will cause the existing flag to continue to produce FEPGO
>>>> on PS4 is not something that I'm really okay with. The reduced overhead of
>>>> IRPGO is really important on PS4 (i.e. the difference between the
>>>> instrumented game being playable or not). I really don't want to have to
>>>> test the triple to determine the meaning of `-fprofile-instr-generate`
>>>> (without `-fcoverage-mapping`).
>>>>
>>>>
>>>> And I agree that IRPGO is better for your users, but the best workflow
>>>> for your users is not necessarily the best workflow for every clang user. I
>>>> totally agree with you that testing the triple to decide the meaning of the
>>>> option is wrong to do upstream. The best default is not a platform choice,
>>>> it’s a workflow decision.
>>>>
>>>>
>>>>> I’ve discussed the change in behavior quiet extensively, and I after
>>>>>> having changed my mind a couple times, I would argue in favor of keeping
>>>>>> the current behavior for the existing flags. I think adding a new switch
>>>>>> for IRPGO is a better option. The argument that weighted most on my opinion
>>>>>> is the proposed interaction with -fcoverage-mapping, and it is not at all
>>>>>> platform specific. With the proposed new behavior, turning coverage on and
>>>>>> off in your build system will generate a binary with different performance
>>>>>> characteristics and this feels really wrong.
>>>>>>
>>>>>
>>>>> Bob already mentioned in the other thread that
>>>>> `-fprofile-instr-generate -fcoverage-mapping` was sufficiently different
>>>>> from `-fprofile-instr-generate` that `-fprofile-instr-generate
>>>>> -fcoverage-mapping` was not an acceptable workaround that could be used for
>>>>> enabling FEPGO during a transitionary period, so I'm not convinced that
>>>>> your argument here makes sense.
>>>>>
>>>>>
>>>>> I’m not sure what you’re referring to here, and I have a hard time
>>>>> parsing the sentence. I suppose “was not an acceptable” should read “was an
>>>>> acceptable”? I would be surprised that Bob ever agreed to completely
>>>>> transition away from FEPGO. I didn’t even understand that getting rid of
>>>>> FEPGO was on table as you seem to imply bellow.
>>>>>
>>>>
>>>> No, it is written as intended. The backstory is in
>>>> http://reviews.llvm.org/D15829 (and the corresponding email thread).
>>>> The paragraph starting with "The coverage mapping adds considerable cost.”.
>>>>
>>>>
>>>> Thanks for the pointer, it turns out this part of the thread never made
>>>> it to my mailbox. What Bob is saying is that we need an option to turn on
>>>> FEPGO without coverage, and this is basically exactly what I think we
>>>> should do: add an explicit option to choose the instrumentation type, so
>>>> that we can use FEPGO even without coverage in the new IRPGO world.
>>>>
>>>>
>>>>> I also share David's opinion that this is not going to be an issue in
>>>>> practice. I think it makes sense for PGO and coverage to have different
>>>>> overheads. Coverage inherently has to trace all locations at source level,
>>>>> while PGO has more freedom.
>>>>>
>>>>>
>>>>> I’m sorry if I wasn’t clear, but I’m not talking about instrumentation
>>>>> overhead, I’m talking about the performance of the binary generated using
>>>>> the profiles. If we go the route of making the meaning of
>>>>> -fprofile-instr-generate depend on whether -fcoverage-mapping gets passed,
>>>>> then we change the kind of instrumentation and thus the input to the
>>>>> optimizations behind the user’s back. I wouldn’t be surprised that using
>>>>> profiles generated by FEPGO and IRPGO give you a final executable with
>>>>> measurably different performance characteristics.
>>>>>
>>>>
>>>> I think the point is that given the effort being put into IRPGO, the
>>>> IRPGO version will always be a faster final executable.
>>>>
>>>>
>>>> 2 things:
>>>> - Do you have strong evidence of this being true in all cases? I’m not
>>>> looking forward to when I’ll transition Apple’s clang to IRPGO, because I
>>>> know that we’ll have to spend time characterizing the performance of the
>>>> compiler. It might be faster in all but a couple benchmarks, we still will
>>>> need to investigate what happened there to see if we can fix it. I’m not
>>>> saying it’s bad (I’m actually also looking forward to evaluate it, because
>>>> it might give better results for us), but it’s definitely not a benign
>>>> change that I would do without double checking the results. And as such, I
>>>> also wouldn’t want the compiler to silently do this change behind my back.
>>>> - Even if it is 100% true, then users will see a degradation in
>>>> performance if they add -fcoverage-mapping to their CFLAGS. I think this is
>>>> just bad user experience. The user should be able to decide if he wants to
>>>> be able to compromise some performance to be able to do PGO+coverage at the
>>>> same time.
>>>>
>>>> Why provide a "worse" PGO option?
>>>>
>>>>
>>>> Worse from your point of view. The resilience to compiler changes that
>>>> FEPGO currently offers is a real feature even if it doesn’t have a place in
>>>> Sony’s workflow.
>>>>
>>>>
>>>>
>>>>> If you’re tracking your performance, this can be really painful.
>>>>> Recently we wasted days investigating performance regressions that were due
>>>>> to buggy profiles. I strongly believe having an option seemingly unrelated
>>>>> to PGO change this behavior is wrong and can cause actual pain for our end
>>>>> users.
>>>>>
>>>>
>>>> After a deprecation period we can force `-fprofile-instr-generate` and
>>>> `-fcoverage-mapping` to be mutually exclusive if necessary. Does this solve
>>>> your problem?
>>>>
>>>>
>>>> We don’t want a “deprecation period”. Unless IRPGO evolves in something
>>>> that can do precise coverage and PGO in the same run, FEPGO can’t be
>>>> deprecated (as in slated for removal).
>>>>
>>>> Actually, I think it makes a lot of sense in some respects for
>>>> `-fprofile-instr-generate` and `-fprofile-instr-generate
>>>> -fcoverage-mapping` to be IRPGO and FEPGO/coverage. The difference from a
>>>> user's perspective is basically "is the instrumentation inserted by the
>>>> compiler constrained to have source-level coverage, or does the compiler
>>>> not have this restriction". Although as I've said, I'm not a fan of
>>>> supporting FEPGO in the long-term due to maintenance issues.
>>>>
>>>> Also note that things like the context-sensitivity obtained through
>>>> pre-inlining (see Rong's original RFC) is simply not obtainable within a
>>>> source-level instrumentation paradigm (even if we did something like the
>>>> counter fusion discussed in "[llvm-dev] RFC: Pass to prune redundant
>>>> profiling instrumentation" to reduce the overhead to that of IRPGO with
>>>> pre-inlining). Thus FEPGO a.k.a. "coverage-level PGO" would nonetheless be
>>>> at an inherent disadvantage.
>>>>
>>>>
>>>>> Also, David's point about redundant work on FEPGO is a good one. We
>>>>> don't want to continue maintaining two different PGO’s.
>>>>>
>>>>>
>>>>> Are you implying that LLVM should drop FEPGO? It’s a totally sensible
>>>>> thing to do to use your tests as training data for your profile generation.
>>>>> It’s also a very valid thing to do to use your tests to do coverage. Xcode
>>>>> does both of these things. I would see it a a big regression to not support
>>>>> doing both at the same time (this would mean doubling compile+testing time
>>>>> for users of both).
>>>>>
>>>>
>>>> As David pointed out, training runs for PGO and coverage have different
>>>> goals. I'm very skeptical of any testing that tries to do both at the same
>>>> time, but this will continue to work (albeit without benefitting from any
>>>> of the effort being put into IRPGO).
>>>>
>>>>
>>>> For a high-end game, sure. Small apps are a different beast and it’s a
>>>> reasonable thing to use your set of tests as training data. Again, all the
>>>> workflows are not equivalent.
>>>>
>>>> As the instrumentation needs to stay there for coverage anyway, I
>>>>> expect FEPGO to stay there and maintained (we care a lot about coverage).
>>>>> I’m not saying that all the work going into IRPGO should be duplicated in
>>>>> FEPGO, but what’s there and working should keep working.
>>>>>
>>>>
>>>> For my users the reduced overhead of IRPGO is an important feature, and
>>>> making it the default is important for that reason. Since most of the
>>>> effort going into PGO is focused on IRPGO, this will lead to users using
>>>> FEPGO ending up as a second-tier PGO, which Vedant said he specifically
>>>> wanted to avoid. The only option to avoid this is for users to not be using
>>>> FEPGO.
>>>>
>>>>
>>>> I’d love to have everybody be able to enjoy the lower overhead, but if
>>>> the answer to this is that they need to compile+train twice to get coverage
>>>> and PGO, then the trade-off is wrong. I suppose you can agree that every
>>>> user doesn’t have the same requirements as game developers. At least users
>>>> should have the choice.
>>>>
>>>> Also, FEPGO has a lot of nice characteristics like resilience to IRGEN
>>>>> changes. If you have archived profiles, then when you switch compilers your
>>>>> performance shouldn’t degrade with FEPGO (modulo optimization bugs), while
>>>>> it’s much more likely to degrade with IRPGO.
>>>>>
>>>>
>>>> Note that this use case continues to work. I.e. we continue to apply
>>>> existing frontend profiles correctly. including frontend profiles generated
>>>> with -fcoverage-mapping, so that collecting coverage and PGO at the same
>>>> time (although not advisable) still works. The only use case that breaks is
>>>> merging existing indexed profiles, which is why we are specifically waiting
>>>> for an answer on whether this is a requirement for you guys at Apple, which
>>>> will determine what kind of deprecation period etc. will be needed before
>>>> we can default it.
>>>>
>>>>
>>>>> It overall looks like a much better option for people who do not need
>>>>> the lower instrumentation overhead.
>>>>>
>>>>
>>>> This is not just about lower instrumentation overhead. Things like the
>>>> recently added static VP node allocation (which will e.g. make indirect
>>>> callsite promotion for LTO'd kernels work) are other things are being
>>>> missed out on.
>>>>
>>>>
>>>>
>>>>> I would actually make the IRPGO mode completely incompatible with the
>>>>>> -fcoverage-mapping flag.
>>>>>>
>>>>>
>>>>> I'm not sure what you mean by this. Nobody is proposing anything that
>>>>> would make -fcoverage-mapping do anything related to IRPGO.
>>>>>
>>>>>
>>>>> What I mean is that -f<whatever enables IRPGO> should error out when
>>>>> passed at the same time as -fcoverage-mapping.
>>>>>
>>>>
>>>> I think you're coming into this with the mindset that FEPGO will still
>>>> be a possibility (outside of a build that is used for coverage mapping).
>>>> I'm not convinced that we actually need to continue exposing that except as
>>>> a weird thing in conjunction with coverage (and possibly for a deprecation
>>>> period if users want to merge indexed profiles).
>>>>
>>>>
>>>> FEPGO needs to be a possibility. And the instrumentation code needs to
>>>> be there anyway for the coverage. Here’s a proposal:
>>>> - Add -fpgo-instr=[llvm|clang] to choose which kind of PGO
>>>> instrumentation you want
>>>> - make -fpgo-instr=llvm incompatible with with -fcoverage-mapping
>>>> - deprecate -fprofile-instr-generate, and remove it after a couple
>>>> releases
>>>>
>>>> (I really don’t care about the names)
>>>>
>>>> I think it is a better way forward to have -fprofile-instr-generate
>>>> keep it’s current meaning and to have downstream toolchain providers have
>>>> an internal patch to make it an alias to the one they prefer. Just because
>>>> changing the meaning of options is a bad thing.
>>>>
>>>
>>> Sure, I can deal with that. And actually this discussion has suggested
>>> some good user-understandable names for the two different PGO's. I think
>>> that "stable PGO" or "coverage-based PGO" or "source-level PGO" is a really
>>> good name for FEPGO. "frontend" and "IR" aren't really meaningful to users.
>>>
>>> Based on what I have seen, FEPGO has the following reasons to keep it
>>> around:
>>> - it currently can be used together with coverage
>>> - it has certain compatibility guarantees about profiles across compiler
>>> versions
>>>
>>>
>>>
>>>> This also means that if the consensus is that -fprofile-instr-generate
>>>> should really change its meaning to mean IRPGO, I’m open to having this
>>>> internal patch on our side.
>>>>
>>>
>>> Yeah, it sounds like someone is going to have to keep a "private patch"
>>> to change the default. At that point doing a switch on the triple in
>>> upstream seems preferable though :/
>>>
>>>
>>> So I propose the following (which is equivalent to what you proposed,
>>> but starting to put specific option names):
>>> 1. Add -fprofile-instr-generate=stable and
>>> -fprofile-instr-generate=unstable
>>> a) Indexed profiles for -fprofile-instr-generate=stable are guaranteed
>>> to be compatible across compiler releases (which will include existing
>>> releases). Additionally, -fprofile-instr-generate=stable can be used in
>>> combination with -fcoverage-mapping.
>>> b) Indexed profiles for -fprofile-instr-generate=unstable are not
>>> guaranteed to be compatible across compiler releases. It is an error to use
>>> this in conjunction with -fcoverage-mapping
>>> c) -fprofile-instr-generate defaults to one of the two in a way that
>>> meets vendor needs, via private patches or an upstream switch on the triple
>>> or whatever. It can eventually be deprecated or whatever.
>>>
>>> -fprofile-instr-generate=stable would basically correspond to FEPGO / FE
>>> coverage instrumentation. -fprofile-instr-generate=unstable would basically
>>> correspond to IRPGO.
>>>
>>> What do you think?
>>>
>>
>>
>> Sounds fine to me, though I am not a fan of using unstable in the option.
>> I think a more meaningful way (that capture the essence of the difference)
>> is the following naming:
>> 1) FEPGO: -fprofile-instr-generate=source or
>> -fprofile-instr-generate=region
>> 2) IR: -fprofile-instr-generate=cfg or -fprofile-instr-generate=graph
>>
>> Also since -fprofile-instr-generate= form is already used to specify raw
>> profile path, we may need a different driver option.
>>
>
> Oops!
>
>
>> Alternatives include
>> 1) -fprofile-instrument=<...> -- this maps directly to the cc1 option we
>> have today
>> or
>> 2) -fpgo-instr=<> -- suggested by Fred or
>> 3) -fpgo-instr-method=<...>
>>
>
> I will probably use one of these in the patch.
>
> -- Sean Silva
>
>
>>
>>
>> Having the driver level option is the first good step forward. In the
>> near future, when performance of IRPGO is further tuned (e.g, better
>> integration with inliner, more runtime perf win with -flto=thin, -flto
>> etc), and the interests of the IRPGO is better aligned, we can revisit the
>> default.
>>
>> thanks,
>>
>> David
>>
>>
>>
>>
>>>
>>> -- Sean Silva
>>>
>>>
>>>>
>>>> I also think a new option is cleaner (again I don’t care about the
>>>> name), but if people feel that -fprofile-instr-generate=[llvm|clang] is
>>>> easier, then this works for me too.
>>>>
>>>> Fred
>>>>
>>>>
>>>> -- Sean Silva
>>>>
>>>>
>>>>>
>>>>> Thanks!
>>>>> Fred
>>>>>
>>>>> -- Sean Silva
>>>>>
>>>>>
>>>>>
>>>>>>
>>>>>> Fred
>>>>>>
>>>>>>
>>>>>>
>>>>>>>
>>>>>>> At its core I don't think -fprofile-instr-generate *implies*
>>>>>>> FE-based instrumentation. So, I'd like to see the driver do this (on all
>>>>>>> platforms):
>>>>>>>
>>>>>>> * -fprofile-instr-generate: IR instrumentation
>>>>>>> * -fprofile-instr-generate=IR: IR instrumentation
>>>>>>> * -fprofile-instr-generate=FE: FE instrumentation
>>>>>>> * -fprofile-instr-generate -fcoverage-mapping: FE + coverage
>>>>>>> instrumentation
>>>>>>>
>>>>>>> It's a bit ugly because the meaning of -fprofile-instr-generate
>>>>>>> becomes context-sensitive. But, (1) it doesn't break existing common
>>>>>>> workflows and (2) it makes it easier to ship IRPGO. The big caveat here is
>>>>>>> that we'll need to wait a bit and see if our internal users are OK with
>>>>>>> this.
>>>>>>>
>>>>>>
>>>>>> Is there a reason to even have the possibility for FEPGO in the long
>>>>>> run? From what I can tell, at most we would add a
>>>>>> -fuse-the-old-pgo-because-i-want-to-merge-with-old-profiles option
>>>>>> to hold people over until they can regenerate their profiles with the
>>>>>> current compiler. We can add a flag to control what pre-instrumentation is
>>>>>> done to retain the source-level robustness of FEPGO (e.g.
>>>>>> -fpgo-no-simplify-before-instrumenting or something).
>>>>>>
>>>>>>
>>>>>>> One alternative is to introduce a separate driver flag for IRPGO.
>>>>>>> This might not work well for Sony's existing users. I'd be interested in
>>>>>>> any feedback about this approach.
>>>>>>>
>>>>>>
>>>>>> Personally, I would prefer to maintaining command line compatibility
>>>>>> for PGO in Clang (i.e. users don't have to modify their build systems).
>>>>>>
>>>>>>
>>>>>> -- Sean Silva
>>>>>>
>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> > I really don't like fragmenting things like this (e.g. if a
>>>>>>> third-party tests "clang's" PGO they will get something different depending
>>>>>>> on the platform), but I don't see another way given Apple's constraints.
>>>>>>> >
>>>>>>> > I'd like to see IRPGO to be the default as well, but the first
>>>>>>> thing we need is a driver level option to make the switch (prof-gen) --
>>>>>>> currently we rely on -Xclang option to switch between two modes, which is
>>>>>>> less than ideal.
>>>>>>> >
>>>>>>> > If the concern from Apple is that the old profile still need to
>>>>>>> work, then this is problem already solved. The reason is that
>>>>>>> -fprofile-instr-use can automatically detect the type of the profile and
>>>>>>> switch the mode.
>>>>>>>
>>>>>>> It's not just that. As Sean pointed out, we're concerned about old
>>>>>>> profiles inter-operating poorly with new ones.
>>>>>>>
>>>>>>> thanks,
>>>>>>> vedant
>>>>>>>
>>>>>>>
>>>>>>> > - Pre-instrumentation passes
>>>>>>> >
>>>>>>> > Pre-instrumentation optimization has been critical for reducing
>>>>>>> the overhead of PGO for the PS4 games we tested (as expected). However, in
>>>>>>> our measurements (and we are glad to provide more info) the main benefit
>>>>>>> was inlining (also as expected). A simple pass of inlining at threshold 100
>>>>>>> appeared to give all the benefits. Even inlining at threshold 0 gave almost
>>>>>>> all the benefits. For example, the passes initially proposed in
>>>>>>> http://reviews.llvm.org/D15828did not improve over just inlining
>>>>>>> with threshold 100.
>>>>>>> >
>>>>>>> > (due to PR27299 we also need to add simplifycfg after inlining to
>>>>>>> clean up, but this doesn't affect the instrumentation overhead in our
>>>>>>> measurements)
>>>>>>> >
>>>>>>> > Bottom line: for our use cases, inlining does all the work, but
>>>>>>> we're not opposed to having more passes, which might be beneficial for
>>>>>>> non-game workloads (which is most code).
>>>>>>> >
>>>>>>> >
>>>>>>> >
>>>>>>> > Yes, Rong is re-collecting performance data before submitting the
>>>>>>> patch.
>>>>>>> >
>>>>>>> > - Warnings
>>>>>>> >
>>>>>>> > We identified 3 classes of issues which manifest as spammy
>>>>>>> warnings when applying profile data with IRPGO (these affect FEPGO also I
>>>>>>> believe, but we looked in depth at IRPGO):
>>>>>>> >
>>>>>>> > 1. The main concerning one is that getPGOFuncName mangles the
>>>>>>> filename into the counter name. This causes us to get
>>>>>>> instrprof_error::unknown_function when the pgo-use build is done in a
>>>>>>> different build directory from the training build (which is a reasonable
>>>>>>> thing to support). In this situation, PGO data is useless for all `static`
>>>>>>> functions (and as a byproduct results in a huge volume of warnings).
>>>>>>> >
>>>>>>> > This can be enhanced with an user option to override the behavior.
>>>>>>> Can you help filing a tracking bug?
>>>>>>> >
>>>>>>> >
>>>>>>> > 2. In different TU's, pre-instr inlining might make different
>>>>>>> inlining decisions (for example, different functions may be available for
>>>>>>> inlining), causing hash mismatch errors (instrprof_error::hash_mismatch).
>>>>>>> In building a large game, we only saw 8 instance of this, so it is not as
>>>>>>> severe as 1, but would be good to fix.
>>>>>>> >
>>>>>>> >
>>>>>>> > Rong has a patch addressing that -- will submit after cleanup pass
>>>>>>> change is done.
>>>>>>> >
>>>>>>> >
>>>>>>> > 3. A .cpp file may be compiled and put into an archive, but then
>>>>>>> not selected by the linker and will therefore not result in a counter in
>>>>>>> the profraw. When compiling this file with pgo-use,
>>>>>>> instrprof_error::unknown_function will result and a warning will be emitted.
>>>>>>> >
>>>>>>> > yes -- this is a common problem to other compilers as well.
>>>>>>> >
>>>>>>> >
>>>>>>> >
>>>>>>> > Case 1 can be fixed using a function hash or other unique
>>>>>>> identifier instead of a file path. David, in D20195 you mentioned that Rong
>>>>>>> was working on a patch that would fix 2; we are looking forward to that.
>>>>>>> >
>>>>>>> >
>>>>>>> > Right.
>>>>>>> >
>>>>>>> > For 3, I unfortunately do not know of any solution. I don't think
>>>>>>> there is a way for us to make this warning reliable in the face of this
>>>>>>> circumstance. So my conclusion is that instrprof_error::unknown_function at
>>>>>>> least must be defaulted to off unfortunately.
>>>>>>> >
>>>>>>> > yes, this can be annoying. If the warnings can be buffered, then
>>>>>>> the compiler can check if this is due to missing profile for the whole file
>>>>>>> and can reduce the warnings into one single warning (source file has no
>>>>>>> profile data). Making it off by default sounds fine to me too if it is too
>>>>>>> noisy.
>>>>>>> >
>>>>>>> > thanks,
>>>>>>> >
>>>>>>> > David
>>>>>>> >
>>>>>>> >
>>>>>>> > -- Sean Silva
>>>>>>>
>>>>>>
>>>>
>>>
>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20160613/f257e341/attachment-0001.html>
More information about the llvm-dev
mailing list