[llvm-dev] The state of IRPGO (3 remaining work items)

Jake VanAdrighem via llvm-dev llvm-dev at lists.llvm.org
Thu Jun 23 01:39:13 PDT 2016


On Mon, Jun 13, 2016 at 3:07 PM, Xinliang David Li <davidxl at google.com>
wrote:

> No problem. Inliner work is certainly higher priority. Rong can help with
> the option related work.
>
>
FYI I've started working on a patch for the driver changes and would be
happy to take care of it. I'll post the patch in the next week or so.

Jake


> thanks,
> David
>
> On Mon, Jun 13, 2016 at 3:05 PM, Sean Silva <chisophugis at gmail.com> wrote:
>
>> Quick update. I've gotten derailed from posting a patch for this due to
>> focusing on higher priority PGO inlining work. No ETA.
>>
>> -- Sean Silva
>>
>>
>> On Fri, Jun 3, 2016 at 6:06 PM, Sean Silva <chisophugis at gmail.com> wrote:
>>
>>>
>>>
>>> On Thu, Jun 2, 2016 at 6:41 PM, Xinliang David Li <davidxl at google.com>
>>> wrote:
>>>
>>>>
>>>>
>>>> On Thu, Jun 2, 2016 at 5:30 PM, Sean Silva <chisophugis at gmail.com>
>>>> wrote:
>>>>
>>>>>
>>>>>
>>>>> On Thu, Jun 2, 2016 at 2:51 PM, Frédéric Riss <friss at apple.com> wrote:
>>>>>
>>>>>>
>>>>>> On Jun 2, 2016, at 12:10 AM, Sean Silva <chisophugis at gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Wed, Jun 1, 2016 at 5:46 PM, Frédéric Riss <friss at apple.com>
>>>>>> wrote:
>>>>>>
>>>>>>>
>>>>>>> On Jun 1, 2016, at 1:46 PM, Sean Silva <chisophugis at gmail.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Tue, May 31, 2016 at 6:02 PM, Frédéric Riss <friss at apple.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>>
>>>>>>>> On May 24, 2016, at 5:21 PM, Sean Silva <chisophugis at gmail.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Tue, May 24, 2016 at 3:41 PM, Vedant Kumar <vsk at apple.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>>
>>>>>>>>> > On May 23, 2016, at 8:56 PM, Xinliang David Li <
>>>>>>>>> davidxl at google.com> wrote:
>>>>>>>>> >
>>>>>>>>> > On Mon, May 23, 2016 at 8:23 PM, Sean Silva <
>>>>>>>>> chisophugis at gmail.com> wrote:
>>>>>>>>> > Jake and I have been integrating IRPGO on PS4, and we've
>>>>>>>>> identified 3 remaining work items.
>>>>>>>>> >
>>>>>>>>> > Sean, thanks for the write up. It matches very well with what we
>>>>>>>>> think as well.
>>>>>>>>>
>>>>>>>>> + 1
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> > - Driver changes
>>>>>>>>> >
>>>>>>>>> > We'd like to make IRPGO the default on PS4. We also think that
>>>>>>>>> it would be beneficial to make IRPGO the default PGO on all platforms
>>>>>>>>> (coverage would continue to use FE instr as it does currently, of course).
>>>>>>>>> In previous conversations (e.g. http://reviews.llvm.org/D15829)
>>>>>>>>> it has come up that Apple have requirements that would prevent them from
>>>>>>>>> moving to IRPGO as the default PGO, at least without a deprecation period
>>>>>>>>> of one or two releases.
>>>>>>>>>
>>>>>>>>> Sean pointed out the problematic scenario in D15829 (in plan "C"):
>>>>>>>>>
>>>>>>>>> ```
>>>>>>>>> All existing user workflows continue to work, except for workflows
>>>>>>>>> that attempt to llvm-profdata merge some old frontend profile data (e.g.
>>>>>>>>> they have checked-in to version control and represents some special
>>>>>>>>> workload) with the profile data from new binaries.
>>>>>>>>> ```
>>>>>>>>>
>>>>>>>>> We can address this issue by (1) making sure llvm-profdata emits a
>>>>>>>>> helpful warning when merging an FE-based profile with an IR-based one, and
>>>>>>>>> (2) keeping an option to use FE instrumentation for PGO. Having (2) helps
>>>>>>>>> people who can't (or don't want) to switch to IR PGO.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> > I'd like to get consensus on a path forward.
>>>>>>>>> > As a point of discussion, how about we make IRPGO the default on
>>>>>>>>> all platforms except Apple platforms.
>>>>>>>>>
>>>>>>>>> I'd really rather not introduce this inconsistency. I'm worried
>>>>>>>>> that it might lead to Darwin becoming a second-tier platform for PGO.
>>>>>>>>>
>>>>>>>>> Fred (CC'd) is following up with some of our internal users to
>>>>>>>>> check if we can change the default behavior of -fprofile-instr-generate. He
>>>>>>>>> should be able to chime in on this soon.
>>>>>>>>>
>>>>>>>>
>>>>>>>> Sorry it took me so long.
>>>>>>>>
>>>>>>>
>>>>>>> Hi Fred,
>>>>>>>
>>>>>>> My understanding is that you were specifically investigating whether
>>>>>>> Apple needed compatibility for merging indexed profiles. Is that
>>>>>>> compatibility needed? The only compelling argument I have heard to continue
>>>>>>> to expose FEPGO is that Apple may have a compatibility requirement for
>>>>>>> merging indexed profiles from previous compiler versions.
>>>>>>>
>>>>>>>
>>>>>>> Sorry no, my comment had nothing to do with merging profiles. I
>>>>>>> understand that this will break, and it might very well be an issue for us,
>>>>>>> but I think there is a more fundamental issue with the proposed plan. As
>>>>>>> you bring it up though, this is a user visible breakage that shouldn’t be
>>>>>>> disregarded completely.
>>>>>>>
>>>>>>
>>>>>> Merging with existing indexed profiles is the only user-visible
>>>>>> breakage AFAIK (this was discussed at length in
>>>>>> http://reviews.llvm.org/D15829 and the corresponding email thread).
>>>>>> Please provide concrete examples where things would break.
>>>>>>
>>>>>>
>>>>>> I feel like we are talking past each other. It is a fact that
>>>>>> “merging existing indexed profiles” will break. You consider it to be a
>>>>>> reasonable breakage, I’m just saying that we shouldn't ignore this
>>>>>> completely. It is a known breakage so it should be considered when making
>>>>>> the decision about the new flags’ behavior.
>>>>>> We advise users to commit their profiles to VCS, so this could
>>>>>> actually be a real issue for them (Xcode won’t try to merge old/new
>>>>>> profiles by itself though, but users could do this). For those kind of
>>>>>> changes where the alternative is not a full drop-in replacement, I think
>>>>>> deprecating the option and replacing it with a new one is a better
>>>>>> methodology.
>>>>>>
>>>>>>
>>>>>>
>>>>>>>
>>>>>>> Even if this is a requirement, then I still intend to make IRPGO the
>>>>>>> default and only PGO going forward, at least on PS4. I think that doing the
>>>>>>> same for all platforms in the upstream compiler probably makes sense as
>>>>>>> well, since an internal Apple vendor compatibility requirement should not
>>>>>>> penalize all users of the open source project.
>>>>>>>
>>>>>>>
>>>>>>> Again, I’m not expressing an Apple requirement, just trying to
>>>>>>> discuss the specifics of the proposed implementation. My goal is not to
>>>>>>> hinder anything, and I want our platforms to be able to use IRPGO reliably
>>>>>>> if users see the need for it.
>>>>>>>
>>>>>>
>>>>>> What I'm saying is that besides reduced training overhead (and the
>>>>>> inability to merge with older indexed profiles, which AFAIK is the only
>>>>>> actual potential requirement that would need a deprecation period for
>>>>>> FEPGO), IRPGO is basically "just a better PGO", so adding a frontend one
>>>>>> (except as something purely during a deprecation period) is
>>>>>> pointless. "just a better PGO" is what IRPGO is for my users. I don't want
>>>>>> to have to have them deal with (and I don't want to support) FEPGO.
>>>>>>
>>>>>>
>>>>>> Well we want to support FEPGO because it fits Xcode's workflow better
>>>>>> (mainly it allows to do PGO + coverage at the same time).
>>>>>>
>>>>>> Anything that will cause the existing flag to continue to produce
>>>>>> FEPGO on PS4 is not something that I'm really okay with. The reduced
>>>>>> overhead of IRPGO is really important on PS4 (i.e. the difference between
>>>>>> the instrumented game being playable or not). I really don't want to have
>>>>>> to test the triple to determine the meaning of `-fprofile-instr-generate`
>>>>>> (without `-fcoverage-mapping`).
>>>>>>
>>>>>>
>>>>>> And I agree that IRPGO is better for your users, but the best
>>>>>> workflow for your users is not necessarily the best workflow for every
>>>>>> clang user. I totally agree with you that testing the triple to decide the
>>>>>> meaning of the option is wrong to do upstream. The best default is not a
>>>>>> platform choice, it’s a workflow decision.
>>>>>>
>>>>>>
>>>>>>> I’ve discussed the change in behavior quiet extensively, and I after
>>>>>>>> having changed my mind a couple times, I would argue in favor of keeping
>>>>>>>> the current behavior for the existing flags. I think adding a new switch
>>>>>>>> for IRPGO is a better option. The argument that weighted most on my opinion
>>>>>>>> is the proposed interaction with -fcoverage-mapping, and it is not at all
>>>>>>>> platform specific. With the proposed new behavior, turning coverage on and
>>>>>>>> off in your build system will generate a binary with different performance
>>>>>>>> characteristics and this feels really wrong.
>>>>>>>>
>>>>>>>
>>>>>>> Bob already mentioned in the other thread that
>>>>>>> `-fprofile-instr-generate -fcoverage-mapping` was sufficiently different
>>>>>>> from `-fprofile-instr-generate` that `-fprofile-instr-generate
>>>>>>> -fcoverage-mapping` was not an acceptable workaround that could be used for
>>>>>>> enabling FEPGO during a transitionary period, so I'm not convinced that
>>>>>>> your argument here makes sense.
>>>>>>>
>>>>>>>
>>>>>>> I’m not sure what you’re referring to here, and I have a hard time
>>>>>>> parsing the sentence. I suppose “was not an acceptable” should read “was an
>>>>>>> acceptable”? I would be surprised that Bob ever agreed to completely
>>>>>>> transition away from FEPGO. I didn’t even understand that getting rid of
>>>>>>> FEPGO was on table as you seem to imply bellow.
>>>>>>>
>>>>>>
>>>>>> No, it is written as intended. The backstory is in
>>>>>> http://reviews.llvm.org/D15829 (and the corresponding email thread).
>>>>>> The paragraph starting with "The coverage mapping adds considerable cost.”.
>>>>>>
>>>>>>
>>>>>> Thanks for the pointer, it turns out this part of the thread never
>>>>>> made it to my mailbox. What Bob is saying is that we need an option to turn
>>>>>> on FEPGO without coverage, and this is basically exactly what I think we
>>>>>> should do: add an explicit option to choose the instrumentation type, so
>>>>>> that we can use FEPGO even without coverage in the new IRPGO world.
>>>>>>
>>>>>>
>>>>>>> I also share David's opinion that this is not going to be an issue
>>>>>>> in practice. I think it makes sense for PGO and coverage to have different
>>>>>>> overheads. Coverage inherently has to trace all locations at source level,
>>>>>>> while PGO has more freedom.
>>>>>>>
>>>>>>>
>>>>>>> I’m sorry if I wasn’t clear, but I’m not talking about
>>>>>>> instrumentation overhead, I’m talking about the performance of the binary
>>>>>>> generated using the profiles. If we go the route of making the meaning of
>>>>>>> -fprofile-instr-generate depend on whether -fcoverage-mapping gets passed,
>>>>>>> then we change the kind of instrumentation and thus the input to the
>>>>>>> optimizations behind the user’s back. I wouldn’t be surprised that using
>>>>>>> profiles generated by FEPGO and IRPGO give you a final executable with
>>>>>>> measurably different performance characteristics.
>>>>>>>
>>>>>>
>>>>>> I think the point is that given the effort being put into IRPGO, the
>>>>>> IRPGO version will always be a faster final executable.
>>>>>>
>>>>>>
>>>>>> 2 things:
>>>>>>  - Do you have strong evidence of this being true in all cases? I’m
>>>>>> not looking forward to when I’ll transition Apple’s clang to IRPGO, because
>>>>>> I know that we’ll have to spend time characterizing the performance of the
>>>>>> compiler. It might be faster in all but a couple benchmarks, we still will
>>>>>> need to investigate what happened there to see if we can fix it. I’m not
>>>>>> saying it’s bad (I’m actually also looking forward to evaluate it, because
>>>>>> it might give better results for us), but it’s definitely not a benign
>>>>>> change that I would do without double checking the results. And as such, I
>>>>>> also wouldn’t want the compiler to silently do this change behind my back.
>>>>>>  - Even if it is 100% true, then users will see a degradation in
>>>>>> performance if they add -fcoverage-mapping to their CFLAGS. I think this is
>>>>>> just bad user experience. The user should be able to decide if he wants to
>>>>>> be able to compromise some performance to be able to do PGO+coverage at the
>>>>>> same time.
>>>>>>
>>>>>> Why provide a "worse" PGO option?
>>>>>>
>>>>>>
>>>>>> Worse from your point of view. The resilience to compiler changes
>>>>>> that FEPGO currently offers is a real feature even if it doesn’t have a
>>>>>> place in Sony’s workflow.
>>>>>>
>>>>>>
>>>>>>
>>>>>>> If you’re tracking your performance, this can be really painful.
>>>>>>> Recently we wasted days investigating performance regressions that were due
>>>>>>> to buggy profiles. I strongly believe having an option seemingly unrelated
>>>>>>> to PGO change this behavior is wrong and can cause actual pain for our end
>>>>>>> users.
>>>>>>>
>>>>>>
>>>>>> After a deprecation period we can force `-fprofile-instr-generate`
>>>>>> and `-fcoverage-mapping` to be mutually exclusive if necessary. Does this
>>>>>> solve your problem?
>>>>>>
>>>>>>
>>>>>> We don’t want a “deprecation period”. Unless IRPGO evolves in
>>>>>> something that can do precise coverage and PGO in the same run, FEPGO can’t
>>>>>> be deprecated (as in slated for removal).
>>>>>>
>>>>>> Actually, I think it makes a lot of sense in some respects for
>>>>>> `-fprofile-instr-generate` and `-fprofile-instr-generate
>>>>>> -fcoverage-mapping` to be IRPGO and FEPGO/coverage. The difference from a
>>>>>> user's perspective is basically "is the instrumentation inserted by the
>>>>>> compiler constrained to have source-level coverage, or does the compiler
>>>>>> not have this restriction". Although as I've said, I'm not a fan of
>>>>>> supporting FEPGO in the long-term due to maintenance issues.
>>>>>>
>>>>>> Also note that things like the context-sensitivity obtained through
>>>>>> pre-inlining (see Rong's original RFC) is simply not obtainable within a
>>>>>> source-level instrumentation paradigm (even if we did something like the
>>>>>> counter fusion discussed in "[llvm-dev] RFC: Pass to prune redundant
>>>>>> profiling instrumentation" to reduce the overhead to that of IRPGO with
>>>>>> pre-inlining). Thus FEPGO a.k.a. "coverage-level PGO" would nonetheless be
>>>>>> at an inherent disadvantage.
>>>>>>
>>>>>>
>>>>>>> Also, David's point about redundant work on FEPGO is a good one. We
>>>>>>> don't want to continue maintaining two different PGO’s.
>>>>>>>
>>>>>>>
>>>>>>> Are you implying that LLVM should drop FEPGO? It’s a totally
>>>>>>> sensible thing to do to use your tests as training data for your profile
>>>>>>> generation. It’s also a very valid thing to do to use your tests to do
>>>>>>> coverage. Xcode does both of these things. I would see it a a big
>>>>>>> regression to not support doing both at the same time (this would mean
>>>>>>> doubling compile+testing time for users of both).
>>>>>>>
>>>>>>
>>>>>> As David pointed out, training runs for PGO and coverage have
>>>>>> different goals. I'm very skeptical of any testing that tries to do both at
>>>>>> the same time, but this will continue to work (albeit without benefitting
>>>>>> from any of the effort being put into IRPGO).
>>>>>>
>>>>>>
>>>>>> For a high-end game, sure. Small apps are a different beast and it’s
>>>>>> a reasonable thing to use your set of tests as training data. Again, all
>>>>>> the workflows are not equivalent.
>>>>>>
>>>>>> As the instrumentation needs to stay there for coverage anyway, I
>>>>>>> expect FEPGO to stay there and maintained (we care a lot about coverage).
>>>>>>> I’m not saying that all the work going into IRPGO should be duplicated in
>>>>>>> FEPGO, but what’s there and working should keep working.
>>>>>>>
>>>>>>
>>>>>> For my users the reduced overhead of IRPGO is an important feature,
>>>>>> and making it the default is important for that reason. Since most of the
>>>>>> effort going into PGO is focused on IRPGO, this will lead to users using
>>>>>> FEPGO ending up as a second-tier PGO, which Vedant said he specifically
>>>>>> wanted to avoid. The only option to avoid this is for users to not be using
>>>>>> FEPGO.
>>>>>>
>>>>>>
>>>>>> I’d love to have everybody be able to enjoy the lower overhead, but
>>>>>> if the answer to this is that they need to compile+train twice to get
>>>>>> coverage and PGO, then the trade-off is wrong. I suppose you can agree that
>>>>>> every user doesn’t have the same requirements as game developers. At least
>>>>>> users should have the choice.
>>>>>>
>>>>>> Also, FEPGO has a lot of nice characteristics like resilience to
>>>>>>> IRGEN changes. If you have archived profiles, then when you switch
>>>>>>> compilers your performance shouldn’t degrade with FEPGO (modulo
>>>>>>> optimization bugs), while it’s much more likely to degrade with IRPGO.
>>>>>>>
>>>>>>
>>>>>> Note that this use case continues to work. I.e. we continue to apply
>>>>>> existing frontend profiles correctly. including frontend profiles generated
>>>>>> with -fcoverage-mapping, so that collecting coverage and PGO at the same
>>>>>> time (although not advisable) still works. The only use case that breaks is
>>>>>> merging existing indexed profiles, which is why we are specifically waiting
>>>>>> for an answer on whether this is a requirement for you guys at Apple, which
>>>>>> will determine what kind of deprecation period etc. will be needed before
>>>>>> we can default it.
>>>>>>
>>>>>>
>>>>>>> It overall looks like a much better option for people who do not
>>>>>>> need the lower instrumentation overhead.
>>>>>>>
>>>>>>
>>>>>> This is not just about lower instrumentation overhead. Things like
>>>>>> the recently added static VP node allocation (which will e.g. make indirect
>>>>>> callsite promotion for LTO'd kernels work) are other things are being
>>>>>> missed out on.
>>>>>>
>>>>>>
>>>>>>
>>>>>>> I would actually make the IRPGO mode completely incompatible with
>>>>>>>> the -fcoverage-mapping flag.
>>>>>>>>
>>>>>>>
>>>>>>> I'm not sure what you mean by this. Nobody is proposing anything
>>>>>>> that would make -fcoverage-mapping do anything related to IRPGO.
>>>>>>>
>>>>>>>
>>>>>>> What I mean is that -f<whatever enables IRPGO> should error out when
>>>>>>> passed at the same time as -fcoverage-mapping.
>>>>>>>
>>>>>>
>>>>>> I think you're coming into this with the mindset that FEPGO will
>>>>>> still be a possibility (outside of a build that is used for coverage
>>>>>> mapping). I'm not convinced that we actually need to continue exposing that
>>>>>> except as a weird thing in conjunction with coverage (and possibly for a
>>>>>> deprecation period if users want to merge indexed profiles).
>>>>>>
>>>>>>
>>>>>> FEPGO needs to be a possibility. And the instrumentation code needs
>>>>>> to be there anyway for the coverage. Here’s a proposal:
>>>>>>  - Add -fpgo-instr=[llvm|clang] to choose which kind of PGO
>>>>>> instrumentation you want
>>>>>>  - make -fpgo-instr=llvm incompatible with with -fcoverage-mapping
>>>>>>  - deprecate -fprofile-instr-generate, and remove it after a couple
>>>>>> releases
>>>>>>
>>>>>> (I really don’t care about the names)
>>>>>>
>>>>>> I think it is a better way forward to have -fprofile-instr-generate
>>>>>> keep it’s current meaning and to have downstream toolchain providers have
>>>>>> an internal patch to make it an alias to the one they prefer. Just because
>>>>>> changing the meaning of options is a bad thing.
>>>>>>
>>>>>
>>>>> Sure, I can deal with that. And actually this discussion has suggested
>>>>> some good user-understandable names for the two different PGO's. I think
>>>>> that "stable PGO" or "coverage-based PGO" or "source-level PGO" is a really
>>>>> good name for FEPGO. "frontend" and "IR" aren't really meaningful to users.
>>>>>
>>>>> Based on what I have seen, FEPGO has the following reasons to keep it
>>>>> around:
>>>>> - it currently can be used together with coverage
>>>>> - it has certain compatibility guarantees about profiles across
>>>>> compiler versions
>>>>>
>>>>>
>>>>>
>>>>>> This also means that if the consensus is that
>>>>>> -fprofile-instr-generate should really change its meaning to mean IRPGO,
>>>>>> I’m open to having this internal patch on our side.
>>>>>>
>>>>>
>>>>> Yeah, it sounds like someone is going to have to keep a "private
>>>>> patch" to change the default. At that point doing a switch on the triple in
>>>>> upstream seems preferable though :/
>>>>>
>>>>>
>>>>> So I propose the following (which is equivalent to what you proposed,
>>>>> but starting to put specific option names):
>>>>> 1. Add -fprofile-instr-generate=stable and
>>>>> -fprofile-instr-generate=unstable
>>>>> a) Indexed profiles for -fprofile-instr-generate=stable are guaranteed
>>>>> to be compatible across compiler releases (which will include existing
>>>>> releases). Additionally, -fprofile-instr-generate=stable can be used in
>>>>> combination with -fcoverage-mapping.
>>>>> b) Indexed profiles for -fprofile-instr-generate=unstable are not
>>>>> guaranteed to be compatible across compiler releases. It is an error to use
>>>>> this in conjunction with -fcoverage-mapping
>>>>> c) -fprofile-instr-generate defaults to one of the two in a way that
>>>>> meets vendor needs, via private patches or an upstream switch on the triple
>>>>> or whatever. It can eventually be deprecated or whatever.
>>>>>
>>>>> -fprofile-instr-generate=stable would basically correspond to FEPGO /
>>>>> FE coverage instrumentation. -fprofile-instr-generate=unstable would
>>>>> basically correspond to IRPGO.
>>>>>
>>>>> What do you think?
>>>>>
>>>>
>>>>
>>>> Sounds fine to me, though I am not a fan of using unstable in the
>>>> option. I think a more meaningful way (that capture the essence of the
>>>> difference) is the following naming:
>>>> 1) FEPGO:  -fprofile-instr-generate=source or
>>>> -fprofile-instr-generate=region
>>>> 2) IR: -fprofile-instr-generate=cfg or -fprofile-instr-generate=graph
>>>>
>>>> Also since -fprofile-instr-generate= form is already used to specify
>>>> raw profile path, we may need a different driver option.
>>>>
>>>
>>> Oops!
>>>
>>>
>>>> Alternatives include
>>>> 1) -fprofile-instrument=<...> -- this maps directly to the cc1 option
>>>> we have today
>>>> or
>>>> 2) -fpgo-instr=<> -- suggested by Fred or
>>>> 3) -fpgo-instr-method=<...>
>>>>
>>>
>>> I will probably use one of these in the patch.
>>>
>>> -- Sean Silva
>>>
>>>
>>>>
>>>>
>>>> Having the driver level option is the first good step forward.  In the
>>>> near future, when performance of IRPGO is further tuned (e.g, better
>>>> integration with inliner, more runtime perf win with -flto=thin, -flto
>>>> etc), and the interests of the IRPGO is better aligned, we can revisit the
>>>> default.
>>>>
>>>> thanks,
>>>>
>>>> David
>>>>
>>>>
>>>>
>>>>
>>>>>
>>>>> -- Sean Silva
>>>>>
>>>>>
>>>>>>
>>>>>> I also think a new option is cleaner (again I don’t care about the
>>>>>> name), but if people feel that -fprofile-instr-generate=[llvm|clang] is
>>>>>> easier, then this works for me too.
>>>>>>
>>>>>> Fred
>>>>>>
>>>>>>
>>>>>> -- Sean Silva
>>>>>>
>>>>>>
>>>>>>>
>>>>>>> Thanks!
>>>>>>> Fred
>>>>>>>
>>>>>>> -- Sean Silva
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>> Fred
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>>
>>>>>>>>> At its core I don't think -fprofile-instr-generate *implies*
>>>>>>>>> FE-based instrumentation. So, I'd like to see the driver do this (on all
>>>>>>>>> platforms):
>>>>>>>>>
>>>>>>>>>   * -fprofile-instr-generate: IR instrumentation
>>>>>>>>>   * -fprofile-instr-generate=IR: IR instrumentation
>>>>>>>>>   * -fprofile-instr-generate=FE: FE instrumentation
>>>>>>>>>   * -fprofile-instr-generate -fcoverage-mapping: FE + coverage
>>>>>>>>> instrumentation
>>>>>>>>>
>>>>>>>>> It's a bit ugly because the meaning of -fprofile-instr-generate
>>>>>>>>> becomes context-sensitive. But, (1) it doesn't break existing common
>>>>>>>>> workflows and (2) it makes it easier to ship IRPGO. The big caveat here is
>>>>>>>>> that we'll need to wait a bit and see if our internal users are OK with
>>>>>>>>> this.
>>>>>>>>>
>>>>>>>>
>>>>>>>> Is there a reason to even have the possibility for FEPGO in the
>>>>>>>> long run? From what I can tell, at most we would add a
>>>>>>>> -fuse-the-old-pgo-because-i-want-to-merge-with-old-profiles option
>>>>>>>> to hold people over until they can regenerate their profiles with the
>>>>>>>> current compiler. We can add a flag to control what pre-instrumentation is
>>>>>>>> done to retain the source-level robustness of FEPGO (e.g.
>>>>>>>> -fpgo-no-simplify-before-instrumenting or something).
>>>>>>>>
>>>>>>>>
>>>>>>>>> One alternative is to introduce a separate driver flag for IRPGO.
>>>>>>>>> This might not work well for Sony's existing users. I'd be interested in
>>>>>>>>> any feedback about this approach.
>>>>>>>>>
>>>>>>>>
>>>>>>>> Personally, I would prefer to maintaining command line
>>>>>>>> compatibility for PGO in Clang (i.e. users don't have to modify their build
>>>>>>>> systems).
>>>>>>>>
>>>>>>>>
>>>>>>>> -- Sean Silva
>>>>>>>>
>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> > I really don't like fragmenting things like this (e.g. if a
>>>>>>>>> third-party tests "clang's" PGO they will get something different depending
>>>>>>>>> on the platform), but I don't see another way given Apple's constraints.
>>>>>>>>> >
>>>>>>>>> > I'd like to see IRPGO to be the default as well, but the first
>>>>>>>>> thing we need is a driver level option to make the switch (prof-gen) --
>>>>>>>>> currently we rely on -Xclang option to switch between two modes, which is
>>>>>>>>> less than ideal.
>>>>>>>>> >
>>>>>>>>> > If the concern from Apple is that the old profile still need to
>>>>>>>>> work, then this is problem already solved. The reason is that
>>>>>>>>> -fprofile-instr-use can automatically detect the type of the profile and
>>>>>>>>> switch the mode.
>>>>>>>>>
>>>>>>>>> It's not just that. As Sean pointed out, we're concerned about old
>>>>>>>>> profiles inter-operating poorly with new ones.
>>>>>>>>>
>>>>>>>>> thanks,
>>>>>>>>> vedant
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> > - Pre-instrumentation passes
>>>>>>>>> >
>>>>>>>>> > Pre-instrumentation optimization has been critical for reducing
>>>>>>>>> the overhead of PGO for the PS4 games we tested (as expected). However, in
>>>>>>>>> our measurements (and we are glad to provide more info) the main benefit
>>>>>>>>> was inlining (also as expected). A simple pass of inlining at threshold 100
>>>>>>>>> appeared to give all the benefits. Even inlining at threshold 0 gave almost
>>>>>>>>> all the benefits. For example, the passes initially proposed in
>>>>>>>>> http://reviews.llvm.org/D15828did not improve over just inlining
>>>>>>>>> with threshold 100.
>>>>>>>>> >
>>>>>>>>> > (due to PR27299 we also need to add simplifycfg after inlining
>>>>>>>>> to clean up, but this doesn't affect the instrumentation overhead in our
>>>>>>>>> measurements)
>>>>>>>>> >
>>>>>>>>> > Bottom line: for our use cases, inlining does all the work, but
>>>>>>>>> we're not opposed to having more passes, which might be beneficial for
>>>>>>>>> non-game workloads (which is most code).
>>>>>>>>> >
>>>>>>>>> >
>>>>>>>>> >
>>>>>>>>> > Yes, Rong is re-collecting performance data before submitting
>>>>>>>>> the patch.
>>>>>>>>> >
>>>>>>>>> > - Warnings
>>>>>>>>> >
>>>>>>>>> > We identified 3 classes of issues which manifest as spammy
>>>>>>>>> warnings when applying profile data with IRPGO (these affect FEPGO also I
>>>>>>>>> believe, but we looked in depth at IRPGO):
>>>>>>>>> >
>>>>>>>>> > 1. The main concerning one is that getPGOFuncName mangles the
>>>>>>>>> filename into the counter name. This causes us to get
>>>>>>>>> instrprof_error::unknown_function when the pgo-use build is done in a
>>>>>>>>> different build directory from the training build (which is a reasonable
>>>>>>>>> thing to support). In this situation, PGO data is useless for all `static`
>>>>>>>>> functions (and as a byproduct results in a huge volume of warnings).
>>>>>>>>> >
>>>>>>>>> > This can be enhanced with an user option to override the
>>>>>>>>> behavior. Can you help filing a tracking bug?
>>>>>>>>> >
>>>>>>>>> >
>>>>>>>>> > 2. In different TU's, pre-instr inlining might make different
>>>>>>>>> inlining decisions (for example, different functions may be available for
>>>>>>>>> inlining), causing hash mismatch errors (instrprof_error::hash_mismatch).
>>>>>>>>> In building a large game, we only saw 8 instance of this, so it is not as
>>>>>>>>> severe as 1, but would be good to fix.
>>>>>>>>> >
>>>>>>>>> >
>>>>>>>>> > Rong has a patch addressing that -- will submit after cleanup
>>>>>>>>> pass change is done.
>>>>>>>>> >
>>>>>>>>> >
>>>>>>>>> > 3. A .cpp file may be compiled and put into an archive, but then
>>>>>>>>> not selected by the linker and will therefore not result in a counter in
>>>>>>>>> the profraw. When compiling this file with pgo-use,
>>>>>>>>> instrprof_error::unknown_function will result and a warning will be emitted.
>>>>>>>>> >
>>>>>>>>> > yes -- this is a common problem to other compilers as well.
>>>>>>>>> >
>>>>>>>>> >
>>>>>>>>> >
>>>>>>>>> > Case 1 can be fixed using a function hash or other unique
>>>>>>>>> identifier instead of a file path. David, in D20195 you mentioned that Rong
>>>>>>>>> was working on a patch that would fix 2; we are looking forward to that.
>>>>>>>>> >
>>>>>>>>> >
>>>>>>>>> > Right.
>>>>>>>>> >
>>>>>>>>> > For 3, I unfortunately do not know of any solution. I don't
>>>>>>>>> think there is a way for us to make this warning reliable in the face of
>>>>>>>>> this circumstance. So my conclusion is that
>>>>>>>>> instrprof_error::unknown_function at least must be defaulted to off
>>>>>>>>> unfortunately.
>>>>>>>>> >
>>>>>>>>> > yes, this can be annoying. If the warnings can be buffered, then
>>>>>>>>> the compiler can check if this is due to missing profile for the whole file
>>>>>>>>> and can reduce the warnings into one single warning (source file has no
>>>>>>>>> profile data).  Making it off by default sounds fine to me too if it is too
>>>>>>>>> noisy.
>>>>>>>>> >
>>>>>>>>> > thanks,
>>>>>>>>> >
>>>>>>>>> > David
>>>>>>>>> >
>>>>>>>>> >
>>>>>>>>> > -- Sean Silva
>>>>>>>>>
>>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20160623/d0aabec5/attachment-0001.html>


More information about the llvm-dev mailing list