[cfe-dev] RFC: Clang-Misexpect Proposal

Paul Kirth via cfe-dev cfe-dev at lists.llvm.org
Fri Aug 16 11:01:23 PDT 2019


On Thu, Aug 15, 2019 at 5:59 PM Vedant Kumar <vedant_kumar at apple.com> wrote:

>
>
> On Jul 24, 2019, at 7:00 PM, Paul Kirth via cfe-dev <
> cfe-dev at lists.llvm.org> wrote:
>
>
>
> On Wed, Jul 24, 2019 at 5:42 PM Roman Lebedev <lebedev.ri at gmail.com>
> wrote:
>
>> On Thu, Jul 25, 2019 at 3:24 AM Paul Kirth <paulkirth at google.com> wrote:
>> >
>> >
>> >
>> > On Wed, Jul 24, 2019 at 3:14 PM Roman Lebedev <lebedev.ri at gmail.com>
>> wrote:
>> >>
>> >> On Thu, Jul 25, 2019 at 1:06 AM Paul Kirth via cfe-dev
>> >> <cfe-dev at lists.llvm.org> wrote:
>> >> >
>> >> >
>> >> >
>> >> > On Wed, Jul 24, 2019 at 6:31 AM Michael Kruse <cfe-dev at meinersbur.de>
>> wrote:
>> >> >>
>> >> >> Sounds like a useful tool.
>> >> >>
>> >> >
>> >> > Thanks! I certainly hope it will be.
>> >> >
>> >> >>
>> >> >> Could it be enough for clang to emit warnings in PGO mode when it
>> >> >> detects a mismatch between collected profile data and
>> >> >> __builtin_expect() ?
>> >> >>
>> >> >
>> >> > You are right: it might be sufficient to simply gate these warnings
>> behind a flag in the PGO pipeline. I think there is a good argument for
>> doing it this way. In fact, it's how my prototype works right now, so we
>> have certainly considered this option.
>> >> >
>> >> > However, I think it will still be desirable to have a standalone
>> tool that doesn't require the full compilation pipeline. My understanding
>> was that clang-tools are often aimed at this exact use-case: i.e. a custom
>> driver and partial compilation pipeline.
>> >> >
>> >> > Another important point is that compiler warnings are, by and large,
>> accurate. Our approach to identifying these mismatches is subject to some
>> imprecision due to a deficiency in the collected profile.
>> >> Can those be simply reported as normal Remarks?
>> >> Those can later be visualized over the codebase via opt-viewer.
>> >>
>> >
>> > Could you clarify the context here for me? I'm not sure if you mean:
>> >
>> > 1. using remarks in place of warnings in the PGO pipeline, but not a
>> standalone tool.
>> > 2. using remarks in place of warnings in both a standalone tool and in
>> the PGO pipeline.
>> > 3. using remarks in place of warnings in the PGO pipeline, with no
>> standalone tool (so only a compiler flag).
>> >
>> > If we're only talking about case 1, then I think that could be a good
>> compromise w.r.t using warnings in the compiler.
>> >
>> > I'm not sure I see an advantage to using remarks in a standalone
>> tool(scenario 2), or as the sole method of reporting(scenario 3). I'm not
>> even sure if that was what you meant, but I'm happy to hear more about the
>> approach if I have missed something.
>>
>> I think i'm failing to catch the subtle difference between 1. and 3.
>> I was indeed suggesting to *not* introduce a separate tool,
>> just emit the Remark (when some specific new diag group is not disabled)
>> when the PGO profile is specified/being used by clang.
>>
>>
> Sorry for the lack of clarity.
>
> In 1, in addition to the new flag in clang you still would have a
> standalone tool that emits warnings -- which I think is OK since it isn't
> the compiler itself and has a narrow, well defined scope.
>
> In 3, it's just as you describe: only an extension to the compiler based
> on Remarks, without any standalone tool.
>
> In my view, using Remarks here doesn't feel like the right tradeoff. They
> are tremendously useful for compiler developers, but I don't think the
> average user of clang is familiar with them, or should be.
>
>
> As far as I'm aware, opt-remarks are actually intended for a general
> audience, not just compiler developers. I've cc'd some folks who've
> explored various ways to surface remarks / make them user-friendly. I
> actually think they're pretty user-friendly already -- e.g. there's support
> in tree for surfacing opt-remarks up via clang's DiagnosticEngine (i.e.
> they can look like warnings, get serialized, show up in an IDE, etc).
>
>
>
Thanks for the feedback and looping in some others into the discussion.
I've definitely been exploring the Remarks infrastructure as the discussion
has developed.


>
> My exposure to Remarks has been exclusively as a diagnostic tool to
> understand the reasoning behind which optimizations have occurred or to
> debug something within the compiler. Maybe this is an aspect of the
> toolchain I'm unfamiliar with, but I can't think of another time we
> interface with developer using Remarks rather than directly giving them
> feedback.
>
> I also worry that if this information is only available in Remarks, then
> it will never be used, since it is kind of hidden. On the other hand,
> having a standalone tool fits well with existing workflows, and makes it
> easy for developers to try out.
>
>
> If memory serves, remarks can be surfaced to the user by passing an extra
> compiler flag. That seems even easier than using a separate tool?
>
> Also, istm that a new standalone tool will start off 'hidden' too.
>
>
Fair point.


>
> Other than my misgivings about issuing warnings based on a heuristic, is
> there any advantage to using Remarks over warnings?
>
>
> I think so. Using remarks can expose users to other important performance
> hints from the compiler, beyond what is/can-be surfaced by a standalone
> misexpect tool.
>
>
That's true, but I still think there is a benefit to a standalone tool that
you can just run over a codebase through the compiler commands database. It
also has the benefit of not requiring any changes to the build other than a
way to generate a representative profile. I think that provides users an
easy way to check many configurations with only minimal effort. The richer
compiler infrastructure seems to be more well suited to deeper
investigations.


> That's why it seems to me that the remarks infrastructure is the perfect
> fit for the hints you are describing.
>

>

Maybe we can have both remarks and an optional warning? I agree that there
is a benefit to exposing this diagnostic as part of the remarks
infrastructure, and that it is a good fit for misexpect in many ways. I
think it would still be nice to give users a way to also turn this into a
warning.


> best,
> vedant
>
>
>
>> >>
>> >> > When we issue a warning in clang-misexpect, it means one of two
>> things: (i) your annotation is incorrect, or (ii) your profile data is
>> insufficient. While it at least signals to the developer that investigation
>> is warranted, it is a somewhat unsatisfactory feedback. I think our best
>> effort warnings are a better fit for a clang-tool because of that. To me,
>> warnings in the main compiler have a feeling of more significance and a
>> higher burden of correctness. For something less precise or not generally
>> applicable, a clang-tool appears to be the right place to implement this.
>> This seems to be consistent with existing tools, like clang-tidy, that
>> enable more warnings/diagnostics than are available in the main compiler.
>> >> >
>> >> > That being said, we're happy to take feedback from the community. So
>> if the general consensus is that this should be part of the PGO pipeline,
>> then we can rescope our proposal to only add a new flag to clang.
>> >> >
>> >> > Or maybe the right thing to do is to have both? I'm not exactly
>> sure, which is one of the reasons we made this RFC.
>> >> >
>> >> >>
>> >> >> Michael
>> >> >>
>> >> >>
>> >> >> Am Di., 23. Juli 2019 um 22:06 Uhr schrieb Paul Kirth via cfe-dev
>> >> >> <cfe-dev at lists.llvm.org>:
>> >> >> >
>> >> >> > We would like to propose adding a new clang based tool,
>> clang-mispredict, for identifying potentially incorrect uses of
>> __builtin_expect().
>> >> >> >
>> >> >> > --- TLDR ---
>> >> >> > What are we proposing?
>> >> >> > * Adding a new tool, clang-mispredict, to the LLVM project.
>> >> >> >
>> >> >> > What is its purpose?
>> >> >> > * Identifying potentially problematic uses of __builtin_expect().
>> >> >> >
>> >> >> > Why is this important?
>> >> >> > * __builtin_expect() has a high performance cost when it is used
>> incorrectly.
>> >> >> >
>> >> >> > Where will the code live?
>> >> >> > * clang-mispredict will live in clang-tools-extra, and there will
>> be some additions to clang.
>> >> >> >
>> >> >> > Open questions
>> >> >> > * What is the correct heuristic for determining a mismatch?
>> >> >> >
>> >> >> > Potential shortcomings
>> >> >> > * The approach is sensitive to both profiling input and
>> configuration.
>> >> >> >
>> >> >> >
>> >> >> > --- Details ---
>> >> >> >
>> >> >> > We plant to build a new tool, clang-mispredict, on top of Clang’s
>> PGO infrastructure. This tool will issue warnings about the use of
>> _builtin_expect() when collected PGO profile data shows a mismatch with the
>> use of __builtin_expect(). Existing solutions for validating these
>> annotations generally follow this approach: leverage dynamic profiling to
>> validate existing annotations. This is the approach used in the Linux
>> kernel, for example. Unfortunately, existing solutions seem to be custom
>> efforts specific to a particular code base. Supporting this in the LLVM
>> ecosystem gives developers a general way to check the use of
>> _builtin_expect() without creating custom instrumentation.
>> >> >> >
>> >> >> > Additionally, if _builtin_expect() is used incorrectly, then
>> developers may notice a performance regression when switching to the new
>> PM. This happens because the New PM follows performance annotations more
>> closely than the legacy PM, and is therefore more likely to try and
>> aggressively optimize these cases. Our proposed tool can help developers
>> narrow their focus to potentially problematic areas during the transition.
>> >> >> >
>> >> >> > Finally, it may also be beneficial to suggest annotations when
>> profiling suggests it would be beneficial. We consider this a desirable
>> property not only from a performance standpoint, but also for identifying
>> potential bugs (both in the profiling corpus and within the codebase).
>> >> >> >
>> >> >> > We think having clang-mispredict behave similar to clang-tidy is
>> good approach. We can use a compile commands database in conjunction with
>> profiling data to drive our validation tool. More concretely, we plan to
>> use Clang’s existing PGO infrastructure to emit warnings where branch
>> weights are assigned, but stop compilation before the IR is passed to the
>> backend. This largely amounts to checking if the profile counters are
>> outside of certain thresholds for 'hot' and 'cold' code. Similar to clang
>> tidy, we plan to support suppressing warnings within the codebase through
>> use of comment strings.
>> >> >> >
>> >> >> > The most obvious question about this approach is how to quantify
>> what is ‘hot’ or ‘cold’. Initially, we plan to use the thresholds already
>> present in LLVM; however, we are happy to use any suitable heuristic or
>> empirically derived threshold. Input from the community about the correct
>> heuristic are most welcome.
>> >> >> >
>> >> >> > Our strategy does have some shortcomings. The usefulness of the
>> warnings are directly related to how representative the profiling input
>> actually was when compared to its normal use. If the profile collected is
>> not representative of typical use, then the warnings may not reflect the
>> ground truth. Program and build configuration can also dramatically change
>> which paths will be executed, thus affecting the quality of the profile
>> data and the generated warnings.
>> >> >> >
>> >> >> > Ultimately, we are proposing adding a new standalone tool that
>> takes a compile commands database and an instrumentation profile to emit
>> warnings in a clang-tidy fashion.
>> >> >> >
>> >> >> > Please let us know if you have comments or concerns about this
>> proposal.
>> >> >> >
>> >> >> > Thanks!
>> >> >> >
>> >> >> > --
>> >> >> > Paul Kirth
>> >> >> > _______________________________________________
>> >> >> > cfe-dev mailing list
>> >> >> > cfe-dev at lists.llvm.org
>> >> >> > https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
>> >> >
>> >> >
>> >> >
>> >> > --
>> >> > Paul Kirth
>> >> > _______________________________________________
>> >> > cfe-dev mailing list
>> >> > cfe-dev at lists.llvm.org
>> >> > https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
>> >>
>> >> Roman
>> >
>> >
>> >
>> > --
>> > Paul Kirth
>>
>> Roman
>>
>
>
> --
> Paul Kirth
> _______________________________________________
> cfe-dev mailing list
> cfe-dev at lists.llvm.org
> https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
>
>
>

-- 
Paul Kirth
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-dev/attachments/20190816/ba2543ed/attachment.html>


More information about the cfe-dev mailing list