[llvm-dev] New pass manager for optimization pipeline status and questions

Alina Sbirlea via llvm-dev llvm-dev at lists.llvm.org
Tue Jul 28 10:23:30 PDT 2020


On Fri, Jul 24, 2020 at 12:54 PM Sjoerd Meijer <Sjoerd.Meijer at arm.com>
wrote:

> Hi Alina,
>
> I think this is an excellent direction, this is the direction we should
> take here.  Just a somewhat irrelevant disagreement on this though:
>
> > Philip's point is spot on that we are deficient now in the testing of
> the LegacyPassManager,
>
> I disagree because the LPM is still the default and I appreciated Hans'
> reply: "Defaults tend to be popular". But this is the direction I like:
>
> > This means prioritizing those blockers over other LLVM work. The current
> umbrella bug is PR46649 <https://bugs.llvm.org/show_bug.cgi?id=46649>.
>
> Just checking: do you accept both performance and code-size regressions as
> blockers here?
>

Yes, I think big performance and code-size regressions need to be
investigated. It will be hard to quantify what "big" is though, but we
should definitely track such regressions.
Due to the time-boxed approach we may need to discuss moving forward with
the switch with some regressions deemed acceptable, but those considered
blockers should be prioritized of course.


>
> > My point is that we want and should work with users to make the
> transition smooth, but we do very much need user (meaning companies using
> LLVM) involvement here in order to not delay the switch further.
>
> That's clear, and agreed.
> I would like to remark here that currently, when a commit regresses one
> benchmark that is important for someone, that is enough justification most
> of the time for a revert of that commit. That's why I surprised that it
> looked like we were not setting code-quality goals and requirements before
> switching. And what I would like to ask here is to provide reasonable
> enough time for people to look into switching to the NPM, to evaluate this,
> and then file bugs under PR46649. Just collecting data, evaluating
> problems, filings bugs can already time-consuming, and then I guess they
> need fixing too. This also needs to fit in people's plans right now.
>
> But it sounds reasonable to me that this is time-boxed. Given that
> switching is quite some work I think, switching before the clang-12 release
> would be unreasonable, and if clang-13 is in half a year from now, that
> already sounds perhaps somewhat reasonable, but might be tight.
>

I think it's reasonable to add a firm switch before clang-13 and before the
end of this year, with intermediary milestones (e.g. filing blockers for
user regressions in the next 1-2 months).
I'm inclined to favor a tighter deadline, the motivation here being to
ensure that working on potential blockers is prioritized with plenty of
time to spare, so the switch remains time-boxed.

Best,
Alina


> Thanks,
> Sjoerd.
>
> ------------------------------
> *From:* Alina Sbirlea <asbirlea at google.com>
> *Sent:* 24 July 2020 19:51
> *To:* Sjoerd Meijer <Sjoerd.Meijer at arm.com>
> *Cc:* Philip Reames <listmail at philipreames.com>; Chandler Carruth <
> chandlerc at gmail.com>; Eric Christopher <echristo at gmail.com>; llvm-dev <
> llvm-dev at lists.llvm.org>
> *Subject:* Re: [llvm-dev] New pass manager for optimization pipeline
> status and questions
>
> Hi all,
>
> The current plan is to prioritize enabling the NPM as soon as possible,
> and that includes addressing any blockers that are known or arise. This
> means prioritizing those blockers over other LLVM work. The current
> umbrella bug is PR46649 <https://bugs.llvm.org/show_bug.cgi?id=46649>.
>
> Philip's point is spot on that we are deficient now in the testing of the
> LegacyPassManager, because so many have already made the switch (FWIW
> Google switched more than 2 years ago).
>
> It's not constructive for the LLVM community to just flip the switch and
> break current LPM users. The purpose of these communications to llvm-dev
> and the bug tracking is to be informative as to the planned direction and
> make as quick of a progress as possible.
> Please keep in mind that the work on the NPM has been going on for many
> years and many customers have switched years ago, and delaying this for
> even an additional year is not acceptable for the code health and stability
> of LLVM.
>
> My point is that we want and should work with users to make the transition
> smooth, but we do very much need user (meaning companies using LLVM)
> involvement here in order to not delay the switch further.
>
> Best,
> Alina
>
>
> On Thu, Jul 23, 2020 at 2:59 AM Sjoerd Meijer <Sjoerd.Meijer at arm.com>
> wrote:
>
> I am not in favour of just flipping the switch and then deal with all the
> fall-out, because we see major regressions that would be unacceptable for
> our users. Thus, not only would this be very disruptive, also our releases
> are based on a certain trunk versions, so we would need to revert back to
> the legacy PM downstream and thus diverge from upstream which wouldn't be
> ideal for us. About the regressions, see the message/thread that I kicked
> off earlier (
> http://lists.llvm.org/pipermail/llvm-dev/2020-July/143646.html) which was
> quickly followed up by this thread.
>
> I would like to see here if we are interesting in defining a few criteria
> that must be met before we switch:
>
>    1. Correctness, which obviously always must come first: looks like
>    this is covered by bots that are running with the NPM, and by downstream
>    users. From the latest messages, I am getting we are there, or nearly there.
>    2. Performance (i.e. optimising for speed),
>    3. Code-size.
>
> With 1) correctness box covered and ticked, is now the time to look at
> codegen quality: 2) performance and 3) code-size? Would it be reasonable
> that we create a plan or timeline to address this, and thus allow time that
> these issues can be addressed?
>
> We are now ready to start tuning the NPM for code-size. Perhaps we are
> late to the NPM party (but that was a priority and bandwidth issue), but
> perhaps with correctness fixed this is actually the right time. I only ran
> numbers for code-size, and haven't even looked at performance numbers yet,
> which we would also need to do and takes time.
>
> Cheers,
> Sjoerd.
>
>
> ------------------------------
> *From:* llvm-dev <llvm-dev-bounces at lists.llvm.org> on behalf of Eric
> Christopher via llvm-dev <llvm-dev at lists.llvm.org>
> *Sent:* 23 July 2020 02:05
> *To:* Philip Reames <listmail at philipreames.com>; Alina Sbirlea <
> asbirlea at google.com>; Chandler Carruth <chandlerc at gmail.com>
> *Cc:* llvm-dev <llvm-dev at lists.llvm.org>
> *Subject:* Re: [llvm-dev] New pass manager for optimization pipeline
> status and questions
>
> FWIW I'm in favor of this direction while making sure that we keep focus
> on removing the vestiges of the old pass manager for the code health impact
> to the project.
>
> -eric
>
> On Wed, Jul 22, 2020 at 3:15 PM Philip Reames via llvm-dev <
> llvm-dev at lists.llvm.org> wrote:
>
> (I'm probably going to derail your thread, sorry about that.)
>
> I think at this point, we should just bite the bullet and make the switch
> to NPM by default for Clang's optimization pipeline.  Today.
>
> Why?  Because many of our downstream consumers have already switched.
> Google has.  We (Azul) have.  I think I've heard the same for a couple
> other major contributors.  Why does this matter?  Testing.  At the current
> moment, the vast majority of testing the project gets is exercising NPM,
> not LPM.
>
> NPM is functionally complete for Clang optimization.  There might be a few
> missing cases around the sanitizers, but last I heard those were on the
> edge of being fixed.
>
> I think we should make the switch, and deal with any fall out as
> regressions.  If we made the change immediately after a release branch,
> we'd have several months to address any major issues before the next
> release.
>
> Philip
> On 7/22/20 2:39 PM, Arthur Eubanks via llvm-dev wrote:
>
> Hi all,
>
> I wanted to give a quick update on the status of NPM for the IR
> optimization pipeline and ask some questions.
>
> In the past I believe there were thoughts that NPM was basically ready
> because all of check-llvm and check-clang passed when
> -DENABLE_EXPERIMENTAL_NEW_PASS_MANAGER=ON was specified. But that CMake
> flag did not apply to opt and any tests running something like `opt
> -foo-pass -bar-pass` (which is the vast majority of check-llvm tests) were
> still using the legacy PM. The intended way to use NPM was to use the
> -passes flag, e.g. `opt -passes='foo,bar'`.
>
> I've added a -enable-new-pm flag to opt to force running NPM passes even
> when `opt -foo-pass` is used. This is because I didn't want to go through
> every single test and figure out which ones should be using both -foo-pass
> and -passes=foo. Switching on -enable-new-pm currently leads to ~1800
> check-llvm failures. I've documented the failed tests count per directory
> in https://bugs.llvm.org/show_bug.cgi?id=46651 (some have been fixed
> since that was posted).
>
> This has led to real bugs in NPM being discovered and fixed (e.g. some
> optnone issues).
>
> But a large portion of the remaining failures are because codegen-only
> passes haven't been ported to NPM yet. That's fine for the optimization
> pipeline NPM transition since it doesn't affect the optimization pipeline,
> but it does present an issue with the approach of the -enable-new-pm flag
> (which would by default become true alongside the NPM transition). Lots of
> tests are testing codegen-specific passes via opt (e.g. `opt
> -amdgpu-lower-intrinsics`) and they can't use NPM (yet).
>
> I was thinking either we have a way of identifying codegen-only passes and
> revert back to the legacy PM in opt whenever we see one, or we go back to
> considering the originally intended approach of adding an equivalent
> `-passes=` RUN to all tests that should be also running under NPM.
>
> I'm not sure of a nice and clean solution to identify codegen-only passes.
> We could go and update every instance of INITIALIZE_PASS to take another
> parameter indicating if it's codegen-only. Or we could just have a central
> list somewhere where we check if the pass is in some hardcoded list or has
> some prefix (e.g. "x86-").
>
> The approach of adding equivalent `-passes=` RUN lines to all relevant
> tests seems daunting, but not exactly sure how daunting. Maybe it's
> possible to script something and see what fails? We'd still need some way
> to identify codegen-only passes to make sure we don't miss anything, and
> we'd need to distinguish between analyses and normal passes. Also, it would
> slow down test execution since we'd run a lot more tests twice, but maybe
> that's not such a big deal? Maybe it's good to have most tests running
> against the legacy PM even when NPM is on by default?
>
> Thoughts?
>
> This is split off from
> http://lists.llvm.org/pipermail/llvm-dev/2020-July/143395.html.
>
>
>
> _______________________________________________
> LLVM Developers mailing listllvm-dev at lists.llvm.orghttps://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20200728/7808a2c4/attachment-0001.html>


More information about the llvm-dev mailing list