[llvm-dev] [GlobalISel][AArch64] Toward flipping the switch for O0: Please give it a try!
Quentin Colombet via llvm-dev
llvm-dev at lists.llvm.org
Mon Nov 27 09:44:42 PST 2017
Thanks all.
Amara, could you take a look?
> On Nov 20, 2017, at 3:06 AM, Oliver Stannard <oliver.stannard at arm.com> wrote:
>
> Hi Quentin,
>
> I’ve raised:
> https://bugs.llvm.org/show_bug.cgi?id=35359 <https://bugs.llvm.org/show_bug.cgi?id=35359>
> https://bugs.llvm.org/show_bug.cgi?id=35360 <https://bugs.llvm.org/show_bug.cgi?id=35360>
> https://bugs.llvm.org/show_bug.cgi?id=35361 <https://bugs.llvm.org/show_bug.cgi?id=35361>
>
> I also left the test suite running over the weekend in little-endian mode with the __fp16 type disabled, and it didn’t find any bugs there.
>
> Oliver
>
> From: qcolombet at apple.com [mailto:qcolombet at apple.com]
> Sent: 17 November 2017 17:28
> To: Oliver Stannard
> Cc: llvm-dev at lists.llvm.org; nd; Kristof Beyls
> Subject: Re: [llvm-dev] [GlobalISel][AArch64] Toward flipping the switch for O0: Please give it a try!
>
> Hi Oliver,
>
> Thanks for trying this.
> Could you file a different PR for each of the problem you found and reference the umbrella PR: http://llvm.org/PR35347? <http://llvm.org/PR35347?>
>
> Thanks,
> -Quentin
>
>
> On Nov 17, 2017, at 8:17 AM, Oliver Stannard <oliver.stannard at arm.com <mailto:oliver.stannard at arm.com>> wrote:
>
> Hi Quentin,
>
> One more reproducer, this time with small (<64bit) values being passed on the stack:
>
> int foo(int x0, int x1, int x2, int x3, int x4, int x5, int x6, int x7,
> int stack1) {
> return stack1;
> }
>
> int main() {
> int ret = foo(0,1,2,3,4,5,6,7,8);
> printf("%d\n", ret);
> }
>
> Global isel thinks that the incoming value of stack1 is stored in bytes [0,4) above SP, but for big-endian targets this should be in bytes [4,8):
>
> // /work/llvm/build/bin/clang --target=aarch64-arm-none-eabi -march=armv8-a -c callees.cpp -O0 -Wall -std=c++11 -mllvm -global-isel=true -mllvm -global-isel-abort=0 -mbig-endian -o - -S
> _Z3fooiiiiiiiii: // @_Z3fooiiiiiiiii
> // BB#0: // %entry
> sub sp, sp, #48 // =48
> ldr w8, [sp, #48] // <= Should be [sp, #52]
> str w0, [sp, #44]
> str w1, [sp, #40]
> str w2, [sp, #36]
> str w3, [sp, #32]
> str w4, [sp, #28]
> str w5, [sp, #24]
> str w6, [sp, #20]
> str w7, [sp, #16]
> str w8, [sp, #12]
> ldr w0, [sp, #12]
> add sp, sp, #48 // =48
> ret
>
> Oliver
>
> From: Oliver Stannard
> Sent: 17 November 2017 14:57
> To: 'qcolombet at apple.com <mailto:qcolombet at apple.com>'
> Cc: 'llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>'; nd; Kristof Beyls
> Subject: RE: [llvm-dev] [GlobalISel][AArch64] Toward flipping the switch for O0: Please give it a try!
>
> Hi Quentin,
>
> It seems that we also get the calling convention wrong for vector types on big-endian:
> #include <arm_neon.h>
> int32x2_t load_vector(int32x2_t *p) {
> return *p;
> }
>
> Global-isel generates this:
> // armclang --target=aarch64-arm-none-eabi -march=armv8-a -c callees.cpp -O0 -Wall -std=c++11 -mllvm -global-isel=true -mllvm -global-isel-abort=0 -mbig-endian -o - -S
> _Z11load_vectorP11__Int32x2_t: // @_Z11load_vectorP11__Int32x2_t
> // BB#0: // %entry
> sub sp, sp, #16 // =16
> str x0, [sp, #8]
> ldr x0, [sp, #8]
> ld1 { v0.2s }, [x0]
> add sp, sp, #16 // =16
> ret
>
> With global-isel off, there is a rev64 instruction between the ld1 and the add, which fixes up the endianness of the vector.
>
> Oliver
>
> From: Oliver Stannard
> Sent: 17 November 2017 13:32
> To: 'qcolombet at apple.com <mailto:qcolombet at apple.com>'
> Cc: llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>; nd; Kristof Beyls
> Subject: RE: [llvm-dev] [GlobalISel][AArch64] Toward flipping the switch for O0: Please give it a try!
>
> Hi Quentin,
>
> At Kristof’s suggestion, I tried running our ABI test suite for a big-endian AArch64 target, and this found an ABI mismatch between global-isel and regular -O0. Here’s a reproducer for the first one I’ve investigated:
>
> struct foo {
> float first;
> float second;
> };
> float get_first(foo p) {
> return p.first;
> }
>
> This is the code that global-isel currently generates:
> // /work/llvm/build/bin/clang --target=aarch64--none-eabi -march=armv8-a -c callees.cpp -O0 -mllvm -global-isel=true -mllvm -global-isel-abort=0 -mbig-endian -o - -S
>
> _Z9get_first3foo: // @_Z9get_first3foo
> // BB#0: // %entry
> sub sp, sp, #16 // =16
> // implicit-def: %X8
> fmov w9, s0
> mov w10, w9
> bfxil x8, x10, #0, #32
> fmov w9, s1
> mov w10, w9
> bfi x8, x10, #32, #32
> add x10, sp, #8 // =8
> str x8, [sp, #8]
> ldr w9, [x10]
> fmov s0, w9
> add sp, sp, #16 // =16
> ret
>
> When run on a big-endian target, this incorrectly returns the second member of the struct, instead of the first.
>
> Oliver
>
> From: qcolombet at apple.com <mailto:qcolombet at apple.com> [mailto:qcolombet at apple.com <mailto:qcolombet at apple.com>]
> Sent: 14 November 2017 23:11
> To: Quentin Colombet
> Cc: Oliver Stannard; llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>; Justin Bogner; Ahmed Bougacha; Aditya Nandakumar; nd
> Subject: Re: [llvm-dev] [GlobalISel][AArch64] Toward flipping the switch for O0: Please give it a try!
>
> To give an update here, we actually are not missing a mapping. The code complains because we are copying around a fp16 into a gpr32 and that shouldn’t be done with a copy (default mapping).
> I extended the repairing code to issue G_ANYEXT in those cases instead of asserting.
>
> However, now, I have to teach instruction select about those ANYEXT otherwise we’ll fallback in that case. But that’s a different story.
>
> I’ll try to commit today or tomorrow (I have to strengthen the tests).
>
> On Nov 14, 2017, at 9:29 AM, Quentin Colombet via llvm-dev <llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>> wrote:
>
> Thanks Oliver.
> I’ll have a look. This typically means that we miss a mapping for this type/instruction, which is not surprising given how little code we have we fp16.
>
>
> On Nov 14, 2017, at 2:27 AM, Oliver Stannard <oliver.stannard at arm.com <mailto:oliver.stannard at arm.com>> wrote:
>
> Hi Quentin,
>
> I’ve started running an ABI test suite with global isel on AArch64, and while it hasn’t found any ABI issues it has hit an assertion in clang when using the __fp16 type. Here’s a reproducer:
>
> __fp16 pass_f16(__fp16 p) {
> return p;
> }
>
> $ /work/llvm/build/bin/clang --target=aarch64-arm-none-eabi -march=armv8-a -c test.c -O0 -mllvm -global-isel -mllvm -global-isel-abort=0
> clang-6.0: /work/llvm/llvm/lib/CodeGen/GlobalISel/RegisterBankInfo.cpp:446: static void llvm::RegisterBankInfo::applyDefaultMapping(const llvm::RegisterBankInfo::OperandsMapper &): Assertion `OrigTy.getSizeInBits() == NewTy.getSizeInBits() && "Types with difference size cannot be handled by the default " "mapping"' failed.
> #0 0x000000000362a764 PrintStackTraceSignalHandler(void*) (/work/llvm/build/bin/clang-6.0+0x362a764)
> #1 0x000000000362aac6 SignalHandler(int) (/work/llvm/build/bin/clang-6.0+0x362aac6)
> #2 0x00007f9193b78330 __restore_rt (/lib/x86_64-linux-gnu/libpthread.so.0+0x10330)
> #3 0x00007f919276bc37 gsignal /build/eglibc-oGUzwX/eglibc-2.19/signal/../nptl/sysdeps/unix/sysv/linux/raise.c:56:0
> #4 0x00007f919276f028 abort /build/eglibc-oGUzwX/eglibc-2.19/stdlib/abort.c:91:0
> #5 0x00007f9192764bf6 __assert_fail_base /build/eglibc-oGUzwX/eglibc-2.19/assert/assert.c:92:0
> #6 0x00007f9192764ca2 (/lib/x86_64-linux-gnu/libc.so.6+0x2fca2)
> #7 0x0000000003d70eb9 (/work/llvm/build/bin/clang-6.0+0x3d70eb9)
> #8 0x0000000003d6b00c llvm::RegBankSelect::applyMapping(llvm::MachineInstr&, llvm::RegisterBankInfo::InstructionMapping const&, llvm::SmallVectorImpl<llvm::RegBankSelect::RepairingPlacement>&) (/work/llvm/build/bin/clang-6.0+0x3d6b00c)
> #9 0x0000000003d6b366 llvm::RegBankSelect::assignInstr(llvm::MachineInstr&) (/work/llvm/build/bin/clang-6.0+0x3d6b366)
> #10 0x0000000003d6b7f1 llvm::RegBankSelect::runOnMachineFunction(llvm::MachineFunction&) (/work/llvm/build/bin/clang-6.0+0x3d6b7f1)
> #11 0x0000000002d934c8 llvm::MachineFunctionPass::runOnFunction(llvm::Function&) (/work/llvm/build/bin/clang-6.0+0x2d934c8)
> #12 0x00000000030c998f llvm::FPPassManager::runOnFunction(llvm::Function&) (/work/llvm/build/bin/clang-6.0+0x30c998f)
> #13 0x00000000030c9c53 llvm::FPPassManager::runOnModule(llvm::Module&) (/work/llvm/build/bin/clang-6.0+0x30c9c53)
> #14 0x00000000030ca136 llvm::legacy::PassManagerImpl::run(llvm::Module&) (/work/llvm/build/bin/clang-6.0+0x30ca136)
> #15 0x00000000037c3dcf clang::EmitBackendOutput(clang::DiagnosticsEngine&, clang::HeaderSearchOptions const&, clang::CodeGenOptions const&, clang::TargetOptions const&, clang::LangOptions const&, llvm::DataLayout const&, llvm::Module*, clang::BackendAction, std::unique_ptr<llvm::raw_pwrite_stream, std::default_delete<llvm::raw_pwrite_stream> >) (/work/llvm/build/bin/clang-6.0+0x37c3dcf)
> #16 0x0000000003d421a0 clang::BackendConsumer::HandleTranslationUnit(clang::ASTContext&) (/work/llvm/build/bin/clang-6.0+0x3d421a0)
> #17 0x0000000004457376 clang::ParseAST(clang::Sema&, bool, bool) (/work/llvm/build/bin/clang-6.0+0x4457376)
> #18 0x0000000003ca6ea0 clang::FrontendAction::Execute() (/work/llvm/build/bin/clang-6.0+0x3ca6ea0)
> #19 0x0000000003c1fa31 clang::CompilerInstance::ExecuteAction(clang::FrontendAction&) (/work/llvm/build/bin/clang-6.0+0x3c1fa31)
> #20 0x0000000003d3bf4b clang::ExecuteCompilerInvocation(clang::CompilerInstance*) (/work/llvm/build/bin/clang-6.0+0x3d3bf4b)
> #21 0x0000000001f85629 cc1_main(llvm::ArrayRef<char const*>, char const*, void*) (/work/llvm/build/bin/clang-6.0+0x1f85629)
> #22 0x0000000001f83096 main (/work/llvm/build/bin/clang-6.0+0x1f83096)
> #23 0x00007f9192756f45 __libc_start_main /build/eglibc-oGUzwX/eglibc-2.19/csu/libc-start.c:321:0
> #24 0x0000000001f80029 _start (/work/llvm/build/bin/clang-6.0+0x1f80029)
> Stack dump:
> 0. Program arguments: /work/llvm/build/bin/clang-6.0 -cc1 -triple aarch64-arm-none-eabi -emit-obj -mrelax-all -disable-free -main-file-name test.c -mrelocation-model static -mthread-model posix -mdisable-fp-elim -fmath-errno -masm-verbose -mconstructor-aliases -fuse-init-array -target-cpu generic -target-feature +neon -target-abi aapcs -fallow-half-arguments-and-returns -dwarf-column-info -debugger-tuning=gdb -coverage-notes-file /work/innovation/cctest/test.gcno -resource-dir /work/llvm/build/lib/clang/6.0.0 -O0 -fdebug-compilation-dir /work/innovation/cctest -ferror-limit 19 -fmessage-length 226 -fno-signed-char -fobjc-runtime=gcc -fdiagnostics-show-option -fcolor-diagnostics -mllvm -global-isel -mllvm -global-isel-abort=0 -o test.o -x c test.c
> 1. <eof> parser at end of file
> 2. Code generation
> 3. Running pass 'Function Pass Manager' on module 'test.c'.
> 4. Running pass 'RegBankSelect' on function '@pass_f16'
> clang-6.0: error: unable to execute command: Aborted (core dumped)
> clang-6.0: error: clang frontend command failed due to signal (use -v to see invocation)
> clang version 6.0.0 (ssh://olista01@ds-gerrit.euhpc.arm.com:29418/armcompiler/clang <ssh://olista01@ds-gerrit.euhpc.arm.com:29418/armcompiler/clang> aa2b9952ef98a5fe2d47384ef17106855b8bae51) (ssh://olista01@ds-gerrit.euhpc.arm.com:29418/armcompiler/llvm <ssh://olista01@ds-gerrit.euhpc.arm.com:29418/armcompiler/llvm> 29f89772107a79b5f2a816d4748ed9c19416c1b6)
> Target: aarch64-arm-none-eabi
> Thread model: posix
> InstalledDir: /work/llvm/build/bin
> clang-6.0: note: diagnostic msg: PLEASE submit a bug report to http://llvm.org/bugs/ <http://llvm.org/bugs/> and include the crash backtrace, preprocessed source, and associated run script.
> clang-6.0: note: diagnostic msg:
> ********************
>
> PLEASE ATTACH THE FOLLOWING FILES TO THE BUG REPORT:
> Preprocessed source(s) and associated run script(s) are located at:
> clang-6.0: note: diagnostic msg: /tmp/test-e06964.c
> clang-6.0: note: diagnostic msg: /tmp/test-e06964.sh
> clang-6.0: note: diagnostic msg:
>
> ********************
>
> Oliver
>
> From: llvm-dev [mailto:llvm-dev-bounces at lists.llvm.org <mailto:llvm-dev-bounces at lists.llvm.org>] On Behalf Of Quentin Colombet via llvm-dev
> Sent: 13 November 2017 18:27
> To: Kristof Beyls
> Cc: llvm-dev; nd; Ahmed Bougacha; Justin Bogner; Aditya Nandakumar
> Subject: Re: [llvm-dev] [GlobalISel][AArch64] Toward flipping the switch for O0: Please give it a try!
>
> Hi Kristof,
>
>
> On Nov 13, 2017, at 9:10 AM, Kristof Beyls <Kristof.Beyls at arm.com <mailto:Kristof.Beyls at arm.com>> wrote:
>
> Hi Quentin,
>
> My only remaining concern is around ABI compatibility.
> The following commit seems to indicate that in the previous round of evaluation, we didn’t find an existing ABI compatibility issue:
> http://llvm.org/viewvc/llvm-project?view=revision&revision=311388 <http://llvm.org/viewvc/llvm-project?view=revision&revision=311388>.
> I haven’t looked into the details of this issue - so maybe I’m worried over nothing?
>
> No, you’re right. The problem with ABI is if you are consistently wrong, then you won’t notice :).
>
>
>
> I’m wondering if since then on your side you did any testing around ABI compatibility?
> E.g. building software where you semi-randomly build some functions through GlobalISel and some functions through DAGISel?
>
> Justin will look into that. Clang has utility script for that utils/ABITest.
>
> Given we will only be able to check iOS ABI, you may want to follow the same kind of validation on your side.
>
> I let you sync up with Justin for the method.
>
> Cheers,
> -Quentin
>
>
>
> Thanks,
>
> Kristof
>
> On 8 Nov 2017, at 00:42, Quentin Colombet via llvm-dev <llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>> wrote:
>
> Hi all,
>
> I’d like to resurrect this thread and ask if people are on board for enabling this by default for AArch64 O0.
>
>
> *** What Changed Since June? ***
>
> - We added a way to describe the legalization actions for non-power-of-2
> - We gave a tutorial that covers the best practices to target GlobalISel
> - We improved the TableGen backend to reuse existing SDISel patterns
> - We built and ran huge internal software with GISel
> - We evaluated the performance of GISel and are confident things are in a good shape (with https://reviews.llvm.org/D39034 <https://reviews.llvm.org/D39034>) and moving forward would look even better (see the last LLVM Dev talk: GlobalISel: Present, Past, and Future when it is available)
>
>
> *** So What’s he Plan? ***
>
> - Switch the default instruction selector to GISel for AArch64 at O0
> - Enable the fallback path by default for AArch64 (with warnings enabled when that path is hit)
> - Provide a clang option to turn GISel off
>
> What do you think?
>
> Thanks,
> -Quentin
>
>
> On Jun 16, 2017, at 4:43 PM, Quentin Colombet <qcolombet at apple.com <mailto:qcolombet at apple.com>> wrote:
>
> Hi all,
>
> We had some internal discussions about flipping the default for O0 and we concluded that we wanted to postpone it.
>
>
> *** Why Is That? ***
>
> We don’t want to send the wrong message that GlobalISel’s design is set in stone and ready for broader adoption.
> In particular,
> 1. The APIs are still evolving and can still possibly change significantly
> 2. The TableGen backend to reuse the existing SD patterns is still at its early stage
> 3. We want to investigate closely the performance of global-isel (compile-time, runtime, code size, fallbacks)
>
> The rationale behind those items is that we want to minimize the pain of moving forward for everybody. We also want the out-of-the-box experience to be pleasant (like all/most of the tablegen patterns just work, we have documentation on how to target a new backend, etc.) Finally, we want to gain confidence we are going to be able to address the performance issues we have with the current design and if not, derive a plan for that.
>
> We purposely left out of the conversation what will be the right time and requirements to flip the switch. We want to gather more data first. Your help would be appreciated!
>
>
> *** Short-Term Proposal ***
>
> What we would like to do instead short-term is:
> A. Repurpose or create an option “-aarch64-enable-global-isel-at-O” to enable GISel with fallbacks and warnings enables (i.e., equivalent of -global-isel -global-isel-abort=2)
> B. Advertise this option in the next open source release to allow compiler enthusiastic to try it and report problems
> C. Have GISel always built so we can push thing in the right place, MachineVerifier in mind, and stop doing some weird gymnastic
>
> What do people think?
>
>
> *** Your Help Is Needed ***
>
> - Please share your experience in using the GISel APIs and how we can make them better. Moving forward we’ll have those conversations on open source instead of internally/with a narrower audience.
> - Report any performance problem you identify
> - Propose patches!
>
> Cheers,
> -Quentin
>
>
>
> On Jun 16, 2017, at 3:06 PM, Quentin Colombet via llvm-dev <llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>> wrote:
>
>
> On Jun 14, 2017, at 7:27 AM, Diana Picus <diana.picus at linaro.org <mailto:diana.picus at linaro.org>> wrote:
>
> On 12 June 2017 at 18:54, Diana Picus <diana.picus at linaro.org <mailto:diana.picus at linaro.org>> wrote:
> Hi all,
>
> I added a buildbot [1] running the test-suite with -O0 -global-isel. It runs into the same 2 timeouts that I reported previously on this thread (paq8p and scimark2). It would be nice to make it green before flipping the switch.
>
>
> I did some more investigations on a machine similar to the one running the buildbot. For paq8p and scimark2, I get these results for O0:
>
> PAQ8p:
> Fast isel: 666.344
> Global isel: 731.384
>
> SciMark2-C:
> Fast isel: 463.908
> Global isel: 496.22
>
> The current timeout is 500s (so in this particular case we didn't hit it for scimark2, and it ran successfully to completion). I don't think the difference between FastISel and GlobalISel is too atrocious, so I would propose increasing the timeout for these 2 benchmarks. I'm not sure if we can do this on a per-bot basis, but I see some precedent for setting custom timeout thresholds for various benchmarks on different architectures (sometimes with comments that it's done so we can run O0 on that particular benchmark).
>
> Something along these lines works:
> https://reviews.llvm.org/differential/diff/102547/ <https://reviews.llvm.org/differential/diff/102547/>
>
> What do you guys think about this approach?
>
> Looks reasonable to me.
>
>
>
> Thanks,
> Diana
>
> PS: The buildbot is using the Makefiles because that's what our other AArch64 test-suite bots use. Moving all of them to CMake is a transition for another time.
>
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev <http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev>
>
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev <http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20171127/6c0de432/attachment-0001.html>
More information about the llvm-dev
mailing list