[llvm-dev] [GlobalISel][AArch64] Toward flipping the switch for O0: Please give it a try!
Amara Emerson via llvm-dev
llvm-dev at lists.llvm.org
Mon Dec 11 09:08:36 PST 2017
As of r320388 we’ve either fixed the blocker bugs or disabled big-endian on GISel, falling back to SDAG. Fixing at least one of the big-endian issues will need use to change the way we handle aggregates, which will take a bit longer (it’s next on my list of things to do).
Do we have any other issues preventing us flipping the switch?
Amara
> On Nov 27, 2017, at 5:44 PM, Quentin Colombet <qcolombet at apple.com> wrote:
>
> Thanks all.
>
> Amara, could you take a look?
>
>> On Nov 20, 2017, at 3:06 AM, Oliver Stannard <oliver.stannard at arm.com <mailto:oliver.stannard at arm.com>> wrote:
>>
>> Hi Quentin,
>>
>> I’ve raised:
>> https://bugs.llvm.org/show_bug.cgi?id=35359 <https://bugs.llvm.org/show_bug.cgi?id=35359>
>> https://bugs.llvm.org/show_bug.cgi?id=35360 <https://bugs.llvm.org/show_bug.cgi?id=35360>
>> https://bugs.llvm.org/show_bug.cgi?id=35361 <https://bugs.llvm.org/show_bug.cgi?id=35361>
>>
>> I also left the test suite running over the weekend in little-endian mode with the __fp16 type disabled, and it didn’t find any bugs there.
>>
>> Oliver
>>
>> From: qcolombet at apple.com <mailto:qcolombet at apple.com> [mailto:qcolombet at apple.com <mailto:qcolombet at apple.com>]
>> Sent: 17 November 2017 17:28
>> To: Oliver Stannard
>> Cc: llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>; nd; Kristof Beyls
>> Subject: Re: [llvm-dev] [GlobalISel][AArch64] Toward flipping the switch for O0: Please give it a try!
>>
>> Hi Oliver,
>>
>> Thanks for trying this.
>> Could you file a different PR for each of the problem you found and reference the umbrella PR: http://llvm.org/PR35347? <http://llvm.org/PR35347?>
>>
>> Thanks,
>> -Quentin
>>
>>
>> On Nov 17, 2017, at 8:17 AM, Oliver Stannard <oliver.stannard at arm.com <mailto:oliver.stannard at arm.com>> wrote:
>>
>> Hi Quentin,
>>
>> One more reproducer, this time with small (<64bit) values being passed on the stack:
>>
>> int foo(int x0, int x1, int x2, int x3, int x4, int x5, int x6, int x7,
>> int stack1) {
>> return stack1;
>> }
>>
>> int main() {
>> int ret = foo(0,1,2,3,4,5,6,7,8);
>> printf("%d\n", ret);
>> }
>>
>> Global isel thinks that the incoming value of stack1 is stored in bytes [0,4) above SP, but for big-endian targets this should be in bytes [4,8):
>>
>> // /work/llvm/build/bin/clang --target=aarch64-arm-none-eabi -march=armv8-a -c callees.cpp -O0 -Wall -std=c++11 -mllvm -global-isel=true -mllvm -global-isel-abort=0 -mbig-endian -o - -S
>> _Z3fooiiiiiiiii: // @_Z3fooiiiiiiiii
>> // BB#0: // %entry
>> sub sp, sp, #48 // =48
>> ldr w8, [sp, #48] // <= Should be [sp, #52]
>> str w0, [sp, #44]
>> str w1, [sp, #40]
>> str w2, [sp, #36]
>> str w3, [sp, #32]
>> str w4, [sp, #28]
>> str w5, [sp, #24]
>> str w6, [sp, #20]
>> str w7, [sp, #16]
>> str w8, [sp, #12]
>> ldr w0, [sp, #12]
>> add sp, sp, #48 // =48
>> ret
>>
>> Oliver
>>
>> From: Oliver Stannard
>> Sent: 17 November 2017 14:57
>> To: 'qcolombet at apple.com <mailto:qcolombet at apple.com>'
>> Cc: 'llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>'; nd; Kristof Beyls
>> Subject: RE: [llvm-dev] [GlobalISel][AArch64] Toward flipping the switch for O0: Please give it a try!
>>
>> Hi Quentin,
>>
>> It seems that we also get the calling convention wrong for vector types on big-endian:
>> #include <arm_neon.h>
>> int32x2_t load_vector(int32x2_t *p) {
>> return *p;
>> }
>>
>> Global-isel generates this:
>> // armclang --target=aarch64-arm-none-eabi -march=armv8-a -c callees.cpp -O0 -Wall -std=c++11 -mllvm -global-isel=true -mllvm -global-isel-abort=0 -mbig-endian -o - -S
>> _Z11load_vectorP11__Int32x2_t: // @_Z11load_vectorP11__Int32x2_t
>> // BB#0: // %entry
>> sub sp, sp, #16 // =16
>> str x0, [sp, #8]
>> ldr x0, [sp, #8]
>> ld1 { v0.2s }, [x0]
>> add sp, sp, #16 // =16
>> ret
>>
>> With global-isel off, there is a rev64 instruction between the ld1 and the add, which fixes up the endianness of the vector.
>>
>> Oliver
>>
>> From: Oliver Stannard
>> Sent: 17 November 2017 13:32
>> To: 'qcolombet at apple.com <mailto:qcolombet at apple.com>'
>> Cc: llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>; nd; Kristof Beyls
>> Subject: RE: [llvm-dev] [GlobalISel][AArch64] Toward flipping the switch for O0: Please give it a try!
>>
>> Hi Quentin,
>>
>> At Kristof’s suggestion, I tried running our ABI test suite for a big-endian AArch64 target, and this found an ABI mismatch between global-isel and regular -O0. Here’s a reproducer for the first one I’ve investigated:
>>
>> struct foo {
>> float first;
>> float second;
>> };
>> float get_first(foo p) {
>> return p.first;
>> }
>>
>> This is the code that global-isel currently generates:
>> // /work/llvm/build/bin/clang --target=aarch64--none-eabi -march=armv8-a -c callees.cpp -O0 -mllvm -global-isel=true -mllvm -global-isel-abort=0 -mbig-endian -o - -S
>>
>> _Z9get_first3foo: // @_Z9get_first3foo
>> // BB#0: // %entry
>> sub sp, sp, #16 // =16
>> // implicit-def: %X8
>> fmov w9, s0
>> mov w10, w9
>> bfxil x8, x10, #0, #32
>> fmov w9, s1
>> mov w10, w9
>> bfi x8, x10, #32, #32
>> add x10, sp, #8 // =8
>> str x8, [sp, #8]
>> ldr w9, [x10]
>> fmov s0, w9
>> add sp, sp, #16 // =16
>> ret
>>
>> When run on a big-endian target, this incorrectly returns the second member of the struct, instead of the first.
>>
>> Oliver
>>
>> From: qcolombet at apple.com <mailto:qcolombet at apple.com> [mailto:qcolombet at apple.com <mailto:qcolombet at apple.com>]
>> Sent: 14 November 2017 23:11
>> To: Quentin Colombet
>> Cc: Oliver Stannard; llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>; Justin Bogner; Ahmed Bougacha; Aditya Nandakumar; nd
>> Subject: Re: [llvm-dev] [GlobalISel][AArch64] Toward flipping the switch for O0: Please give it a try!
>>
>> To give an update here, we actually are not missing a mapping. The code complains because we are copying around a fp16 into a gpr32 and that shouldn’t be done with a copy (default mapping).
>> I extended the repairing code to issue G_ANYEXT in those cases instead of asserting.
>>
>> However, now, I have to teach instruction select about those ANYEXT otherwise we’ll fallback in that case. But that’s a different story.
>>
>> I’ll try to commit today or tomorrow (I have to strengthen the tests).
>>
>> On Nov 14, 2017, at 9:29 AM, Quentin Colombet via llvm-dev <llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>> wrote:
>>
>> Thanks Oliver.
>> I’ll have a look. This typically means that we miss a mapping for this type/instruction, which is not surprising given how little code we have we fp16.
>>
>>
>> On Nov 14, 2017, at 2:27 AM, Oliver Stannard <oliver.stannard at arm.com <mailto:oliver.stannard at arm.com>> wrote:
>>
>> Hi Quentin,
>>
>> I’ve started running an ABI test suite with global isel on AArch64, and while it hasn’t found any ABI issues it has hit an assertion in clang when using the __fp16 type. Here’s a reproducer:
>>
>> __fp16 pass_f16(__fp16 p) {
>> return p;
>> }
>>
>> $ /work/llvm/build/bin/clang --target=aarch64-arm-none-eabi -march=armv8-a -c test.c -O0 -mllvm -global-isel -mllvm -global-isel-abort=0
>> clang-6.0: /work/llvm/llvm/lib/CodeGen/GlobalISel/RegisterBankInfo.cpp:446: static void llvm::RegisterBankInfo::applyDefaultMapping(const llvm::RegisterBankInfo::OperandsMapper &): Assertion `OrigTy.getSizeInBits() == NewTy.getSizeInBits() && "Types with difference size cannot be handled by the default " "mapping"' failed.
>> #0 0x000000000362a764 PrintStackTraceSignalHandler(void*) (/work/llvm/build/bin/clang-6.0+0x362a764)
>> #1 0x000000000362aac6 SignalHandler(int) (/work/llvm/build/bin/clang-6.0+0x362aac6)
>> #2 0x00007f9193b78330 __restore_rt (/lib/x86_64-linux-gnu/libpthread.so.0+0x10330)
>> #3 0x00007f919276bc37 gsignal /build/eglibc-oGUzwX/eglibc-2.19/signal/../nptl/sysdeps/unix/sysv/linux/raise.c:56:0
>> #4 0x00007f919276f028 abort /build/eglibc-oGUzwX/eglibc-2.19/stdlib/abort.c:91:0
>> #5 0x00007f9192764bf6 __assert_fail_base /build/eglibc-oGUzwX/eglibc-2.19/assert/assert.c:92:0
>> #6 0x00007f9192764ca2 (/lib/x86_64-linux-gnu/libc.so.6+0x2fca2)
>> #7 0x0000000003d70eb9 (/work/llvm/build/bin/clang-6.0+0x3d70eb9)
>> #8 0x0000000003d6b00c llvm::RegBankSelect::applyMapping(llvm::MachineInstr&, llvm::RegisterBankInfo::InstructionMapping const&, llvm::SmallVectorImpl<llvm::RegBankSelect::RepairingPlacement>&) (/work/llvm/build/bin/clang-6.0+0x3d6b00c)
>> #9 0x0000000003d6b366 llvm::RegBankSelect::assignInstr(llvm::MachineInstr&) (/work/llvm/build/bin/clang-6.0+0x3d6b366)
>> #10 0x0000000003d6b7f1 llvm::RegBankSelect::runOnMachineFunction(llvm::MachineFunction&) (/work/llvm/build/bin/clang-6.0+0x3d6b7f1)
>> #11 0x0000000002d934c8 llvm::MachineFunctionPass::runOnFunction(llvm::Function&) (/work/llvm/build/bin/clang-6.0+0x2d934c8)
>> #12 0x00000000030c998f llvm::FPPassManager::runOnFunction(llvm::Function&) (/work/llvm/build/bin/clang-6.0+0x30c998f)
>> #13 0x00000000030c9c53 llvm::FPPassManager::runOnModule(llvm::Module&) (/work/llvm/build/bin/clang-6.0+0x30c9c53)
>> #14 0x00000000030ca136 llvm::legacy::PassManagerImpl::run(llvm::Module&) (/work/llvm/build/bin/clang-6.0+0x30ca136)
>> #15 0x00000000037c3dcf clang::EmitBackendOutput(clang::DiagnosticsEngine&, clang::HeaderSearchOptions const&, clang::CodeGenOptions const&, clang::TargetOptions const&, clang::LangOptions const&, llvm::DataLayout const&, llvm::Module*, clang::BackendAction, std::unique_ptr<llvm::raw_pwrite_stream, std::default_delete<llvm::raw_pwrite_stream> >) (/work/llvm/build/bin/clang-6.0+0x37c3dcf)
>> #16 0x0000000003d421a0 clang::BackendConsumer::HandleTranslationUnit(clang::ASTContext&) (/work/llvm/build/bin/clang-6.0+0x3d421a0)
>> #17 0x0000000004457376 clang::ParseAST(clang::Sema&, bool, bool) (/work/llvm/build/bin/clang-6.0+0x4457376)
>> #18 0x0000000003ca6ea0 clang::FrontendAction::Execute() (/work/llvm/build/bin/clang-6.0+0x3ca6ea0)
>> #19 0x0000000003c1fa31 clang::CompilerInstance::ExecuteAction(clang::FrontendAction&) (/work/llvm/build/bin/clang-6.0+0x3c1fa31)
>> #20 0x0000000003d3bf4b clang::ExecuteCompilerInvocation(clang::CompilerInstance*) (/work/llvm/build/bin/clang-6.0+0x3d3bf4b)
>> #21 0x0000000001f85629 cc1_main(llvm::ArrayRef<char const*>, char const*, void*) (/work/llvm/build/bin/clang-6.0+0x1f85629)
>> #22 0x0000000001f83096 main (/work/llvm/build/bin/clang-6.0+0x1f83096)
>> #23 0x00007f9192756f45 __libc_start_main /build/eglibc-oGUzwX/eglibc-2.19/csu/libc-start.c:321:0
>> #24 0x0000000001f80029 _start (/work/llvm/build/bin/clang-6.0+0x1f80029)
>> Stack dump:
>> 0. Program arguments: /work/llvm/build/bin/clang-6.0 -cc1 -triple aarch64-arm-none-eabi -emit-obj -mrelax-all -disable-free -main-file-name test.c -mrelocation-model static -mthread-model posix -mdisable-fp-elim -fmath-errno -masm-verbose -mconstructor-aliases -fuse-init-array -target-cpu generic -target-feature +neon -target-abi aapcs -fallow-half-arguments-and-returns -dwarf-column-info -debugger-tuning=gdb -coverage-notes-file /work/innovation/cctest/test.gcno -resource-dir /work/llvm/build/lib/clang/6.0.0 -O0 -fdebug-compilation-dir /work/innovation/cctest -ferror-limit 19 -fmessage-length 226 -fno-signed-char -fobjc-runtime=gcc -fdiagnostics-show-option -fcolor-diagnostics -mllvm -global-isel -mllvm -global-isel-abort=0 -o test.o -x c test.c
>> 1. <eof> parser at end of file
>> 2. Code generation
>> 3. Running pass 'Function Pass Manager' on module 'test.c'.
>> 4. Running pass 'RegBankSelect' on function '@pass_f16'
>> clang-6.0: error: unable to execute command: Aborted (core dumped)
>> clang-6.0: error: clang frontend command failed due to signal (use -v to see invocation)
>> clang version 6.0.0 (ssh://olista01@ds-gerrit.euhpc.arm.com:29418/armcompiler/clang <ssh://olista01@ds-gerrit.euhpc.arm.com:29418/armcompiler/clang> aa2b9952ef98a5fe2d47384ef17106855b8bae51) (ssh://olista01@ds-gerrit.euhpc.arm.com:29418/armcompiler/llvm <ssh://olista01@ds-gerrit.euhpc.arm.com:29418/armcompiler/llvm> 29f89772107a79b5f2a816d4748ed9c19416c1b6)
>> Target: aarch64-arm-none-eabi
>> Thread model: posix
>> InstalledDir: /work/llvm/build/bin
>> clang-6.0: note: diagnostic msg: PLEASE submit a bug report to http://llvm.org/bugs/ <http://llvm.org/bugs/> and include the crash backtrace, preprocessed source, and associated run script.
>> clang-6.0: note: diagnostic msg:
>> ********************
>>
>> PLEASE ATTACH THE FOLLOWING FILES TO THE BUG REPORT:
>> Preprocessed source(s) and associated run script(s) are located at:
>> clang-6.0: note: diagnostic msg: /tmp/test-e06964.c
>> clang-6.0: note: diagnostic msg: /tmp/test-e06964.sh
>> clang-6.0: note: diagnostic msg:
>>
>> ********************
>>
>> Oliver
>>
>> From: llvm-dev [mailto:llvm-dev-bounces at lists.llvm.org <mailto:llvm-dev-bounces at lists.llvm.org>] On Behalf Of Quentin Colombet via llvm-dev
>> Sent: 13 November 2017 18:27
>> To: Kristof Beyls
>> Cc: llvm-dev; nd; Ahmed Bougacha; Justin Bogner; Aditya Nandakumar
>> Subject: Re: [llvm-dev] [GlobalISel][AArch64] Toward flipping the switch for O0: Please give it a try!
>>
>> Hi Kristof,
>>
>>
>> On Nov 13, 2017, at 9:10 AM, Kristof Beyls <Kristof.Beyls at arm.com <mailto:Kristof.Beyls at arm.com>> wrote:
>>
>> Hi Quentin,
>>
>> My only remaining concern is around ABI compatibility.
>> The following commit seems to indicate that in the previous round of evaluation, we didn’t find an existing ABI compatibility issue:
>> http://llvm.org/viewvc/llvm-project?view=revision&revision=311388 <http://llvm.org/viewvc/llvm-project?view=revision&revision=311388>.
>> I haven’t looked into the details of this issue - so maybe I’m worried over nothing?
>>
>> No, you’re right. The problem with ABI is if you are consistently wrong, then you won’t notice :).
>>
>>
>>
>> I’m wondering if since then on your side you did any testing around ABI compatibility?
>> E.g. building software where you semi-randomly build some functions through GlobalISel and some functions through DAGISel?
>>
>> Justin will look into that. Clang has utility script for that utils/ABITest.
>>
>> Given we will only be able to check iOS ABI, you may want to follow the same kind of validation on your side.
>>
>> I let you sync up with Justin for the method.
>>
>> Cheers,
>> -Quentin
>>
>>
>>
>> Thanks,
>>
>> Kristof
>>
>> On 8 Nov 2017, at 00:42, Quentin Colombet via llvm-dev <llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>> wrote:
>>
>> Hi all,
>>
>> I’d like to resurrect this thread and ask if people are on board for enabling this by default for AArch64 O0.
>>
>>
>> *** What Changed Since June? ***
>>
>> - We added a way to describe the legalization actions for non-power-of-2
>> - We gave a tutorial that covers the best practices to target GlobalISel
>> - We improved the TableGen backend to reuse existing SDISel patterns
>> - We built and ran huge internal software with GISel
>> - We evaluated the performance of GISel and are confident things are in a good shape (with https://reviews.llvm.org/D39034 <https://reviews.llvm.org/D39034>) and moving forward would look even better (see the last LLVM Dev talk: GlobalISel: Present, Past, and Future when it is available)
>>
>>
>> *** So What’s he Plan? ***
>>
>> - Switch the default instruction selector to GISel for AArch64 at O0
>> - Enable the fallback path by default for AArch64 (with warnings enabled when that path is hit)
>> - Provide a clang option to turn GISel off
>>
>> What do you think?
>>
>> Thanks,
>> -Quentin
>>
>>
>> On Jun 16, 2017, at 4:43 PM, Quentin Colombet <qcolombet at apple.com <mailto:qcolombet at apple.com>> wrote:
>>
>> Hi all,
>>
>> We had some internal discussions about flipping the default for O0 and we concluded that we wanted to postpone it.
>>
>>
>> *** Why Is That? ***
>>
>> We don’t want to send the wrong message that GlobalISel’s design is set in stone and ready for broader adoption.
>> In particular,
>> 1. The APIs are still evolving and can still possibly change significantly
>> 2. The TableGen backend to reuse the existing SD patterns is still at its early stage
>> 3. We want to investigate closely the performance of global-isel (compile-time, runtime, code size, fallbacks)
>>
>> The rationale behind those items is that we want to minimize the pain of moving forward for everybody. We also want the out-of-the-box experience to be pleasant (like all/most of the tablegen patterns just work, we have documentation on how to target a new backend, etc.) Finally, we want to gain confidence we are going to be able to address the performance issues we have with the current design and if not, derive a plan for that.
>>
>> We purposely left out of the conversation what will be the right time and requirements to flip the switch. We want to gather more data first. Your help would be appreciated!
>>
>>
>> *** Short-Term Proposal ***
>>
>> What we would like to do instead short-term is:
>> A. Repurpose or create an option “-aarch64-enable-global-isel-at-O” to enable GISel with fallbacks and warnings enables (i.e., equivalent of -global-isel -global-isel-abort=2)
>> B. Advertise this option in the next open source release to allow compiler enthusiastic to try it and report problems
>> C. Have GISel always built so we can push thing in the right place, MachineVerifier in mind, and stop doing some weird gymnastic
>>
>> What do people think?
>>
>>
>> *** Your Help Is Needed ***
>>
>> - Please share your experience in using the GISel APIs and how we can make them better. Moving forward we’ll have those conversations on open source instead of internally/with a narrower audience.
>> - Report any performance problem you identify
>> - Propose patches!
>>
>> Cheers,
>> -Quentin
>>
>>
>>
>> On Jun 16, 2017, at 3:06 PM, Quentin Colombet via llvm-dev <llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>> wrote:
>>
>>
>> On Jun 14, 2017, at 7:27 AM, Diana Picus <diana.picus at linaro.org <mailto:diana.picus at linaro.org>> wrote:
>>
>> On 12 June 2017 at 18:54, Diana Picus <diana.picus at linaro.org <mailto:diana.picus at linaro.org>> wrote:
>> Hi all,
>>
>> I added a buildbot [1] running the test-suite with -O0 -global-isel. It runs into the same 2 timeouts that I reported previously on this thread (paq8p and scimark2). It would be nice to make it green before flipping the switch.
>>
>>
>> I did some more investigations on a machine similar to the one running the buildbot. For paq8p and scimark2, I get these results for O0:
>>
>> PAQ8p:
>> Fast isel: 666.344
>> Global isel: 731.384
>>
>> SciMark2-C:
>> Fast isel: 463.908
>> Global isel: 496.22
>>
>> The current timeout is 500s (so in this particular case we didn't hit it for scimark2, and it ran successfully to completion). I don't think the difference between FastISel and GlobalISel is too atrocious, so I would propose increasing the timeout for these 2 benchmarks. I'm not sure if we can do this on a per-bot basis, but I see some precedent for setting custom timeout thresholds for various benchmarks on different architectures (sometimes with comments that it's done so we can run O0 on that particular benchmark).
>>
>> Something along these lines works:
>> https://reviews.llvm.org/differential/diff/102547/ <https://reviews.llvm.org/differential/diff/102547/>
>>
>> What do you guys think about this approach?
>>
>> Looks reasonable to me.
>>
>>
>>
>> Thanks,
>> Diana
>>
>> PS: The buildbot is using the Makefiles because that's what our other AArch64 test-suite bots use. Moving all of them to CMake is a transition for another time.
>>
>> _______________________________________________
>> LLVM Developers mailing list
>> llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>
>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev <http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev>
>>
>> _______________________________________________
>> LLVM Developers mailing list
>> llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>
>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev <http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20171211/e3235c0f/attachment-0001.html>
More information about the llvm-dev
mailing list