[llvm-dev] [RFC] Turn the MachineOutliner on by default in AArch64 under -Oz

Mon Apr 23 15:55:13 PDT 2018

Sorry, I was using a modified compiler, which by coincidence made the 
bug much easier to reproduce.

In some rare cases, the compiler will use x30 (the link register) as a 
general-purpose register; in that case, outlining breaks because the 
"ret" branches to the wrong address.  Testcase (reproduce with "clang -O3 
--target=aarch64-pc-linux-gnu -mllvm -enable-machine-outliner"):

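/* Three identical function bodies give the outliner a repeated sequence
   to work with; the register variable z places &g2 in x30. */
extern long g1;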
extern long g2;
void foo() {
   register long *x asm("x27") = &g1;
   register long *y asm("x29") = &g1;
   register long *z asm("x30") = &g2;
   asm(""::"r"(x),"r"(y),"r"(z));
}
void foo2() {
   register long *x asm("x27") = &g1;
   register long *y asm("x29") = &g1;
   register long *z asm("x30") = &g2;
   asm(""::"r"(x),"r"(y),"r"(z));
}
void foo3() {
   register long *x asm("x27") = &g1;
   register long *y asm("x29") = &g1;
   register long *z asm("x30") = &g2;
   asm(""::"r"(x),"r"(y),"r"(z));
}
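
To make the failure mode concrete, here is a minimal toy model in C. It is 
not AArch64 semantics and not outliner code. It assumes the outlined 
sequence is entered with a "bl"-style call that stores the return address 
in x30, and that the outlined body itself contains the write to x30; the 
testcase above does not say which instructions actually get outlined, so 
that part is an assumption. The names Cpu, bl, ret, call_outlined and all 
addresses are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy CPU state: only the program counter and x30 (the link register). */
typedef struct {
    uint64_t pc;
    uint64_t x30;
} Cpu;

/* Model of "bl target": save the return address in x30, then jump. */
void bl(Cpu *cpu, uint64_t target) {
    assert(cpu != NULL);
    cpu->x30 = cpu->pc + 4;   /* address of the instruction after the bl */
    cpu->pc = target;
}

/* Model of "ret": jump to whatever x30 currently holds. */
void ret(Cpu *cpu) {
    assert(cpu != NULL);
    cpu->pc = cpu->x30;
}

/*
 * Call a hypothetical outlined function at outlined_addr from call_site.
 * If body_writes_x30 is nonzero, the outlined body uses x30 as a
 * general-purpose register and stores value in it (assumption: this is
 * what happens to &g2 in the testcase). Returns the address at which
 * execution resumes after the "ret".
 */
uint64_t call_outlined(uint64_t call_site, uint64_t outlined_addr,
                       int body_writes_x30, uint64_t value) {
    Cpu cpu = { .pc = call_site, .x30 = 0 };
    bl(&cpu, outlined_addr);
    if (body_writes_x30)
        cpu.x30 = value;      /* return address is lost here */
    ret(&cpu);
    return cpu.pc;
}
```

In this model, call_outlined(0x1000, 0x2000, 0, 0x8000) resumes at 0x1004, 
the instruction after the call site, while call_outlined(0x1000, 0x2000, 1, 
0x8000) resumes at 0x8000, the value the body placed in x30.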

-Eli

On 4/23/2018 2:37 PM, Jessica Paquette wrote:
> I just ran SPEC at -O3 with the outliner enabled for AArch64 and 
> didn’t get any failures on my end. Which flags did you use? I’m 
> curious about what’s going on here...
>
> I used -O3 -mllvm -enable-machine-outliner -arch arm64.
>
> - Jessica
>
>> On Apr 23, 2018, at 1:41 PM, Jessica Paquette <jpaquette at apple.com> 
>> wrote:
>>
>> Hi Eli,
>>
>>> I just tried some tests, and I'm seeing a bunch of failures on SPEC 
>>> at -O3; looks like mostly crashes at runtime.   I can try to reduce 
>>> a testcase if you need it.
>> If you could do that, that would be great. Our testing has been 
>> primarily for -Oz and -O2, so I haven’t looked at -O3 at all.
>>
>>> I don't think this is really the right approach.  With LTO, you can 
>>> have a mix of functions, some of which are minsize, and some of 
>>> which are not.  Or with profile info, we might want to outline only 
>>> cold code (I guess this isn't implemented yet, but potentially 
>>> future work).  Tying whether we run the outliner to a command-line 
>>> flag restricts the possible uses; either the entire module gets 
>>> outlining, or none of it does.
>> I’m worried that walking the entire list of functions in the module 
>> when nothing has the minsize attribute would incur unnecessary 
>> compile-time overhead. If that’s a reasonable thing to do though, I’m 
>> fine with that approach. It’d be a less invasive change, and would 
>> give us the desired LTO behaviour for free.
>>
>> - Jessica
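
The trade-off discussed above can be sketched with a toy model in C. This 
is not LLVM code: ToyFunction, ToyModule, module_has_minsize and 
count_outlining_candidates are hypothetical names, and the minsize field 
only stands in for the function attribute. The sketch assumes the outliner 
decides per function by checking that attribute, so a module with no 
minsize functions costs one linear pass over the function list before the 
pass gives up.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a function and its minsize attribute. */
typedef struct {
    const char *name;
    bool minsize;
} ToyFunction;

/* Hypothetical stand-in for a module: a flat list of functions. */
typedef struct {
    const ToyFunction *functions;
    size_t count;
} ToyModule;

/* Linear scan with early exit: true as soon as one minsize function is found. */
bool module_has_minsize(const ToyModule *m) {
    assert(m != NULL);
    for (size_t i = 0; i < m->count; ++i)
        if (m->functions[i].minsize)
            return true;
    return false;
}

/* Count the functions a per-function outliner would be allowed to outline from. */
size_t count_outlining_candidates(const ToyModule *m) {
    assert(m != NULL);
    if (!module_has_minsize(m))
        return 0;             /* nothing to do: the only cost was the scan */
    size_t n = 0;
    for (size_t i = 0; i < m->count; ++i)
        if (m->functions[i].minsize)
            ++n;
    return n;
}
```

With a mixed module, as in the LTO case, only the minsize functions are 
counted as candidates; with no minsize functions the result is zero after 
a single pass over the list.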
>>
>>
>>> On Apr 23, 2018, at 1:24 PM, Friedman, Eli <efriedma at codeaurora.org> 
>>> wrote:
>>>
>>> On 4/20/2018 7:06 PM, Jessica Paquette via llvm-dev wrote:
>>>> We perform regular testing to ensure the outliner produces correct 
>>>> AArch64 code at -Oz. Tests include the LLVM test suite and standard 
>>>> external test suites such as SPEC. All tests compile and 
>>>> execute. We've also been making sure that the outliner produces 
>>>> debuggable code. Users are still guaranteed to have sane backtraces 
>>>> in the presence of outlined functions.
>>>>
>>>> Added exposure to various programs would help the outlining 
>>>> algorithm mature further. This, in turn, will help the overall 
>>>> outlining project. For example, there have been a few discussions 
>>>> on implementing an IR-level outlining pass [3, 4]. Ultimately, the 
>>>> goal is to create a shared outlining interface. This interface 
>>>> would allow the outliner to exist at any level of representation 
>>>> [4]. The general outlining algorithm will be part of the shared 
>>>> interface. Thus, in the spirit of incremental improvement, it makes 
>>>> sense to begin "stress-testing" it sooner rather than later.
>>>
>>> I just tried some tests, and I'm seeing a bunch of failures on SPEC 
>>> at -O3; looks like mostly crashes at runtime.   I can try to reduce 
>>> a testcase if you need it.
>>>
>>>>
>>>> There are a few patches necessary to facilitate this. They are 
>>>> available in the patches section of this email. I’ll summarize what 
>>>> they do here for the sake of discussion though.
>>>>
>>>> The first patch is one that teaches the backend about size 
>>>> optimization levels. This is comparable to what's done in the 
>>>> inliner. Today, the only way to tell if something is optimizing for 
>>>> size is by looking at function attributes. This is fine for 
>>>> function passes, but insufficient for module passes like the 
>>>> MachineOutliner. The function attribute approach forces the 
>>>> outliner to iterate over every function in the module before 
>>>> deciding to take action. If -Oz isn't passed in, then the outliner 
>>>> will not find any functions worth outlining from. This would incur 
>>>> unnecessary compile-time overhead. Thus, we decided the best course 
>>>> of action is to teach the backend about size options.
>>>
>>> I don't think this is really the right approach.  With LTO, you can 
>>> have a mix of functions, some of which are minsize, and some of 
>>> which are not. Or with profile info, we might want to outline only 
>>> cold code (I guess this isn't implemented yet, but potentially 
>>> future work).  Tying whether we run the outliner to a command-line 
>>> flag restricts the possible uses; either the entire module gets 
>>> outlining, or none of it does.
>>>
>>> In general, we've been moving away from global settings so we can 
>>> optimize more effectively in this sort of scenario.
>>>
>>> -Eli
>>> -- 
>>> Employee of Qualcomm Innovation Center, Inc.
>>> Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project
>>
>

-- 
Employee of Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project
