[LLVMdev] Proposal: AArch64/ARM64 merge from EuroLLVM

Jiangning Liu liujiangning1 at gmail.com
Thu Apr 17 19:56:13 PDT 2014


Hi Quentin,

Thanks for your kind help!

> The problem is MOVaddr generated for ARM64  implies introducing adrp in
> ExpandPseudoInsts pass again, although at this moment we don't really see
> redundant ADRP yet. AArch64 is using ADDxxi_lsl0_s instead, and it will be
> folded into LS32_STR finally.
>
> Interesting.
> Looks like we are too clever here.
> I would have expected ISel to generate one base address and one
> displacement.
>
> I believe that if we fix that both the LOHs and the global merge become
> orthogonal. My guess is that we should be less aggressive at folding offset
> if there are several uses.
>
>
Sounds great! I thought I had misunderstood something.

> If we simply apply the global merge solution to ARM64, probably we should
>> avoid generating pseudo instruction MOVaddr and friends in ISEL stage, but
>> I'm not sure if the LOH solution would still work or not, because,
>> 1) ARM64 link-time optimization depends on LOH.
>> 2) We don't see linker plug-in in LLVM trunk and it would be hard for me
>> to verify any thoughts.
>>
>> The LOH solution is also orthogonal. You can see that as a last chance
>> way to optimize those accesses.
>> That said, if you CSE the ADRP and not the LOADGot, you will indeed
>> create far less candidates for the LOHs because you will have ADRPs with
>> several uses, which is not supported by LOHs.
>>
>
> Yes. This is exactly what I'm worried about. So essentially those two
> optimizations conflict with each other.
>
> Let us try to fix the codegen problem while keeping the pseudos.
>

Currently, we have the following code for ARM64,

// The MOVaddr instruction should match only when the add is not folded
// into a load or store address.
def MOVaddr
    : Pseudo<(outs GPR64:$dst), (ins i64imm:$hi, i64imm:$low),
             [(set GPR64:$dst, (ARM64addlow (ARM64adrp tglobaladdr:$hi),
                                            tglobaladdr:$low))]>,
      Sched<[WriteAdrAdr]>;

Does this mean ISEL will generate the pseudo MOVaddr whenever the pattern
"(ARM64addlow (ARM64adrp $hi), $low)" exists? If so, I think we should remove
this pseudo, and I don't understand what you mean by "keeping the
pseudos". Are there any other purposes for introducing the pseudo MOVaddr?

> Since compile-time ADRP CSE is not so powerful as link-time ADRP removal,
> I don't want to hurt link-time solution.
>
> Well, this is something that should be measured. Your patch does not kill
> the LOHs, it may just reduce the number of potential candidates. For each
> candidate that your patch removes, it means we at least spare one ADRP
> instruction. The trade-off does not seem bad.
>
> I suggest we:
> 1. Fix the ISel of pseudo (making the folding less aggressive).
> 2. Measure the performance with your patch.
>
> I can definitely help for the measurements with the LOHs enabled in
> parallel with your patch.
> If you want I can help for #1 too.
>

You are so nice, and I'm glad that you can help with both, because
1) I don't have 64-bit hardware yet
2) I don't have the linker plug-in either
3) I will be busy with another high-priority bug fix in a week


> Side question, did you happen to measure any performance
> improvement/regression with your patch?
>

I don't have hardware yet, so I didn't measure it myself. Ana helped me
measure it on A53 previously, but the data was not consistent across two
separate measurements, so I am not convinced by it yet. The data does,
however, show some sporadic improvements for some tests in EEMBC.


> I’d like to know which tests would be good candidates to measure the
> impact of your patch + LOHs enabled.
>

With my patch only, I expect 256.bzip2 and 252.eon to show some performance
change, because their adrp counts are reduced by 42% and 52% respectively.
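
For reference, a reduction percentage like this can be computed by counting
adrp instructions in the assembly output before and after the patch. The
following is only a minimal sketch of that counting; the function names are
hypothetical and this is not the script that produced the numbers above.

```python
import re

# An "adrp" mnemonic at the start of an instruction, optionally preceded
# by a label such as "Lloh0:" or ".LBB0_1:". Comment lines do not match.
ADRP_RE = re.compile(r"^\s*(?:[\w.$]+:\s*)?adrp\b", re.IGNORECASE)

def count_adrp(asm_text):
    """Count the lines whose instruction mnemonic is adrp."""
    return sum(1 for line in asm_text.splitlines() if ADRP_RE.match(line))

def adrp_reduction_percent(before_asm, after_asm):
    """Percentage of adrp instructions removed going from before to after."""
    before = count_adrp(before_asm)
    after = count_adrp(after_asm)
    if before == 0:
        return 0.0  # nothing to reduce
    return 100.0 * (before - after) / before
```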

For LOH, I think the linker can cover many more cases, for example when the
global variable is not defined in the file being compiled. I need to collect
more data around LOH. Do you have any idea how to measure the LOH effect
statically? Is counting the number of LOHs enough?
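
A minimal sketch of such a static count is below. It assumes the hints show
up in the assembly output as ".loh <Kind> <label>, ..." directives (for
example ".loh AdrpAdd Lloh0, Lloh1"), and the function name is hypothetical.
It only counts the candidates the compiler emitted; it says nothing about
which ones the linker actually applies.

```python
import re
from collections import Counter

# A ".loh" directive followed by the hint kind, e.g. "AdrpAdd" or "AdrpLdr".
LOH_RE = re.compile(r"^\s*\.loh\s+(\w+)")

def count_loh_kinds(asm_text):
    """Return a Counter mapping each LOH kind to its number of occurrences."""
    kinds = Counter()
    for line in asm_text.splitlines():
        match = LOH_RE.match(line)
        if match:
            kinds[match.group(1)] += 1
    return kinds
```

The total number of hints is then sum(count_loh_kinds(text).values()).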

Thanks,
-Jiangning

