[llvm] r208934 - Implement global merge optimization for global variables.

Quentin Colombet qcolombet at apple.com
Mon May 19 09:55:53 PDT 2014


Hi,
On May 17, 2014, at 6:15 AM, Rafael EspĂ­ndola <rafael.espindola at gmail.com> wrote:

>> I don't think Nick is understanding this issue correctly as I mentioned in
>> my last reply, so I'm not sure if you are really agree with him. The issue
>> you mentioned below is actually completely different from Nick's point.
>> Refer to my answer below, anyway.
> 
> The part about unnamed_addr, yes, but not the part about alignment,
> right? Things like getGlobalMergeAlignment do look out of place.
Again, but looks like my reply did not make it to the list, we give up on any global that uses a special alignment:
    // Ignore fancy-aligned globals for now.
    unsigned Alignment = DL->getPreferredAlignment(I);
    Type *Ty = I->getType()->getElementType();
    if (Alignment > DL->getABITypeAlignment(Ty))
      continue;

> 
>>> Is
>>> this uncommon enough that checking for uses coming from the same
>>> function was not considered profitable?
>> 
>> 
>> I'm not sure if this could be profitable, because they are global variables
>> and we don't really know how they are being referenced in other modules. The
>> heuristic of considering the uses within a function for current module would
>> not probably give the same benefit to other modules, so I'm thinking the
>> heuristic strategy of merging them all together for current module would not
>> make big difference from the one of further checking uses within a function
>> for current module.
>> 
>> I never say my optimization patch is commonly profitable, and this is why
>> they are under switches control. But for some specific cases, it would be
>> profitable, I believe In particular, for the case that global variables are
>> being heavily used in the module that are defining them.
> 
> Well, we normally don't want to have too many use facing switches.
> What is the plan for removing this one? If variable concatenation (to
> avoid confusion with merging which causes the address to be the same)
> is profitable enough for ARM/AARCH64 that it is better than the
> inability to dead strip some code, then it should always be enabled.
> If not, we should figure out an heuristic for when to do it.
> 
> It should also be done really late. In particular, it should probably
> not run with -flto and instead be enabled in the LTO pipeline.
I agree with the LTO thing.
That said, this pass cannot and is not run late in the pipeline. Basically, it changes the code during the pass initialization (pretty broken I know!), but this is currently mandatory otherwise the debug information are wrong:
http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20130415/171763.html

> 
> I also don't follow your objection to looking at the function bodies
> to decide when it is profitable. If a variable is external, codegen
> must produce the fully general access pattern. That is, the only code
> that benefits from concatenating global variables is the one in the
> same TU as the variables, right?
I agree, but I point out that this pass is currently very cheap (in term of compile time) because it has a very naive heuristic (merge variable per size). If we want to change that, this is likely that we will break the nice compile time behavior. I am not opposed to that, this is just something we must be aware of.
If we want to do something more clever, it is likely what we end up with a coalescing problem with all the issues this kind of solution may have.

Anyhow, whatever we decide to do, we must carefully benchmark it.

Thanks,
-Quentin
> 
>>> putting two variables that a function uses next to each other would
>>> also be a win there, no?
>> 
>> 
>> X86 and ARM/AArch64 have different address model, and this is because of the
>> so-called CSIC and RISC differences. X86 can directly encode global address
>> relocation into instructions, while ARM/AArch64 can only encode it in
>> load/store instructions. For AArch64, this is getting worse, because ADRP
>> instruction can only get the page address of global variable due to the
>> limitation of instruction encoding size (32-bit only), and we have to
>> introduce another add instruction to get the 64-bit address. This is why
>> this optimization is very meaningful to AArch64. I think you would be able
>> to understand this much better if you look into the very beginning part of
>> my patch review thread.
>> 
>> Yes, putting two global variables that a function next each other would be a
>> win, but as I mentioned above this is also heuristic because we don't know
>> how global variables are being used by other modules at compilation time.
> 
> It is heuristic. I was just pointing out that it is not fundamentally
> an ARM only thing.
> 
>> 
>> OK. Thanks for letting me know that. Hopefully my patch still can reuse your
>> new GlobalAlias solution. Looking at the bug tracker you pointed to me, it
>> seems to be a very old problem raised by Christ long time ago. I'm not sure
>> if you can really make it happen next week. I would be extremely
>> appreciative if you do.
> 
> I think I can. The main change is already in. What is missing is just
> the adding offsets.
> 
> Cheers,
> Rafael
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20140519/443d28c7/attachment.html>


More information about the llvm-commits mailing list