[llvm] r208934 - Implement global merge optimization for global variables.

Sat May 17 06:15:25 PDT 2014

> I don't think Nick understands this issue correctly, as I mentioned in
> my last reply, so I'm not sure whether you really agree with him. The issue
> you mention below is actually completely different from Nick's point.
> Anyway, refer to my answer below.

The part about unnamed_addr, yes, but not the part about alignment,
right? Things like getGlobalMergeAlignment do look out of place.

>> Is
>> this uncommon enough that checking for uses coming from the same
>> function was not considered profitable?
>
>
> I'm not sure this would be profitable, because they are global variables
> and we don't really know how they are referenced in other modules. A
> heuristic that considers the uses within a function in the current module
> would probably not give the same benefit to other modules, so I think
> merging them all together for the current module would not make a big
> difference compared with additionally checking uses within a function
> for the current module.
>
> I never said my optimization patch is generally profitable, and this is why
> it is under the control of switches. But I believe it would be profitable
> for some specific cases, in particular when global variables are heavily
> used in the module that defines them.

Well, we normally don't want to have too many user-facing switches.
What is the plan for removing this one? If variable concatenation (to
avoid confusion with merging, which causes the address to be the same)
is profitable enough for ARM/AArch64 that it is better than the
inability to dead strip some code, then it should always be enabled.
If not, we should figure out a heuristic for when to do it.

It should also be done really late. In particular, it should probably
not run with -flto and instead be enabled in the LTO pipeline.

I also don't follow your objection to looking at the function bodies
to decide when it is profitable. If a variable is external, codegen
must produce the fully general access pattern. That is, the only code
that benefits from concatenating global variables is the one in the
same TU as the variables, right?
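
To make that concrete, here is a rough hand-written sketch of what
concatenation amounts to for code in the defining TU. All names
(counter_a, merged_globals, sum_merged, ...) are hypothetical, and this
is only an analogy written in C, not the output of the pass:

```c
#include <assert.h>
#include <stddef.h>

/* Two separate internal globals. On AArch64, each access typically
   needs its own address computation. */
static int counter_a = 1;
static int counter_b = 2;

int sum_separate(void) {
    return counter_a + counter_b;
}

/* Hypothetical hand-written equivalent of concatenating them: both
   values live in one object, so one base address plus constant
   offsets reaches both. */
struct merged_globals {
    int a;
    int b;
};

static struct merged_globals merged = { 1, 2 };

int sum_merged(void) {
    return merged.a + merged.b;
}

/* Offset of the second value from the shared base address. */
size_t offset_of_b(void) {
    return offsetof(struct merged_globals, b);
}

/* The two forms compute the same result; only the layout differs. */
void check_equivalence(void) {
    assert(sum_separate() == sum_merged());
}
```

A TU that only sees "extern int counter_b;" cannot rely on this layout
and still has to compute the address of counter_b on its own.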

>> putting two variables that a function uses next to each other would
>> also be a win there, no?
>
>
> X86 and ARM/AArch64 have different addressing models, because of the
> so-called CISC and RISC differences. X86 can directly encode a global address
> relocation into instructions, while ARM/AArch64 can only encode it in
> load/store instructions. For AArch64, this gets worse, because the ADRP
> instruction can only produce the page address of a global variable due to the
> limited instruction encoding size (32 bits only), and we have to
> introduce another add instruction to get the 64-bit address. This is why
> this optimization is very meaningful to AArch64. I think you would be able
> to understand this much better if you look at the very beginning of
> my patch review thread.
>
> Yes, putting two global variables that a function uses next to each other
> would be a win, but as I mentioned above this is also a heuristic, because we
> don't know how global variables are used by other modules at compilation time.

It is heuristic. I was just pointing out that it is not fundamentally
an ARM only thing.
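
For anyone following along, the ADRP point can be sketched as plain
address arithmetic. This is just an illustration under the assumption
of 4 KiB ADRP pages; the helper names are made up and nothing here is
taken from the patch:

```c
#include <assert.h>
#include <stdint.h>

/* ADRP works in units of 4 KiB pages. */
#define ADRP_PAGE_SIZE UINT64_C(4096)

/* The part ADRP materializes: the base of the page holding the address. */
uint64_t page_base(uint64_t addr) {
    return addr & ~(ADRP_PAGE_SIZE - 1);
}

/* The part a follow-up ADD (or a load/store immediate) has to supply:
   the low 12 bits of the address. */
uint64_t low12(uint64_t addr) {
    return addr & (ADRP_PAGE_SIZE - 1);
}

/* With concatenated variables, one computed base is reused and each
   variable is reached by adding a constant offset to it. */
uint64_t member_address(uint64_t base, uint64_t offset) {
    return base + offset;
}

/* The two parts always recombine into the original address. */
void check_split(uint64_t addr) {
    assert(page_base(addr) + low12(addr) == addr);
}
```

For example, 0x12345678 splits into page 0x12345000 and low bits 0x678.
Two separate variables each need such a split, while two concatenated
variables share one base and differ only in the offset.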

>
> OK. Thanks for letting me know that. Hopefully my patch can still reuse your
> new GlobalAlias solution. Looking at the bug tracker you pointed me to, it
> seems to be a very old problem raised by Christ a long time ago. I'm not sure
> if you can really make it happen next week. I would be extremely
> grateful if you do.

I think I can. The main change is already in. What is missing is just
adding the offsets.

Cheers,
Rafael