[PATCH] Global merge on constants

Quentin Colombet qcolombet at apple.com
Wed Mar 13 11:26:23 PDT 2013


Hi David,

I was looking at Duncan’s proposal regarding the creation of GlobalAlias to GEP with non zero offset for merged global variables.
Based on Duncan’s comment, I’ve been able to find a PR that you initiated on that:
http://www.llvm.org/bugs/show_bug.cgi?id=4739

The resolution of this PR was INVALID.

If I understand correctly this PR, it is not possible to create aliases to the middle of a structure for Mach-O target.
What is the current status for other targets?

Also, what do you think would be a viable approach for the global merge problem?

Thanks,

-Quentin

On Mar 12, 2013, at 10:16 AM, Quentin Colombet <qcolombet at apple.com> wrote:

> Hi Duncan,
> 
> Thanks for the insightful information!
> 
> If I've understood correctly what you’re saying, the current process of the global merge pass is wrong since it may merge globals referenced in inline assembly statement.
> 
> If yes, we should file a PR!
> 
> I like you’re idea with aliases.
> 
> Is there a way to know if aliases and this kind of aliases are supported on the current platforms?
> 
> Assuming I can use aliases to achieve a valid global merging, it looks like it is incompatible with the spirit of http://llvm.org/bugs/show_bug.cgi?id=10367.
> 
> What do you think?
> 
> Thanks,
> 
> -Quentin
> 
> On Mar 12, 2013, at 1:07 AM, Duncan Sands <baldrick at free.fr> wrote:
> 
>> Hi Quentin,
>> 
>>> Sorry for the late reply, I was checking the information Anton’s pointed out and
>>> now, I need your opinions on what I saw.
>>> 
>>> So, according to the documentation, TypeInfos and more specifically the
>>> arguments of the clause part in landing pad instructions are supposed to be
>>> GlobalVariable.
>> 
>> I'm pretty sure it needs to be a GlobalVariable or a bitcast of one.  In short
>> something which turns into a simple label in the assembler.
>> 
>>> 
>>> Now, if you look at the code in the backend, the Verifier pass checks that the
>>> argument of the catch clause is Constant (which is a superclass of
>>> GlobalVariable and GEP) and that the type is a pointer type
>>> (Verifier::visitLandingPadInst).
>> 
>> At the IR level it doesn't matter much what a typeinfo is.  The restrictions are
>> coming from codegen.
>> 
>>> 
>>> Note: The type doesn’t change when the variable is merged (otherwise there is
>>> something wrong!).
>>> 
>>> Similarly, LLParser (LLParser::ParseLandingPad) and BitcodeReader
>>> (BitcodeReader::ParseFunctionBody) do not rely on GlobalVariable, they just
>>> check that the type is not an ArrayType.
>>> 
>>> Then, for the type info properly speaking, MachineModuleInfo uses GlobalVariable
>>> to map the type to an id (MachineModuleInfo::getTypeIDFor related) and
>>> AsmPrinter uses their name for printing.
>>> 
>>> As far as I can tell, it feels like it is no more a problem to merge the global
>>> constants (i.e., potentially changing a GlobalVariable into a GEP), even if they
>>> are related to EH. In particular, the type info is currently built using
>>> llvm::ExtractTypeInfo, which already takes a Value and do a stripPointerCast
>>> that gets a global variable even from the GEP and GEP is already a subclass of
>>> Constant like GlobalVariable.
>>> 
>>> Therefore, do you think it is still a problem to merge global constant?
>> 
>> Yes, as I mentioned above I think the typeinfo needs to be a GlobalVariable
>> (or a bitcast of one) not an offset into one (GEP).  See below for some thoughts
>> on how to relax this.
>> 
>>> 
>>> One last thing, I don’t know if .ll file can hit that, since I don’t know if
>>> clang is doing much work on then, but clang also expects GlobalVariable for
>>> landing pad instruction in one very specific case for ObjC++ exception
>>> (PersonalityHasOnlyCXXUses). From my tests, ll files don’t get into that path
>>> but maybe I’m missing something.
>>> What do you think?
>>> 
>>> Thanks for your help,
>>> 
>>> -Quentin
>>> PS: dragon egg seems fine with Constant for clauses in landing pad instruction too!
>> 
>> Front-ends know what they produce, they aren't really relevant here.  Dragonegg
>> produces (bitcasts of) GlobalVariables.  Using Constant is just a convenient way
>> of handling both GlobalVariables and bitcasts of global variables in a uniform
>> way.
>> 
>> Note that it isn't only exception handling typeinfos that is an issue here:
>> people using a GlobalVariable G in inline asm may well refer to it by its
>> name "G" in the inline asm.  Merging globals could break this.
>> 
>> Here is a possible solution to all these issues:
>> 
>> (1) Enhance aliases so that they can be offsets into a global, not just an alias
>> to a global (there is a PR requesting this somewhere);
>> (2) When merging a global G into a mega-global, turn "G" into an alias pointing
>> to the position in the mega global.
>> 
>> Eg:
>> 
>> Before:
>> 
>>  @F = global...
>>  @G = global...
>>  @H = global...
>> 
>> After:
>> 
>>  @Merged = global ... contains F, G then H
>>  @F = alias bitcast @Merged to ...
>>  @G = alias getelementptr @Merged, ...
>>  @H = alias getelementptr @Merged, ...
>> 
>> This should then result in F, G and H turning up as labels in the assembler,
>> usable as typeinfos.  It would likewise solve the inline asm issue.
>> 
>> The problems are: I'm not sure aliases are supported on all platforms, and
>> I'm even less sure which platforms support this kind of alias.
>> 
>> Ciao, Duncan.
>> 
>> 
>>> 
>>> On Mar 1, 2013, at 10:53 AM, Anton Korobeynikov <anton at korobeynikov.info
>>> <mailto:anton at korobeynikov.info>> wrote:
>>> 
>>>> Hi Quentin,
>>>> 
>>>> Indeed, GlobalMerge did not merge global constants because EH
>>>> processing code expected typeinfo and other stuff to be globals, not
>>>> gep's.
>>>> 
>>>> The situation changed with addition of landingpad instruction. You
>>>> might want to check how it handle GEPs for typeinfos (though, I'd say
>>>> that LangRef states that typeinfo is a global).
>>>> 
>>>> 
>>>> On Fri, Mar 1, 2013 at 10:05 PM, Quentin Colombet <qcolombet at apple.com
>>>> <mailto:qcolombet at apple.com>> wrote:
>>>>> Hi Duncan,
>>>>> 
>>>>> I've checked in LLVM documentation and the unnamed_addr marker is not
>>>>> specific to global variables that are constant. Hence, based on your
>>>>> comment, I was thinking that the global merge pass may perform illegal
>>>>> transformations.
>>>>> Actually, I thought about it and from my understanding, merging two globals
>>>>> like global merge pass does is not incompatible with having two distinct
>>>>> addresses.
>>>>> Indeed, global merge keeps all globals but wraps them into a structure.
>>>>> E.g.,
>>>>> int a;
>>>>> int b;
>>>>> =>
>>>>> struct {
>>>>> int a;
>>>>> int b;
>>>>> } MergedGV;
>>>>> 
>>>>> Thus, &MergedGV.a != &MergedGV.b
>>>>> 
>>>>> This brings me back to my initial question, does gathering constants into
>>>>> structures still cause a problem with EH since it does not seem incompatible
>>>>> with unnamed_addr?
>>>>> 
>>>>> Note that I've run the test case that was triggering the failure, and it
>>>>> works with my patch enabled. But since I don't know how EH are handled I
>>>>> prefer to have an external opinion.
>>>>> 
>>>>> Reference of the test case triggering the failure:
>>>>> http://llvm.org/bugs/show_bug.cgi?id=7716
>>>>> 
>>>>> Thanks,
>>>>> 
>>>>> -Quentin
>>>>> 
>>>>> On Feb 28, 2013, at 10:08 AM, Quentin Colombet <qcolombet at apple.com> wrote:
>>>>> 
>>>>> Ping?
>>>>> 
>>>>> -Quentin
>>>>> 
>>>>> On Feb 27, 2013, at 9:55 AM, Quentin Colombet <qcolombet at apple.com> wrote:
>>>>> 
>>>>> Hi Duncan,
>>>>> 
>>>>> Does it mean that the current behavior of global merge is wrong or is this
>>>>> unnamed_addr marker apply only to global constants?
>>>>> 
>>>>> Anyway, thanks for the information.
>>>>> 
>>>>> -Quentin
>>>>> 
>>>>> On Feb 27, 2013, at 12:50 AM, Duncan Sands <baldrick at free.fr> wrote:
>>>>> 
>>>>> Hi Quentin, you should only be merging them if they are marked with
>>>>> unnamed_addr (this indicates that no-one is expecting the globals
>>>>> to have distinct addressses).  As EH type infos shouldn't get
>>>>> unnamed_addr, that should mean that the EH issue isn't a problem any
>>>>> more.
>>>>> 
>>>>> Ciao, Duncan.
>>>>> 
>>>>> You will find enclosed a patch that adds the support of constant variables
>>>>> for
>>>>> the generic global merge pass.
>>>>> I've also attached a motivating example, test.c.
>>>>> 
>>>>> In that file there are several global variables accessed within the same
>>>>> function.
>>>>> If all the variables are non-const, global merge generates one unique
>>>>> variable,
>>>>> which translates into one unique load in the assembly file.
>>>>> 
>>>>> clang -arch arm -O3 test.c -o - -S
>>>>> 
>>>>> If some are const (the example is very stupid), global merge does not merge
>>>>> anything and one load for each variable is issued, i.e., 3
>>>>> clang -arch arm -O3 test.c -o - -S -DWITHCONST
>>>>> 
>>>>> If global merge had merged the constant globals, only 2 loads would have
>>>>> been
>>>>> issued. The patch allows to do that.
>>>>> 
>>>>> Although the patch is fairly simple, it was not done before because it seems
>>>>> it
>>>>> was breaking the EH processing (see comment in the original code).
>>>>> Is it still true?
>>>>> 
>>>>> If yes, how could I reproduce that?
>>>>> 
>>>>> Note: that the patch disables by default the handling of global constants
>>>>> because the current heuristic may generate worse code. A tuning may be
>>>>> necessary, but it would be done as a second step
>>>>> 
>>>>> -Quentin
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> _______________________________________________
>>>>> llvm-commits mailing list
>>>>> llvm-commits at cs.uiuc.edu
>>>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>>>>> 
>>>>> 
>>>>> _______________________________________________
>>>>> llvm-commits mailing list
>>>>> llvm-commits at cs.uiuc.edu
>>>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>>>>> 
>>>>> 
>>>>> _______________________________________________
>>>>> llvm-commits mailing list
>>>>> llvm-commits at cs.uiuc.edu
>>>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>>>>> 
>>>>> 
>>>>> _______________________________________________
>>>>> llvm-commits mailing list
>>>>> llvm-commits at cs.uiuc.edu
>>>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>>>>> 
>>>>> 
>>>>> 
>>>>> _______________________________________________
>>>>> llvm-commits mailing list
>>>>> llvm-commits at cs.uiuc.edu <mailto:llvm-commits at cs.uiuc.edu>
>>>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>>>>> 
>>>> 
>>>> 
>>>> 
>>>> --
>>>> With best regards, Anton Korobeynikov
>>>> Faculty of Mathematics and Mechanics, Saint Petersburg State University
> 
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20130313/3e0e825c/attachment.html>


More information about the llvm-commits mailing list