[PATCH] Global merge on constants
Quentin Colombet
qcolombet at apple.com
Wed Mar 13 11:26:23 PDT 2013
Hi David,
I was looking at Duncan’s proposal regarding the creation of GlobalAlias to GEP with non zero offset for merged global variables.
Based on Duncan’s comment, I’ve been able to find a PR that you initiated on that:
http://www.llvm.org/bugs/show_bug.cgi?id=4739
The resolution of this PR was INVALID.
If I understand correctly this PR, it is not possible to create aliases to the middle of a structure for Mach-O target.
What is the current status for other targets?
Also, what do you think would be a viable approach for the global merge problem?
Thanks,
-Quentin
On Mar 12, 2013, at 10:16 AM, Quentin Colombet <qcolombet at apple.com> wrote:
> Hi Duncan,
>
> Thanks for the insightful information!
>
> If I've understood correctly what you’re saying, the current process of the global merge pass is wrong since it may merge globals referenced in inline assembly statement.
>
> If yes, we should file a PR!
>
> I like you’re idea with aliases.
>
> Is there a way to know if aliases and this kind of aliases are supported on the current platforms?
>
> Assuming I can use aliases to achieve a valid global merging, it looks like it is incompatible with the spirit of http://llvm.org/bugs/show_bug.cgi?id=10367.
>
> What do you think?
>
> Thanks,
>
> -Quentin
>
> On Mar 12, 2013, at 1:07 AM, Duncan Sands <baldrick at free.fr> wrote:
>
>> Hi Quentin,
>>
>>> Sorry for the late reply, I was checking the information Anton’s pointed out and
>>> now, I need your opinions on what I saw.
>>>
>>> So, according to the documentation, TypeInfos and more specifically the
>>> arguments of the clause part in landing pad instructions are supposed to be
>>> GlobalVariable.
>>
>> I'm pretty sure it needs to be a GlobalVariable or a bitcast of one. In short
>> something which turns into a simple label in the assembler.
>>
>>>
>>> Now, if you look at the code in the backend, the Verifier pass checks that the
>>> argument of the catch clause is Constant (which is a superclass of
>>> GlobalVariable and GEP) and that the type is a pointer type
>>> (Verifier::visitLandingPadInst).
>>
>> At the IR level it doesn't matter much what a typeinfo is. The restrictions are
>> coming from codegen.
>>
>>>
>>> Note: The type doesn’t change when the variable is merged (otherwise there is
>>> something wrong!).
>>>
>>> Similarly, LLParser (LLParser::ParseLandingPad) and BitcodeReader
>>> (BitcodeReader::ParseFunctionBody) do not rely on GlobalVariable, they just
>>> check that the type is not an ArrayType.
>>>
>>> Then, for the type info properly speaking, MachineModuleInfo uses GlobalVariable
>>> to map the type to an id (MachineModuleInfo::getTypeIDFor related) and
>>> AsmPrinter uses their name for printing.
>>>
>>> As far as I can tell, it feels like it is no more a problem to merge the global
>>> constants (i.e., potentially changing a GlobalVariable into a GEP), even if they
>>> are related to EH. In particular, the type info is currently built using
>>> llvm::ExtractTypeInfo, which already takes a Value and do a stripPointerCast
>>> that gets a global variable even from the GEP and GEP is already a subclass of
>>> Constant like GlobalVariable.
>>>
>>> Therefore, do you think it is still a problem to merge global constant?
>>
>> Yes, as I mentioned above I think the typeinfo needs to be a GlobalVariable
>> (or a bitcast of one) not an offset into one (GEP). See below for some thoughts
>> on how to relax this.
>>
>>>
>>> One last thing, I don’t know if .ll file can hit that, since I don’t know if
>>> clang is doing much work on then, but clang also expects GlobalVariable for
>>> landing pad instruction in one very specific case for ObjC++ exception
>>> (PersonalityHasOnlyCXXUses). From my tests, ll files don’t get into that path
>>> but maybe I’m missing something.
>>> What do you think?
>>>
>>> Thanks for your help,
>>>
>>> -Quentin
>>> PS: dragon egg seems fine with Constant for clauses in landing pad instruction too!
>>
>> Front-ends know what they produce, they aren't really relevant here. Dragonegg
>> produces (bitcasts of) GlobalVariables. Using Constant is just a convenient way
>> of handling both GlobalVariables and bitcasts of global variables in a uniform
>> way.
>>
>> Note that it isn't only exception handling typeinfos that is an issue here:
>> people using a GlobalVariable G in inline asm may well refer to it by its
>> name "G" in the inline asm. Merging globals could break this.
>>
>> Here is a possible solution to all these issues:
>>
>> (1) Enhance aliases so that they can be offsets into a global, not just an alias
>> to a global (there is a PR requesting this somewhere);
>> (2) When merging a global G into a mega-global, turn "G" into an alias pointing
>> to the position in the mega global.
>>
>> Eg:
>>
>> Before:
>>
>> @F = global...
>> @G = global...
>> @H = global...
>>
>> After:
>>
>> @Merged = global ... contains F, G then H
>> @F = alias bitcast @Merged to ...
>> @G = alias getelementptr @Merged, ...
>> @H = alias getelementptr @Merged, ...
>>
>> This should then result in F, G and H turning up as labels in the assembler,
>> usable as typeinfos. It would likewise solve the inline asm issue.
>>
>> The problems are: I'm not sure aliases are supported on all platforms, and
>> I'm even less sure which platforms support this kind of alias.
>>
>> Ciao, Duncan.
>>
>>
>>>
>>> On Mar 1, 2013, at 10:53 AM, Anton Korobeynikov <anton at korobeynikov.info
>>> <mailto:anton at korobeynikov.info>> wrote:
>>>
>>>> Hi Quentin,
>>>>
>>>> Indeed, GlobalMerge did not merge global constants because EH
>>>> processing code expected typeinfo and other stuff to be globals, not
>>>> gep's.
>>>>
>>>> The situation changed with addition of landingpad instruction. You
>>>> might want to check how it handle GEPs for typeinfos (though, I'd say
>>>> that LangRef states that typeinfo is a global).
>>>>
>>>>
>>>> On Fri, Mar 1, 2013 at 10:05 PM, Quentin Colombet <qcolombet at apple.com
>>>> <mailto:qcolombet at apple.com>> wrote:
>>>>> Hi Duncan,
>>>>>
>>>>> I've checked in LLVM documentation and the unnamed_addr marker is not
>>>>> specific to global variables that are constant. Hence, based on your
>>>>> comment, I was thinking that the global merge pass may perform illegal
>>>>> transformations.
>>>>> Actually, I thought about it and from my understanding, merging two globals
>>>>> like global merge pass does is not incompatible with having two distinct
>>>>> addresses.
>>>>> Indeed, global merge keeps all globals but wraps them into a structure.
>>>>> E.g.,
>>>>> int a;
>>>>> int b;
>>>>> =>
>>>>> struct {
>>>>> int a;
>>>>> int b;
>>>>> } MergedGV;
>>>>>
>>>>> Thus, &MergedGV.a != &MergedGV.b
>>>>>
>>>>> This brings me back to my initial question, does gathering constants into
>>>>> structures still cause a problem with EH since it does not seem incompatible
>>>>> with unnamed_addr?
>>>>>
>>>>> Note that I've run the test case that was triggering the failure, and it
>>>>> works with my patch enabled. But since I don't know how EH are handled I
>>>>> prefer to have an external opinion.
>>>>>
>>>>> Reference of the test case triggering the failure:
>>>>> http://llvm.org/bugs/show_bug.cgi?id=7716
>>>>>
>>>>> Thanks,
>>>>>
>>>>> -Quentin
>>>>>
>>>>> On Feb 28, 2013, at 10:08 AM, Quentin Colombet <qcolombet at apple.com> wrote:
>>>>>
>>>>> Ping?
>>>>>
>>>>> -Quentin
>>>>>
>>>>> On Feb 27, 2013, at 9:55 AM, Quentin Colombet <qcolombet at apple.com> wrote:
>>>>>
>>>>> Hi Duncan,
>>>>>
>>>>> Does it mean that the current behavior of global merge is wrong or is this
>>>>> unnamed_addr marker apply only to global constants?
>>>>>
>>>>> Anyway, thanks for the information.
>>>>>
>>>>> -Quentin
>>>>>
>>>>> On Feb 27, 2013, at 12:50 AM, Duncan Sands <baldrick at free.fr> wrote:
>>>>>
>>>>> Hi Quentin, you should only be merging them if they are marked with
>>>>> unnamed_addr (this indicates that no-one is expecting the globals
>>>>> to have distinct addressses). As EH type infos shouldn't get
>>>>> unnamed_addr, that should mean that the EH issue isn't a problem any
>>>>> more.
>>>>>
>>>>> Ciao, Duncan.
>>>>>
>>>>> You will find enclosed a patch that adds the support of constant variables
>>>>> for
>>>>> the generic global merge pass.
>>>>> I've also attached a motivating example, test.c.
>>>>>
>>>>> In that file there are several global variables accessed within the same
>>>>> function.
>>>>> If all the variables are non-const, global merge generates one unique
>>>>> variable,
>>>>> which translates into one unique load in the assembly file.
>>>>>
>>>>> clang -arch arm -O3 test.c -o - -S
>>>>>
>>>>> If some are const (the example is very stupid), global merge does not merge
>>>>> anything and one load for each variable is issued, i.e., 3
>>>>> clang -arch arm -O3 test.c -o - -S -DWITHCONST
>>>>>
>>>>> If global merge had merged the constant globals, only 2 loads would have
>>>>> been
>>>>> issued. The patch allows to do that.
>>>>>
>>>>> Although the patch is fairly simple, it was not done before because it seems
>>>>> it
>>>>> was breaking the EH processing (see comment in the original code).
>>>>> Is it still true?
>>>>>
>>>>> If yes, how could I reproduce that?
>>>>>
>>>>> Note: that the patch disables by default the handling of global constants
>>>>> because the current heuristic may generate worse code. A tuning may be
>>>>> necessary, but it would be done as a second step
>>>>>
>>>>> -Quentin
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> llvm-commits mailing list
>>>>> llvm-commits at cs.uiuc.edu
>>>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> llvm-commits mailing list
>>>>> llvm-commits at cs.uiuc.edu
>>>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> llvm-commits mailing list
>>>>> llvm-commits at cs.uiuc.edu
>>>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> llvm-commits mailing list
>>>>> llvm-commits at cs.uiuc.edu
>>>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>>>>>
>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> llvm-commits mailing list
>>>>> llvm-commits at cs.uiuc.edu <mailto:llvm-commits at cs.uiuc.edu>
>>>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> With best regards, Anton Korobeynikov
>>>> Faculty of Mathematics and Mechanics, Saint Petersburg State University
>
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20130313/3e0e825c/attachment.html>
More information about the llvm-commits
mailing list